bert-base-turkish-uncased-ner
This model is a fine-tuned version of dbmdz/bert-base-turkish-uncased on the turkish-wiki_ner dataset. It achieves the following results on the evaluation set:
- Loss: 0.2603
- F1: 0.7821
Model description
The training set consists of 18,967 samples and the validation set of 1,000 samples, both derived from Wikipedia data.
For more information about the dataset, see https://huggingface.co/datasets/turkish-nlp-suite/turkish-wikiNER
Labels:
- CARDINAL
- DATE
- EVENT
- FAC
- GPE
- LANGUAGE
- LAW
- LOC
- MONEY
- NORP
- ORDINAL
- ORG
- PERCENT
- PERSON
- PRODUCT
- QUANTITY
- TIME
- TITLE
- WORK_OF_ART
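Token-classification models usually predict one tag per token rather than one of the entity types above directly. The following is a minimal sketch of how the 19 types would expand into a tag set, assuming the common IOB2 scheme (`B-X`, `I-X`, `O`). Both the scheme and the index order are assumptions made for illustration; the authoritative mapping is whatever the checkpoint stores in `config.id2label`.

```python
# Entity types listed in this model card.
ENTITY_TYPES = [
    "CARDINAL", "DATE", "EVENT", "FAC", "GPE", "LANGUAGE", "LAW", "LOC",
    "MONEY", "NORP", "ORDINAL", "ORG", "PERCENT", "PERSON", "PRODUCT",
    "QUANTITY", "TIME", "TITLE", "WORK_OF_ART",
]


def build_iob2_tags(entity_types):
    """Return ["O", "B-X", "I-X", ...] for the given entity types."""
    tags = ["O"]
    for name in entity_types:
        tags.append(f"B-{name}")  # first token of an entity
        tags.append(f"I-{name}")  # continuation token of the same entity
    return tags


# Hypothetical mapping: the real index order is defined by the checkpoint.
tags = build_iob2_tags(ENTITY_TYPES)
label2id = {tag: i for i, tag in enumerate(tags)}
id2label = {i: tag for tag, i in label2id.items()}

print(len(ENTITY_TYPES), "entity types ->", len(tags), "tags")
```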
Fine-tuning process: https://github.com/saribasmetehan/bert-base-turkish-uncased-ner
Example
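from transformers import pipeline
import pandas as pd
# Turkish input: "If this sum is zero, Newton's first law says the object's state of motion will not change."
text = "Bu toplam sıfır ise, Newton'ın birinci yasası cismin hareket durumunun değişmeyeceğini söyler."
model_id = "saribasmetehan/bert-base-turkish-uncased-ner"
# Build a token-classification (NER) pipeline from the fine-tuned checkpoint
ner = pipeline("ner", model=model_id)
# "simple" aggregation merges word pieces into whole entity spans
preds = ner(text, aggregation_strategy="simple")
# Show the predicted entities as a table
pd.DataFrame(preds)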
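With `aggregation_strategy="simple"`, the pipeline returns a list of dicts with the keys `entity_group`, `score`, `word`, `start`, and `end`. A small sketch of one way to post-process that list is shown below: it groups the detected words by entity type and drops low-confidence hits. The helper `group_entities`, the `min_score` threshold, and the sample list are hypothetical; the sample is hand-written in the pipeline's output format and is not real model output.

```python
from collections import defaultdict


def group_entities(preds, min_score=0.0):
    """Group aggregated NER predictions by entity type.

    `preds` is a list of dicts with the keys produced by
    aggregation_strategy="simple": entity_group, score, word, start, end.
    """
    grouped = defaultdict(list)
    for p in preds:
        if p["score"] >= min_score:
            grouped[p["entity_group"]].append(p["word"])
    return dict(grouped)


# Hand-written placeholder in the pipeline's output format, NOT real model output.
sample_preds = [
    {"entity_group": "PERSON", "score": 0.99, "word": "newton", "start": 21, "end": 27},
    {"entity_group": "ORDINAL", "score": 0.40, "word": "birinci", "start": 31, "end": 38},
]

print(group_entities(sample_preds, min_score=0.5))
```

In practice `sample_preds` would be replaced by the `preds` list returned from the pipeline call.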
Load model directly
from transformers import AutoModelForTokenClassification, AutoTokenizer
model_name = "saribasmetehan/bert-base-turkish-uncased-ner"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)
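When the model is loaded directly like this, its output is one predicted tag per token, and merging those tags into entity spans is left to the caller. The sketch below shows that merge step only, assuming IOB2-style tags (`B-X`, `I-X`, `O`). The tag scheme, the helper name `merge_iob2`, and the example tokens and tags are assumptions for illustration; the real tag names come from `model.config.id2label`, and word-piece handling (`##` continuations) is left out.

```python
def merge_iob2(tokens, tags):
    """Merge token-level IOB2 tags into (entity_type, text) spans."""
    spans = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        prefix, _, etype = tag.partition("-")
        if prefix == "I" and etype == current_type:
            # Continuation of the span that is currently open.
            current_tokens.append(token)
            continue
        # Any other tag closes the open span, if there is one.
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))
        if prefix in ("B", "I"):
            # A stray I- tag with no matching open span is treated as a span start.
            current_type, current_tokens = etype, [token]
        else:
            current_type, current_tokens = None, []
    if current_type is not None:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


# Hand-written example tokens and tags, not real model output.
example_tokens = ["mustafa", "kemal", "ankara", "'", "ya", "gitti"]
example_tags = ["B-PERSON", "I-PERSON", "B-GPE", "O", "O", "O"]
print(merge_iob2(example_tokens, example_tags))
```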
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
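As a rough sanity check, the step counts and the linear schedule implied by these settings can be worked out by hand. The sketch below assumes a single device, no gradient accumulation, and no warmup (none of which is stated in this card); under those assumptions 18,967 samples at batch size 16 give 1,186 steps per epoch and 4,744 steps in total.

```python
import math

TRAIN_SAMPLES = 18_967  # training set size from the model description
BATCH_SIZE = 16
NUM_EPOCHS = 4
BASE_LR = 2e-5

# Assumes one device and no gradient accumulation.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total=total_steps, warmup=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero.

    warmup=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(steps_per_epoch, total_steps)
for epoch in range(NUM_EPOCHS + 1):
    step = epoch * steps_per_epoch
    print(f"step {step}: lr = {linear_lr(step):.2e}")
```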
Training results
Training Loss | Epoch | Step | Validation Loss | F1 |
---|---|---|---|---|
0.4 | 1.0 | 1186 | 0.2502 | 0.7703 |
0.2227 | 2.0 | 2372 | 0.2439 | 0.7740 |
0.1738 | 3.0 | 3558 | 0.2511 | 0.7783 |
0.1474 | 4.0 | 4744 | 0.2603 | 0.7821 |
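The table shows validation loss reaching its minimum at epoch 2 while F1 keeps improving through epoch 4, so the "best" checkpoint depends on the selection metric. A short sketch that makes the comparison explicit, using the rows copied from the table above:

```python
# Rows copied from the training results table:
# (epoch, training_loss, validation_loss, f1)
RESULTS = [
    (1, 0.4, 0.2502, 0.7703),
    (2, 0.2227, 0.2439, 0.7740),
    (3, 0.1738, 0.2511, 0.7783),
    (4, 0.1474, 0.2603, 0.7821),
]

best_by_loss = min(RESULTS, key=lambda row: row[2])
best_by_f1 = max(RESULTS, key=lambda row: row[3])

print("lowest validation loss: epoch", best_by_loss[0])
print("highest F1: epoch", best_by_f1[0])
```

The loss and F1 reported at the top of this card (0.2603 and 0.7821) match the epoch 4 row.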
Framework versions
- Transformers 4.41.2
- Pytorch 2.3.0+cu121
- Datasets 2.19.2
- Tokenizers 0.19.1