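This model is a fine-tuned version of dbmdz/bert-base-turkish-cased on the winvoker/turkish-sentiment-analysis-dataset dataset. Evaluation results per checkpoint are listed in the table below; at the last logged step (1600) it reaches a validation loss of 0.0880, an accuracy of 0.9688, and a macro F1 of 0.9454.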
A BERT-based model (dbmdz Turkish BERT) fine-tuned on a large-scale Turkish sentiment analysis dataset. It classifies Turkish text into three sentiment classes: Negative, Notr (Neutral), and Positive.
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="kaixkhazaki/turkish-sentiment")

# "The shipment arrived late and the product did not really meet my expectations."
pipe("Kargo geç geldi ve ürün beklentimi pek karşılamadı.")
# [{'label': 'Negative', 'score': 0.984860897064209}]

# "The food was tasty but the service was slow and the staff were indifferent; I couldn't quite tell how I felt."
pipe("Yemek lezzetliydi ancak servis yavaş ve çalışanlar ilgisizdi, pek anlayamadım nasıl hissettiğimi.")
# [{'label': 'Notr', 'score': 0.9881975054740906}]

# "It was a truly amazing experience; I wish I could stay here forever."
pipe("Gerçekten müthiş bir deneyimdi, keşke hep burda kalabilsem.")
# [{'label': 'Positive', 'score': 0.9942901134490967}]
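Because the neutral class is emitted as `Notr`, downstream code may want to map the labels to its own names. The following is a minimal sketch, assuming the pipeline returns a list of `{'label', 'score'}` dicts as in the outputs above. `LABEL_MAP`, `normalize`, and the `min_score` threshold are hypothetical names introduced here for illustration; they are not part of the model or of the transformers API.

```python
# Hypothetical mapping from the model's labels to display names.
LABEL_MAP = {"Negative": "negative", "Notr": "neutral", "Positive": "positive"}


def normalize(predictions, min_score=0.5):
    """Convert pipeline-style predictions to (label, score) tuples.

    Predictions scoring below min_score are reported as "uncertain".
    Unknown labels are passed through unchanged.
    """
    results = []
    for pred in predictions:
        label = LABEL_MAP.get(pred["label"], pred["label"])
        score = float(pred["score"])
        if score < min_score:
            label = "uncertain"
        results.append((label, round(score, 4)))
    return results


# Output copied from the usage example above, so no model download is needed.
sample = [{"label": "Notr", "score": 0.9881975054740906}]
print(normalize(sample))  # [('neutral', 0.9882)]
```

In practice `normalize(pipe(text))` would be the typical call; the threshold value is arbitrary and should be tuned on held-out data if it is used at all.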
Fine-tuned on a combined dataset with 440,679 training samples and 48,965 validation samples.
Trained on the entire dataset on a single GPU for approximately 25 minutes (1,600 steps).
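The sample counts and the Epoch column of the results table can be related with a little arithmetic. The sketch below assumes the Epoch column is the fraction of one pass over the training split completed at each step. The effective batch size of 64 that it arrives at is an inference from those numbers and is not stated in the source.

```python
import math

train_samples = 440_679
val_samples = 48_965

# Share of the combined data held out for validation.
val_fraction = val_samples / (train_samples + val_samples)

# The results table reports epoch 0.2324 at step 1600.
steps, epoch_at_step = 1600, 0.2324
steps_per_epoch = steps / epoch_at_step           # optimizer steps in one full pass
inferred_batch = train_samples / steps_per_epoch  # inferred, not stated in the source

print(f"validation share: {val_fraction:.3f}")
print(f"steps per epoch: {steps_per_epoch:.0f}")
print(f"inferred effective batch size: {inferred_batch:.1f}")

# Cross-check: assuming batch size 64, recompute the Epoch column for each logged step.
check_steps_per_epoch = math.ceil(train_samples / 64)
epochs = [round(s / check_steps_per_epoch, 4) for s in (400, 800, 1200, 1600)]
print(epochs)  # [0.0581, 0.1162, 0.1743, 0.2324]
```

Under these assumptions, the 1,600 logged steps correspond to roughly 23% of one pass over the training split, and the validation split is about 10% of the combined data.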
The following hyperparameters were used during training:
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | F1 Weighted | Precision | Recall |
|---|---|---|---|---|---|---|---|---|
| 0.3538 | 0.0581 | 400 | 0.1162 | 0.9582 | 0.9243 | 0.9568 | 0.9572 | 0.9582 |
| 0.1131 | 0.1162 | 800 | 0.1034 | 0.9639 | 0.9369 | 0.9635 | 0.9633 | 0.9639 |
| 0.1026 | 0.1743 | 1200 | 0.0940 | 0.9649 | 0.9411 | 0.9652 | 0.9657 | 0.9649 |
| 0.0936 | 0.2324 | 1600 | 0.0880 | 0.9688 | 0.9454 | 0.9685 | 0.9683 | 0.9688 |
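In every row of the table the Recall value equals the Accuracy value. That is what support-weighted recall always gives, so the Precision and Recall columns are probably weighted averages, although the source does not state the averaging mode. The sketch below illustrates how macro and weighted averages differ, using plain Python and a small invented toy example; the data and the helper names `per_class_stats` and `summarize` are hypothetical and are not the author's evaluation code.

```python
def per_class_stats(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for every label seen."""
    stats = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        stats[c] = (precision, recall, f1, tp + fn)
    return stats


def summarize(y_true, y_pred):
    """Compute accuracy plus macro- and support-weighted averages."""
    stats = per_class_stats(y_true, y_pred)
    n = len(y_true)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / n,
        # Macro: every class counts equally, regardless of its size.
        "f1_macro": sum(s[2] for s in stats.values()) / len(stats),
        # Weighted: each class counts in proportion to its support.
        "f1_weighted": sum(s[2] * s[3] for s in stats.values()) / n,
        "recall_weighted": sum(s[1] * s[3] for s in stats.values()) / n,
    }


# Invented, imbalanced toy data: the single "Notr" example is misclassified.
y_true = ["Positive"] * 6 + ["Negative"] * 3 + ["Notr"]
y_pred = ["Positive"] * 6 + ["Negative"] * 2 + ["Positive"] * 2

metrics = summarize(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In the toy data, weighted recall matches accuracy while macro F1 falls well below weighted F1 because the smallest class is missed entirely. The table shows the same ordering (F1 Macro below F1 Weighted), which would be consistent with weaker performance on a less frequent class.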
@misc{turkish-sentiment,
title={Turkish Sentiment Analysis using Turkish BERT},
author={Fatih Demrici},
year={2025},
howpublished={\url{https://huggingface.co/kaixkhazaki/turkish-sentiment}},
}