turkish-sentiment

This model is a fine-tuned version of dbmdz/bert-base-turkish-cased on the winvoker/turkish-sentiment-analysis-dataset dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0880
  • Accuracy: 0.9688
  • F1 Macro: 0.9454
  • F1 Weighted: 0.9685
  • Precision: 0.9683
  • Recall: 0.9688
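
The gap between F1 Macro and F1 Weighted above is the kind of difference that appears when classes are unevenly represented: macro averaging gives every class the same weight, while weighted averaging scales each class by its support. The sketch below illustrates this in plain Python on invented toy labels (not the evaluation data); the function names `per_class_stats` and `summarize` are hypothetical. It also shows that support-weighted recall equals accuracy, which would be consistent with the identical Accuracy and Recall values reported, although the card does not state which averaging mode was used.

```python
def per_class_stats(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for each label in y_true."""
    stats = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        stats[label] = (precision, recall, f1, tp + fn)
    return stats


def summarize(y_true, y_pred):
    """Compute accuracy plus macro- and support-weighted averages."""
    stats = per_class_stats(y_true, y_pred)
    total = len(y_true)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / total,
        "f1_macro": sum(s[2] for s in stats.values()) / len(stats),
        "f1_weighted": sum(s[2] * s[3] for s in stats.values()) / total,
        "recall_weighted": sum(s[1] * s[3] for s in stats.values()) / total,
    }


# Toy, deliberately imbalanced labels (invented for illustration only).
y_true = ["Positive"] * 6 + ["Negative"] * 3 + ["Notr"] * 1
y_pred = ["Positive"] * 6 + ["Negative"] * 3 + ["Positive"] * 1

result = summarize(y_true, y_pred)
print(result)
```

In this toy case the single rare-class mistake pulls the macro F1 well below the weighted F1, while weighted recall and accuracy coincide.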

Model description

A BERT-based model (dbmdz Turkish BERT) fine-tuned on a large-scale Turkish sentiment analysis dataset. It classifies Turkish text into three sentiment classes: Negative, Notr (neutral), and Positive.

  • Model type: BertForSequenceClassification
  • Base model: dbmdz/bert-base-turkish-cased
  • Language(s): Turkish

Intended uses & limitations

  • Turkish text classification tasks involving sentiment analysis.
  • Suitable for social media data, product reviews, or general-purpose sentiment detection in Turkish.

Usage

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="kaixkhazaki/turkish-sentiment")


pipe("Kargo geç geldi ve ürün beklentimi pek karşılamadı.")
# >> [{'label': 'Negative', 'score': 0.984860897064209}]

pipe("Yemek lezzetliydi ancak servis yavaş ve çalışanlar ilgisizdi, pek anlayamadım nasıl hissettiğimi.")
# >> [{'label': 'Notr', 'score': 0.9881975054740906}]

pipe("Gerçekten müthiş bir deneyimdi, keşke hep burda kalabilsem.")
# >> [{'label': 'Positive', 'score': 0.9942901134490967}]
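
Each pipeline result above is a dict with a `label` and a `score`. A minimal sketch of post-processing such results is shown here, assuming a downstream application wants lowercase English label names and a fallback for low-confidence predictions. The `interpret` helper, the `LABEL_NAMES` mapping, and the 0.6 threshold are all hypothetical choices for illustration, not part of the model or of Transformers.

```python
# Hypothetical display names; "Notr" is the model's label for neutral text.
LABEL_NAMES = {"Negative": "negative", "Notr": "neutral", "Positive": "positive"}


def interpret(prediction, min_score=0.6):
    """Map one pipeline result dict to a display name.

    Returns "uncertain" when the score is below min_score
    (an arbitrary threshold chosen for this sketch).
    """
    if prediction["score"] < min_score:
        return "uncertain"
    return LABEL_NAMES.get(prediction["label"], prediction["label"])


# Results copied from the usage examples above.
examples = [
    {"label": "Negative", "score": 0.984860897064209},
    {"label": "Notr", "score": 0.9881975054740906},
    {"label": "Positive", "score": 0.9942901134490967},
]

readable = [interpret(p) for p in examples]
print(readable)
```

With a real pipeline, the same helper could be applied to each element of the list that `pipe(...)` returns.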

Training and evaluation data

Fine-tuned on a combined dataset with 440,679 training samples and 48,965 validation samples.

Training procedure

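Trained using the entire dataset on a single GPU for approximately 25 minutes (1,600 steps). According to the Epoch column of the training results, these 1,600 steps cover about 0.23 of one pass over the training data.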

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 64
  • eval_batch_size: 128
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 400
  • training_steps: 1600
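
To make the scheduler settings above concrete, the following sketch computes the learning rate at a given step. It assumes linear warmup from zero followed by a half-cosine decay to zero, which is the usual shape of a cosine schedule with warmup; the card does not spell out the exact shape, so treat this as an approximation. The function name `lr_at` is hypothetical.

```python
import math

LEARNING_RATE = 3e-5   # learning_rate from the list above
WARMUP_STEPS = 400     # lr_scheduler_warmup_steps
TRAINING_STEPS = 1600  # training_steps


def lr_at(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TRAINING_STEPS):
    """Learning rate at `step`, assuming linear warmup then half-cosine decay."""
    if step < warmup:
        # Linear ramp from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for step in (0, 200, 400, 1000, 1600):
    print(step, lr_at(step))
```

Under these assumptions the rate peaks at 3e-05 at step 400, falls to half of that at step 1000, and reaches zero at step 1600.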

Training results

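| Training Loss | Epoch  | Step | Validation Loss | Accuracy | F1 Macro | F1 Weighted | Precision | Recall |
|---------------|--------|------|-----------------|----------|----------|-------------|-----------|--------|
| 0.3538        | 0.0581 | 400  | 0.1162          | 0.9582   | 0.9243   | 0.9568      | 0.9572    | 0.9582 |
| 0.1131        | 0.1162 | 800  | 0.1034          | 0.9639   | 0.9369   | 0.9635      | 0.9633    | 0.9639 |
| 0.1026        | 0.1743 | 1200 | 0.0940          | 0.9649   | 0.9411   | 0.9652      | 0.9657    | 0.9649 |
| 0.0936        | 0.2324 | 1600 | 0.0880          | 0.9688   | 0.9454   | 0.9685      | 0.9683    | 0.9688 |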
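
The Epoch values in these results can be reproduced from the step count, the training batch size (64), and the training set size (440,679 samples). The short calculation below assumes a single device and no gradient accumulation, so that each step consumes exactly one batch; the name `epoch_fraction` is hypothetical.

```python
TRAIN_SAMPLES = 440_679  # training samples reported in this card
TRAIN_BATCH_SIZE = 64    # train_batch_size reported in this card


def epoch_fraction(step, batch_size=TRAIN_BATCH_SIZE, n_samples=TRAIN_SAMPLES):
    """Fraction of one pass over the training set completed after `step` steps.

    Assumes one batch per step (single device, no gradient accumulation).
    """
    return step * batch_size / n_samples


for step in (400, 800, 1200, 1600):
    samples_seen = step * TRAIN_BATCH_SIZE
    print(step, samples_seen, round(epoch_fraction(step), 4))
```

At step 1600 the model has seen 102,400 samples, roughly 23% of the training set.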

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.4.1+cu121
  • Datasets 3.1.0
  • Tokenizers 0.21.0

Citation

@misc{turkish-sentiment,
  title={Turkish Sentiment Analysis using Turkish BERT},
  author={Fatih Demrici},
  year={2025},
  howpublished={\url{https://huggingface.co/kaixkhazaki/turkish-sentiment}},
}