UrduClassification

This model is a fine-tuned version of urduhack/roberta-urdu-small on the imdb_urdu_reviews dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4703

Model Details

  • Model Name: Urdu Sentiment Classification
  • Model Architecture: RobertaForSequenceClassification
  • Base Model: urduhack/roberta-urdu-small
  • Dataset: IMDB Urdu Reviews
  • Task: Sentiment Classification (Positive/Negative)
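
For orientation, here is a minimal inference sketch. It assumes the checkpoint is published under the repository id mwz/UrduClassification (the id shown on the model page) and that the transformers library with a PyTorch backend is installed. The helper name classify_reviews is hypothetical. The card does not say which label id means positive, so the returned label strings should be checked against the model's config before use.

```python
from typing import Dict, List

MODEL_ID = "mwz/UrduClassification"  # assumed repo id, taken from the model page
MAX_LENGTH = 256  # same truncation length the card reports for training


def classify_reviews(texts: List[str]) -> List[Dict[str, object]]:
    """Return one {'label': ..., 'score': ...} dict per input review."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=MODEL_ID)
    # Tokenizer keyword arguments are forwarded by the pipeline,
    # so long reviews are truncated to MAX_LENGTH tokens.
    return classifier(texts, truncation=True, max_length=MAX_LENGTH)
```

Calling classify_reviews(["یہ فلم بہت اچھی تھی"]) downloads the checkpoint on first use and returns a list with one label/score dictionary.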

Training Procedure

The model was fine-tuned using the transformers library and the Trainer class from Hugging Face. The training process involved the following steps:

  1. Tokenization: The input Urdu text was tokenized using the RobertaTokenizerFast from the "urduhack/roberta-urdu-small" pre-trained model. The texts were padded and truncated to a maximum length of 256 tokens.

  2. Model Architecture: The "urduhack/roberta-urdu-small" pre-trained model was loaded as the base model for sequence classification using the RobertaForSequenceClassification class.

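  3. Training Arguments: The training arguments were configured for 3 epochs, a batch size of 16 for both training and evaluation, a learning rate of 5e-05 with a linear schedule and 500 warmup steps, and evaluation after each epoch. The full list is given under Training hyperparameters.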

  4. Training: The model was trained on the training dataset with the Trainer class, using the Adam optimizer to minimize the cross-entropy loss between predicted and actual sentiment labels.

  5. Evaluation: After each epoch, the model was evaluated on the validation dataset. Training loss and validation loss were logged at each evaluation for later analysis.

  6. Fine-Tuning: Across the 3 epochs, the pre-trained weights were updated to adapt the model to the IMDb Urdu movie reviews sentiment analysis task.
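
The six steps above can be sketched with the Trainer API as follows. This is an illustration under stated assumptions, not the author's training script: the helper name build_trainer, the output_dir value, and the text_column parameter are hypothetical, and the card does not say how the dataset was split. Hyperparameter values are copied from the Training hyperparameters section. The argument name evaluation_strategy matches Transformers 4.30.2; later releases rename it to eval_strategy.

```python
BASE_MODEL = "urduhack/roberta-urdu-small"
MAX_LENGTH = 256

# Values reported in the card's "Training hyperparameters" section.
# The reported Adam betas and epsilon equal the TrainingArguments defaults.
HYPERPARAMS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "warmup_steps": 500,
    "num_train_epochs": 3,
}


def build_trainer(train_ds, eval_ds, text_column, output_dir="urdu-sentiment"):
    """Build a Trainer for binary sentiment classification.

    train_ds and eval_ds are datasets.Dataset objects holding raw text in
    `text_column` (hypothetical name) and integer class ids in `label`.
    """
    # Lazy imports: nothing is downloaded until this function is called.
    from transformers import (
        RobertaForSequenceClassification,
        RobertaTokenizerFast,
        Trainer,
        TrainingArguments,
    )

    # Step 1: tokenize, padding and truncating to 256 tokens.
    tokenizer = RobertaTokenizerFast.from_pretrained(BASE_MODEL)

    def tokenize(batch):
        return tokenizer(
            batch[text_column],
            padding="max_length",
            truncation=True,
            max_length=MAX_LENGTH,
        )

    train_ds = train_ds.map(tokenize, batched=True)
    eval_ds = eval_ds.map(tokenize, batched=True)

    # Step 2: base encoder plus a classification head with two classes.
    model = RobertaForSequenceClassification.from_pretrained(BASE_MODEL, num_labels=2)

    # Step 3: training arguments; evaluate and log once per epoch (step 5).
    args = TrainingArguments(
        output_dir=output_dir,
        evaluation_strategy="epoch",
        logging_strategy="epoch",
        **HYPERPARAMS,
    )

    # Step 4: the Trainer minimizes the cross-entropy loss returned by the model.
    return Trainer(model=model, args=args, train_dataset=train_ds, eval_dataset=eval_ds)
```

Calling build_trainer(...).train() would then run the fine-tuning loop; the returned Trainer also exposes evaluate() for the validation pass.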

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
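
To make the scheduler settings concrete, the sketch below reproduces the shape of a linear schedule with warmup in plain Python: the rate rises linearly from 0 to the peak learning rate over the first 500 steps, then decays linearly to 0 at the final step. The total of 7500 steps is taken from the Training results section. The function name linear_warmup_lr is hypothetical.

```python
PEAK_LR = 5e-5
WARMUP_STEPS = 500
TOTAL_STEPS = 7500  # 3 epochs x 2500 steps, per the training results


def linear_warmup_lr(step: int) -> float:
    """Learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Warmup: scale from 0 up to the peak rate.
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    # Decay: scale from the peak rate down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 250, 500, 4000, 7500):
    print(step, linear_warmup_lr(step))
```

With these settings the peak rate of 5e-05 is reached at step 500, which is one fifth of the way through the first epoch.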

Training results

Training Loss   Epoch   Step   Validation Loss
0.4078          1.0     2500   0.3954
0.2633          2.0     5000   0.4007
0.1205          3.0     7500   0.4703
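
A short sketch for reading the results above. It picks the epoch with the lowest validation loss, converts each validation loss to exp(-loss) (the geometric-mean probability the model assigns to the correct class), and estimates the training-set size from the step counts. The size estimate assumes a single device and no gradient accumulation, neither of which the card states.

```python
import math

# (epoch, step, training loss, validation loss) rows from the results above.
ROWS = [
    (1.0, 2500, 0.4078, 0.3954),
    (2.0, 5000, 0.2633, 0.4007),
    (3.0, 7500, 0.1205, 0.4703),
]
TRAIN_BATCH_SIZE = 16

# Epoch whose checkpoint generalizes best by validation loss.
best = min(ROWS, key=lambda row: row[3])
print("lowest validation loss at epoch", best[0])

for epoch, _, train_loss, val_loss in ROWS:
    # exp(-cross_entropy) is the geometric mean of p(correct class).
    mean_prob = math.exp(-val_loss)
    print(epoch, round(mean_prob, 3), "gap:", round(val_loss - train_loss, 4))

# Reference point: always predicting 50/50 on two classes gives ln(2).
chance_loss = math.log(2)
print("chance-level loss:", round(chance_loss, 3))

# Upper estimate of the training-set size (the last batch may be partial).
steps_per_epoch = ROWS[0][1]
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE
print("approx. training examples:", approx_train_examples)
```

Training loss keeps falling while validation loss rises after epoch 1, the usual sign of overfitting, so the epoch 1 checkpoint has the lowest validation loss of the three.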

Evaluation Results

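The model was evaluated on the sentiment classification task using an evaluation set whose exact split is not disclosed. The results reported after 3 epochs of fine-tuning are as follows:

  • Evaluation Loss: 0.3954 (equal to the epoch 1 validation loss under Training results; the epoch 3 validation loss there is 0.4703)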
  • Evaluation Runtime: 51.60 seconds
  • Average Samples per Second: 96.89
  • Average Steps per Second: 6.06
  • Epoch: 3.0
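
The throughput figures above can be cross-checked with simple arithmetic: runtime times samples per second gives the approximate evaluation-set size, and dividing that by the eval batch size of 16 should agree with runtime times steps per second. The variable names are illustrative.

```python
import math

RUNTIME_S = 51.60
SAMPLES_PER_S = 96.89
STEPS_PER_S = 6.06
EVAL_BATCH_SIZE = 16

approx_samples = RUNTIME_S * SAMPLES_PER_S  # ~ number of evaluation examples
approx_steps = RUNTIME_S * STEPS_PER_S      # ~ number of evaluation batches

# Batches needed to cover the estimated example count at batch size 16.
expected_steps = math.ceil(round(approx_samples) / EVAL_BATCH_SIZE)

print(round(approx_samples), round(approx_steps), expected_steps)
```

The figures are mutually consistent: they point to roughly 5,000 evaluation examples processed in about 313 batches.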

Framework versions

  • Transformers 4.30.2
  • PyTorch 2.0.0
  • Datasets 2.1.0
  • Tokenizers 0.13.3

Model size

  • Parameters: 126M
  • Tensor types: I64, F32
  • Weights format: Safetensors