---
license: mit
base_model: joeddav/xlm-roberta-large-xnli
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: xlm-roberta-large-xnli-v2.0
  results: []
---


# xlm-roberta-large-xnli-v2.0

This model is a fine-tuned version of [joeddav/xlm-roberta-large-xnli](https://huggingface.co/joeddav/xlm-roberta-large-xnli). The training dataset was not recorded when this card was generated.
It achieves the following results on the evaluation set:
- Loss: 0.3413
- F1 Macro: 0.8779
- F1 Micro: 0.8787
- Accuracy Balanced: 0.8773
- Accuracy: 0.8787
- Precision Macro: 0.8788
- Recall Macro: 0.8773
- Precision Micro: 0.8787
- Recall Micro: 0.8787
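
Several of the figures above coincide: F1 Micro, Accuracy, Precision Micro and Recall Micro are all 0.8787, and Accuracy Balanced equals Recall Macro (0.8773). This is expected when every sample carries exactly one label, because each misclassification counts as one false positive and one false negative, and balanced accuracy is defined as the mean per-class recall. The sketch below illustrates this with a made-up 3-class confusion matrix. The matrix and the `classification_metrics` helper are assumptions for illustration only, not data or code from this model's training run.

```python
def classification_metrics(confusion):
    """Compute micro/macro metrics from a square confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Per-class true positives, false negatives and false positives.
    tp = [confusion[i][i] for i in range(n)]
    fn = [sum(confusion[i]) - tp[i] for i in range(n)]
    fp = [sum(confusion[r][i] for r in range(n)) - tp[i] for i in range(n)]

    def safe_div(num, den):
        return num / den if den else 0.0

    precision = [safe_div(tp[i], tp[i] + fp[i]) for i in range(n)]
    recall = [safe_div(tp[i], tp[i] + fn[i]) for i in range(n)]
    f1 = [safe_div(2 * p * r, p + r) for p, r in zip(precision, recall)]

    # Micro averages pool the counts over all classes before dividing.
    micro_p = safe_div(sum(tp), sum(tp) + sum(fp))
    micro_r = safe_div(sum(tp), sum(tp) + sum(fn))

    return {
        "accuracy": safe_div(sum(tp), total),
        "accuracy_balanced": sum(recall) / n,  # mean per-class recall
        "precision_micro": micro_p,
        "recall_micro": micro_r,
        "f1_micro": safe_div(2 * micro_p * micro_r, micro_p + micro_r),
        "precision_macro": sum(precision) / n,
        "recall_macro": sum(recall) / n,
        "f1_macro": sum(f1) / n,
    }


# Hypothetical confusion matrix (rows: true class, columns: predicted class).
example_confusion = [
    [50, 5, 5],
    [4, 40, 6],
    [6, 4, 30],
]
metrics = classification_metrics(example_confusion)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this setting only the macro-averaged columns and the loss carry information beyond plain accuracy.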

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 9e-06
- train_batch_size: 8
- eval_batch_size: 64
- seed: 40
- gradient_accumulation_steps: 4
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.06
- num_epochs: 3
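
As a rough illustration of how these values interact, the sketch below derives the effective batch size and a linear learning-rate schedule with warmup in plain Python. Two inputs are assumptions rather than recorded values: `NUM_DEVICES = 1` (consistent with 8 × 4 = 32) and `TOTAL_STEPS = 3540`, a rough estimate inferred from the step and epoch columns of the training results table. The `linear_lr` helper is a hypothetical stand-in that mirrors the usual warmup-then-linear-decay shape; it is not the Trainer's own code.

```python
import math

# Values taken from the hyperparameter list.
PEAK_LR = 9e-06
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.06

# Assumptions, not recorded in this card.
NUM_DEVICES = 1      # single-device training
TOTAL_STEPS = 3540   # rough estimate of optimizer steps over 3 epochs

# Samples contributing to each optimizer update.
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES

# Number of steps spent ramping the learning rate up from zero.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step, peak_lr=PEAK_LR, warmup=warmup_steps, total=TOTAL_STEPS):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    return peak_lr * max(0.0, (total - step) / max(1, total - warmup))


print("effective batch size:", effective_batch_size)
print("warmup steps:", warmup_steps)
for step in (0, warmup_steps, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(f"step {step}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the learning rate peaks at 9e-06 once warmup ends and reaches zero at the final step.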

### Training results

| Training Loss | Epoch | Step | Validation Loss | F1 Macro | F1 Micro | Accuracy Balanced | Accuracy | Precision Macro | Recall Macro | Precision Micro | Recall Micro |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:-----------------:|:--------:|:---------------:|:------------:|:---------------:|:------------:|
| 0.4814        | 0.17  | 200  | 0.4554          | 0.7851   | 0.7867   | 0.7852            | 0.7867   | 0.7850          | 0.7852       | 0.7867          | 0.7867       |
| 0.4031        | 0.34  | 400  | 0.4020          | 0.8228   | 0.8237   | 0.8235            | 0.8237   | 0.8223          | 0.8235       | 0.8237          | 0.8237       |
| 0.3425        | 0.51  | 600  | 0.3603          | 0.8450   | 0.8454   | 0.8473            | 0.8454   | 0.8448          | 0.8473       | 0.8454          | 0.8454       |
| 0.3374        | 0.68  | 800  | 0.3520          | 0.8518   | 0.8523   | 0.8538            | 0.8523   | 0.8514          | 0.8538       | 0.8523          | 0.8523       |
| 0.326         | 0.85  | 1000 | 0.3386          | 0.8529   | 0.8544   | 0.8521            | 0.8544   | 0.8541          | 0.8521       | 0.8544          | 0.8544       |
| 0.3059        | 1.02  | 1200 | 0.3425          | 0.8643   | 0.8650   | 0.8651            | 0.8650   | 0.8637          | 0.8651       | 0.8650          | 0.8650       |
| 0.2563        | 1.19  | 1400 | 0.3234          | 0.8708   | 0.8719   | 0.8703            | 0.8719   | 0.8713          | 0.8703       | 0.8719          | 0.8719       |
| 0.252         | 1.36  | 1600 | 0.3487          | 0.8580   | 0.8581   | 0.8616            | 0.8581   | 0.8590          | 0.8616       | 0.8581          | 0.8581       |
| 0.2323        | 1.52  | 1800 | 0.3576          | 0.8648   | 0.8666   | 0.8630            | 0.8666   | 0.8681          | 0.8630       | 0.8666          | 0.8666       |
| 0.2669        | 1.69  | 2000 | 0.3888          | 0.8461   | 0.8502   | 0.8425            | 0.8502   | 0.8603          | 0.8425       | 0.8502          | 0.8502       |
| 0.2514        | 1.86  | 2200 | 0.3323          | 0.8742   | 0.8751   | 0.8743            | 0.8751   | 0.8740          | 0.8743       | 0.8751          | 0.8751       |
| 0.1999        | 2.03  | 2400 | 0.3649          | 0.8759   | 0.8767   | 0.8762            | 0.8767   | 0.8755          | 0.8762       | 0.8767          | 0.8767       |
| 0.1764        | 2.2   | 2600 | 0.3889          | 0.8695   | 0.8708   | 0.8685            | 0.8708   | 0.8709          | 0.8685       | 0.8708          | 0.8708       |
| 0.1729        | 2.37  | 2800 | 0.3741          | 0.8676   | 0.8687   | 0.8674            | 0.8687   | 0.8679          | 0.8674       | 0.8687          | 0.8687       |
| 0.159         | 2.54  | 3000 | 0.3844          | 0.8760   | 0.8767   | 0.8772            | 0.8767   | 0.8754          | 0.8772       | 0.8767          | 0.8767       |
| 0.178         | 2.71  | 3200 | 0.3771          | 0.8693   | 0.8708   | 0.8680            | 0.8708   | 0.8714          | 0.8680       | 0.8708          | 0.8708       |
| 0.1893        | 2.88  | 3400 | 0.3678          | 0.8722   | 0.8729   | 0.8730            | 0.8729   | 0.8717          | 0.8730       | 0.8729          | 0.8729       |

### Evaluation results

|Datasets|asadfgglie/nli-zh-tw-all/test|asadfgglie/BanBan_2024-10-17-facial_expressions-nli/test|eval_dataset|test_dataset|
| :---: | :---: | :---: | :---: | :---: |
|eval_loss|0.357|0.261|0.369|0.341|
|eval_f1_macro|0.872|0.919|0.874|0.878|
|eval_f1_micro|0.874|0.919|0.875|0.879|
|eval_accuracy_balanced|0.872|0.919|0.874|0.877|
|eval_accuracy|0.874|0.919|0.875|0.879|
|eval_precision_macro|0.873|0.919|0.874|0.879|
|eval_recall_macro|0.872|0.919|0.874|0.877|
|eval_precision_micro|0.874|0.919|0.875|0.879|
|eval_recall_micro|0.874|0.919|0.875|0.879|
|eval_runtime|50.977|0.625|11.165|44.322|
|eval_samples_per_second|166.741|1514.715|169.192|170.501|
|eval_steps_per_second|2.609|24.018|2.687|2.685|
|Size of dataset|8500|946|1889|7557|
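
The throughput rows can be cross-checked against the dataset sizes and runtimes in the same table. The sketch below assumes single-device evaluation with the `eval_batch_size` of 64 listed under the training hyperparameters, and counts the final partial batch as one step. The `throughput` helper is hypothetical. Because the reported runtimes are rounded to three decimals, small deviations from the table are expected, most visibly for the very short 0.625 s run.

```python
import math

EVAL_BATCH_SIZE = 64  # from the training hyperparameters

# (number of samples, eval_runtime in seconds), copied from the table.
runs = {
    "asadfgglie/nli-zh-tw-all/test": (8500, 50.977),
    "asadfgglie/BanBan_2024-10-17-facial_expressions-nli/test": (946, 0.625),
    "eval_dataset": (1889, 11.165),
    "test_dataset": (7557, 44.322),
}


def throughput(num_samples, runtime_s, batch_size=EVAL_BATCH_SIZE):
    """Return (samples per second, steps per second) for one evaluation run."""
    steps = math.ceil(num_samples / batch_size)  # last partial batch counts as a step
    return num_samples / runtime_s, steps / runtime_s


results = {name: throughput(n, t) for name, (n, t) in runs.items()}

for name, (samples_per_s, steps_per_s) in results.items():
    print(f"{name}: {samples_per_s:.3f} samples/s, {steps_per_s:.3f} steps/s")
```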

### Framework versions

- Transformers 4.33.3
- PyTorch 2.5.1+cu121
- Datasets 2.14.7
- Tokenizers 0.13.3