Fine-tuning
- This model was trained to classify whether an input text is a "chosen" sentence or a "rejected" sentence
- The probability of each label (the softmax of the last layer's logits) can be used to quantify how strongly a user input is preferred
- Fine-tuned from studio-ousia/mluke-large-lite via full-parameter tuning on open-preference-v0.3
- Trained in bf16 precision
- Label 0 stands for a rejected sentence
- Label 1 stands for a chosen sentence
- Note that this model can handle at most 512 tokens
- This limit is inherited from the LUKE-based pre-trained model
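The scoring described above can be sketched as follows. This is a minimal, hedged example rather than the author's own inference code: the repository id is assumed from the model tree shown on this page, the helper names `chosen_probability` and `score_text` are hypothetical, and the label order (0 = rejected, 1 = chosen) is taken from the list above.

```python
import math

# Assumption: repository id taken from the model tree on this page.
MODEL_ID = "ryota39/luke-japanese-base-lite-reward"
MAX_TOKENS = 512  # input limit stated in this card


def chosen_probability(logits):
    """Softmax over [rejected_logit, chosen_logit]; returns P(label 1 = chosen)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)


def score_text(text, model_id=MODEL_ID):
    """Hypothetical helper: returns the 'chosen' probability for one text."""
    # Heavy imports are deferred so the softmax helper works without them.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Truncate to the 512-token limit instead of failing on long inputs.
    inputs = tokenizer(
        text, truncation=True, max_length=MAX_TOKENS, return_tensors="pt"
    )
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return chosen_probability(logits)
```

Calling `score_text("...")` downloads the weights on first use and returns a value between 0 and 1; values closer to 1 mean the text looks more like a chosen sentence.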
Metric
- train and validation split
train loss | eval loss | accuracy | recall | precision | f1-score |
---|---|---|---|---|---|
0.1427 | 0.2009 | 0.9282 | 0.9383 | 0.9198 | 0.9290 |
- test split
accuracy | recall | precision | f1-score |
---|---|---|---|
0.9310 | 0.9199 | 0.9408 | 0.9302 |
- confusion matrix on the test split
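As a quick consistency check on the tables above, the F1 score is the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision and recall; the function name `f1_score` is a hypothetical helper, not part of the training code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in this card.
validation_f1 = f1_score(precision=0.9198, recall=0.9383)
test_f1 = f1_score(precision=0.9408, recall=0.9199)

print(round(validation_f1, 4))  # 0.929
print(round(test_f1, 4))        # 0.9302
```

Both values agree with the reported F1 scores to four decimal places.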
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
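To illustrate the `linear` scheduler setting, here is a minimal sketch of how the learning rate would decay over training. It assumes no warmup steps (this card does not list a warmup setting) and uses the 4437 total steps reported in the training results; `linear_lr` is a hypothetical helper, not the scheduler implementation used by the Trainer.

```python
BASE_LR = 5e-05          # learning_rate from the list above
STEPS_PER_EPOCH = 1479   # from the training results table
NUM_EPOCHS = 3
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 4437


def linear_lr(step):
    """Learning rate after `step` optimizer steps, assuming zero warmup."""
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * (remaining / TOTAL_STEPS)


for epoch in range(NUM_EPOCHS + 1):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch}: step {step}, lr {linear_lr(step):.3e}")
```

Under these assumptions the learning rate starts at 5e-05, falls to about two thirds of that after the first epoch, and reaches zero at step 4437.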
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
---|---|---|---|---|---|---|---|
0.316 | 1.0 | 1479 | 0.2245 | 0.9127 | 0.9027 | 0.9251 | 0.9138 |
0.1696 | 2.0 | 2958 | 0.1869 | 0.9308 | 0.9234 | 0.9395 | 0.9314 |
0.1427 | 3.0 | 4437 | 0.2009 | 0.9283 | 0.9198 | 0.9384 | 0.9290 |
Framework versions
- Transformers 4.42.3
- Pytorch 2.1.0+cu118
- Datasets 2.20.0
- Tokenizers 0.19.1
Model tree for ryota39/luke-japanese-base-lite-reward
Base model
studio-ousia/luke-japanese-base-lite