---
base_model: meta-llama/Llama-2-7b-chat-hf
tags:
- generated_from_trainer
model-index:
- name: checkpoint-41_9k-lr5em5-bs15n5_r64LA128dt01_2811
  results: []
---

# checkpoint-41_9k-lr5em5-bs15n5_r64LA128dt01_2811

This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.8265
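
As a quick start, the snippet below sketches how a checkpoint like this is typically loaded for inference with `transformers`. The path is a placeholder; and if the checkpoint stores only LoRA adapter weights (the `r64LA128dt01` suffix suggests LoRA with r=64, alpha=128, dropout=0.1, though the card does not confirm this), the adapter would instead need to be attached to the base model via `peft.PeftModel.from_pretrained`.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path/repo id for this checkpoint.
ckpt = "checkpoint-41_9k-lr5em5-bs15n5_r64LA128dt01_2811"

tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(
    ckpt, torch_dtype=torch.float16, device_map="auto"
)

# Llama-2-chat instruction format.
prompt = "[INST] What does fine-tuning do? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```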

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch restating them follows the list):
- learning_rate: 5e-05
- train_batch_size: 5
- eval_batch_size: 8
- seed: 1
- gradient_accumulation_steps: 3
- total_train_batch_size: 15
- optimizer: Adam with betas=(0.9,0.95) and epsilon=2e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 300
- num_epochs: 3
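
For convenience, the sketch below restates these hyperparameters as a Hugging Face `TrainingArguments` object. This is a reconstruction from the list above, not the original training script; the output directory and any model/dataset wiring are placeholders.

```python
from transformers import TrainingArguments

# Reconstructed from the hyperparameter list above (a sketch, not the
# original training script). The output directory is a placeholder.
training_args = TrainingArguments(
    output_dir="checkpoint-41_9k-lr5em5-bs15n5_r64LA128dt01_2811",
    learning_rate=5e-5,
    per_device_train_batch_size=5,  # train_batch_size: 5
    per_device_eval_batch_size=8,   # eval_batch_size: 8
    seed=1,
    gradient_accumulation_steps=3,  # 5 * 3 = total train batch size 15
    adam_beta1=0.9,                 # Adam betas=(0.9, 0.95)
    adam_beta2=0.95,
    adam_epsilon=2e-8,
    lr_scheduler_type="cosine",
    warmup_steps=300,
    num_train_epochs=3,
)
```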

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.7717        | 0.16  | 450  | 0.9065          |
| 0.7716        | 0.32  | 900  | 0.8863          |
| 0.7856        | 0.48  | 1350 | 0.8679          |
| 0.7463        | 0.64  | 1800 | 0.8602          |
| 0.7298        | 0.81  | 2250 | 0.8531          |
| 0.7426        | 0.97  | 2700 | 0.8415          |
| 0.6991        | 1.13  | 3150 | 0.8425          |
| 0.6923        | 1.29  | 3600 | 0.8382          |
| 0.6917        | 1.45  | 4050 | 0.8361          |
| 0.6681        | 1.61  | 4500 | 0.8328          |
| 0.6764        | 1.77  | 4950 | 0.8297          |
| 0.6684        | 1.93  | 5400 | 0.8275          |
| 0.6395        | 2.09  | 5850 | 0.8268          |
| 0.6735        | 2.25  | 6300 | 0.8272          |
| 0.6416        | 2.42  | 6750 | 0.8275          |
| 0.651         | 2.58  | 7200 | 0.8265          |
| 0.6514        | 2.74  | 7650 | 0.8265          |
| 0.6535        | 2.9   | 8100 | 0.8265          |

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.1+cu121
- Datasets 2.4.0
- Tokenizers 0.15.0
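
To verify that a local environment matches the versions above, a quick check (a convenience sketch, not part of the original card):

```python
# Print installed versions; expected values come from the list above.
import transformers, torch, datasets, tokenizers

print("Transformers", transformers.__version__)  # expect 4.35.2
print("PyTorch", torch.__version__)              # expect 2.1.1+cu121
print("Datasets", datasets.__version__)          # expect 2.4.0
print("Tokenizers", tokenizers.__version__)      # expect 0.15.0
```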