---
base_model: meta-llama/Llama-2-7b-chat-hf
tags:
- generated_from_trainer
model-index:
- name: checkpoint-41_9k-lr5em5-bs15n5_r64LA128dt01_2811
  results: []
---

# checkpoint-41_9k-lr5em5-bs15n5_r64LA128dt01_2811

This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.8265

A minimal loading sketch appears under "Example usage" at the end of this card.

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch at the end of this card):
- learning_rate: 5e-05
- train_batch_size: 5
- eval_batch_size: 8
- seed: 1
- gradient_accumulation_steps: 3
- total_train_batch_size: 15
- optimizer: Adam with betas=(0.9,0.95) and epsilon=2e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 300
- num_epochs: 3

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.7717        | 0.16  | 450  | 0.9065          |
| 0.7716        | 0.32  | 900  | 0.8863          |
| 0.7856        | 0.48  | 1350 | 0.8679          |
| 0.7463        | 0.64  | 1800 | 0.8602          |
| 0.7298        | 0.81  | 2250 | 0.8531          |
| 0.7426        | 0.97  | 2700 | 0.8415          |
| 0.6991        | 1.13  | 3150 | 0.8425          |
| 0.6923        | 1.29  | 3600 | 0.8382          |
| 0.6917        | 1.45  | 4050 | 0.8361          |
| 0.6681        | 1.61  | 4500 | 0.8328          |
| 0.6764        | 1.77  | 4950 | 0.8297          |
| 0.6684        | 1.93  | 5400 | 0.8275          |
| 0.6395        | 2.09  | 5850 | 0.8268          |
| 0.6735        | 2.25  | 6300 | 0.8272          |
| 0.6416        | 2.42  | 6750 | 0.8275          |
| 0.651         | 2.58  | 7200 | 0.8265          |
| 0.6514        | 2.74  | 7650 | 0.8265          |
| 0.6535        | 2.9   | 8100 | 0.8265          |

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.1+cu121
- Datasets 2.4.0
- Tokenizers 0.15.0
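
### Hyperparameters as `TrainingArguments` (sketch)

The hyperparameter list above maps directly onto Hugging Face `TrainingArguments` fields (the `generated_from_trainer` tag indicates the run used the `Trainer` API). A minimal sketch against Transformers 4.35.2; the output path is a placeholder, and the eval/logging cadence of 450 steps is inferred from the results table rather than documented:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="checkpoint-41_9k-lr5em5-bs15n5_r64LA128dt01_2811",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=5,
    per_device_eval_batch_size=8,
    seed=1,
    gradient_accumulation_steps=3,  # 5 per device x 3 steps = total train batch size 15
    adam_beta1=0.9,                 # "Adam with betas=(0.9,0.95) and epsilon=2e-08"
    adam_beta2=0.95,
    adam_epsilon=2e-8,
    lr_scheduler_type="cosine",
    warmup_steps=300,
    num_train_epochs=3,
    evaluation_strategy="steps",    # inferred: validation loss is reported every 450 steps
    eval_steps=450,
    logging_steps=450,
)
```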
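
## Example usage

Usage is not documented on this card yet. Below is a minimal loading sketch, assuming the checkpoint is published as a standalone causal-LM repository; the repo id is a placeholder for wherever the weights actually live. The checkpoint name (`r64LA128dt01`) hints at a LoRA run (r=64, alpha=128, dropout=0.1); if the repo holds only adapter weights, load it with `peft.AutoPeftModelForCausalLM` instead.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "checkpoint-41_9k-lr5em5-bs15n5_r64LA128dt01_2811"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # a 7B model in fp16 fits on a single ~16 GB GPU
    device_map="auto",
)

# Llama-2-chat expects the [INST] ... [/INST] prompt format; apply_chat_template
# (available in Transformers 4.35) builds it from the tokenizer configuration.
messages = [{"role": "user", "content": "Explain gradient accumulation in one paragraph."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```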