---
license: other
base_model: meta-llama/Meta-Llama-3-8B
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: C013_llama3-8b-base_pretrain_20240428_005832
  results: []
---

# C013_llama3-8b-base_pretrain_20240428_005832

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) on the C013_data dataset.
It achieves the following results on the evaluation set:
- Loss: 1.5943

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1.5e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 64
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_steps: 20
- num_epochs: 4.0
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.7594        | 0.0149 | 1    | 1.7163          |
| 1.7333        | 0.0746 | 5    | 1.7008          |
| 1.6854        | 0.1493 | 10   | 1.6825          |
| 1.6897        | 0.2239 | 15   | 1.6701          |
| 1.6656        | 0.2985 | 20   | 1.6651          |
| 1.7254        | 0.3731 | 25   | 1.6679          |
| 1.7178        | 0.4478 | 30   | 1.6542          |
| 1.6656        | 0.5224 | 35   | 1.6459          |
| 1.6647        | 0.5970 | 40   | 1.6308          |
| 1.6645        | 0.6716 | 45   | 1.6205          |
| 1.6151        | 0.7463 | 50   | 1.6129          |
| 1.6359        | 0.8209 | 55   | 1.6052          |
| 1.5885        | 0.8955 | 60   | 1.5995          |
| 1.6142        | 0.9701 | 65   | 1.5943          |
| 1.4875        | 1.0448 | 70   | 1.5963          |
| 1.3844        | 1.1194 | 75   | 1.6118          |
| 1.3555        | 1.1940 | 80   | 1.6069          |
| 1.3597        | 1.2687 | 85   | 1.6040          |
| 1.3737        | 1.3433 | 90   | 1.6071          |
| 1.3492        | 1.4179 | 95   | 1.6074          |
| 1.3826        | 1.4925 | 100  | 1.6055          |
| 1.3533        | 1.5672 | 105  | 1.6035          |
| 1.3611        | 1.6418 | 110  | 1.6023          |
| 1.328         | 1.7164 | 115  | 1.6022          |
| 1.3443        | 1.7910 | 120  | 1.6026          |
| 1.3386        | 1.8657 | 125  | 1.6029          |
| 1.3396        | 1.9403 | 130  | 1.6029          |
| 1.3573        | 2.0149 | 135  | 1.6029          |
| 1.3754        | 2.0896 | 140  | 1.6034          |
| 1.3229        | 2.1642 | 145  | 1.6044          |
| 1.3194        | 2.2388 | 150  | 1.6055          |
| 1.3361        | 2.3134 | 155  | 1.6065          |
| 1.3231        | 2.3881 | 160  | 1.6072          |
| 1.32          | 2.4627 | 165  | 1.6076          |
| 1.3406        | 2.5373 | 170  | 1.6078          |
| 1.3184        | 2.6119 | 175  | 1.6079          |
| 1.2745        | 2.6866 | 180  | 1.6080          |
| 1.3024        | 2.7612 | 185  | 1.6079          |
| 1.3243        | 2.8358 | 190  | 1.6079          |
| 1.3239        | 2.9104 | 195  | 1.6080          |
| 1.3349        | 2.9851 | 200  | 1.6081          |
| 1.337         | 3.0597 | 205  | 1.6079          |
| 1.3091        | 3.1343 | 210  | 1.6078          |
| 1.3266        | 3.2090 | 215  | 1.6079          |
| 1.3014        | 3.2836 | 220  | 1.6083          |
| 1.3153        | 3.3582 | 225  | 1.6086          |
| 1.3192        | 3.4328 | 230  | 1.6090          |
| 1.315         | 3.5075 | 235  | 1.6093          |
| 1.3047        | 3.5821 | 240  | 1.6093          |
| 1.3208        | 3.6567 | 245  | 1.6093          |
| 1.362         | 3.7313 | 250  | 1.6093          |
| 1.3255        | 3.8060 | 255  | 1.6091          |
| 1.2941        | 3.8806 | 260  | 1.6089          |
| 1.3254        | 3.9552 | 265  | 1.6086          |

### Framework versions

- Transformers 4.40.0
- Pytorch 2.1.2+cu121
- Datasets 2.18.0
- Tokenizers 0.19.1
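
As a convenience, the hyperparameters listed above map onto the Transformers `TrainingArguments` API roughly as shown below. This is a hedged reconstruction, not the original launch configuration: the run was driven by LLaMA-Factory, and the `output_dir` value and the AMP dtype are illustrative assumptions.

```python
# Sketch of the reported training configuration in Transformers terms.
# Per-device batch sizes follow from the reported totals:
# 64 total / 8 GPUs = 8 per device (train), 128 / 8 = 16 per device (eval).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="C013_llama3-8b-base_pretrain_20240428_005832",  # illustrative
    num_train_epochs=4.0,
    learning_rate=1.5e-5,
    lr_scheduler_type="polynomial",
    warmup_steps=20,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,        # Adam betas=(0.9, 0.999) are the Transformers defaults
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,             # "Native AMP"; the original run may have used bf16 instead
)
```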
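
## How to use

A minimal inference sketch with Transformers follows. The model id is assumed to match this card's name; substitute the actual Hub id or local checkpoint path. Since this is a further-pretrained base model rather than an instruction-tuned one, plain text completion is the natural interface.

```python
# Minimal inference sketch; the model id below is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "C013_llama3-8b-base_pretrain_20240428_005832"  # hypothetical id/path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # an 8B model in half precision needs ~16 GB of GPU memory
    device_map="auto",
)

prompt = "The main stages of pretraining a language model are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=48)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```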