pbevan11 committed
Commit dd54fd4
1 Parent(s): a228191

End of training

Files changed (2):
  1. README.md +13 -8
  2. adapter_model.bin +1 -1
README.md CHANGED
@@ -34,7 +34,7 @@ datasets:
   - path: ft_data/alpaca_data.jsonl
     type: alpaca
 dataset_prepared_path: last_run_prepared
-val_set_size: 0.1
+val_set_size: 0.05
 output_dir: ./qlora-alpaca-out
 hub_model_id: pbevan11/llama-3.1-8b-ocr-correction
 
@@ -43,7 +43,6 @@ lora_model_dir:
 
 sequence_len: 8192
 sample_packing: true
-eval_sample_packing: false
 pad_to_sequence_len: true
 
 lora_r: 32
@@ -104,12 +103,12 @@ special_tokens:
 
 </details><br>
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/sncds/ocr-ft/runs/1bdo6czg)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/sncds/ocr-ft/runs/rotjhntf)
 # llama-3.1-8b-ocr-correction
 
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6569
+- Loss: 0.1901
 
 ## Model description
 
@@ -141,10 +140,16 @@ The following hyperparameters were used during training:
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 0.6009        | 0.8   | 1    | 0.6584          |
-| 0.5865        | 1.2   | 2    | 0.6569          |
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 0.61          | 0.0331 | 1    | 0.6018          |
+| 0.4379        | 0.2645 | 8    | 0.4256          |
+| 0.2531        | 0.5289 | 16   | 0.2714          |
+| 0.2366        | 0.7934 | 24   | 0.2247          |
+| 0.1839        | 1.0331 | 32   | 0.2053          |
+| 0.1752        | 1.2975 | 40   | 0.1961          |
+| 0.1629        | 1.5620 | 48   | 0.1909          |
+| 0.163         | 1.8264 | 56   | 0.1901          |
 
 
 ### Framework versions
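The `val_set_size` change in this diff shrinks the held-out evaluation split from 10% to 5% of the training data. As a rough illustration of what such a fractional split means, here is a minimal sketch with a hypothetical `split_dataset` helper; axolotl's actual implementation differs (it delegates to the `datasets` library and typically shuffles before splitting):

```python
def split_dataset(examples, val_set_size):
    # Hold out the final `val_set_size` fraction of examples for evaluation.
    # Hypothetical sketch only, not axolotl's real split logic.
    n_val = max(1, round(len(examples) * val_set_size))
    return examples[:-n_val], examples[-n_val:]

# With 1000 examples and val_set_size 0.05, 50 examples are held out.
train, val = split_dataset(list(range(1000)), 0.05)
```

A smaller split leaves more data for training, which matters here given the short (roughly two-epoch) run shown in the training-results table.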
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:70793268eb8758091b13b71684afe79b089418ec553c562007dcc418159b7a79
+oid sha256:befe7ee91cb8ab62450880c1dabf645b053b56d4e5b4cf5a4776e29329224eeb
 size 167934026
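The `adapter_model.bin` entry above is a Git LFS pointer file, not the adapter weights themselves: the repo stores only the object's SHA-256 and byte size, so the diff changes the `oid` line while the size stays identical. A minimal sketch of reading such a pointer (hypothetical `parse_lfs_pointer` helper, not part of git-lfs):

```python
def parse_lfs_pointer(text):
    # A Git LFS pointer is a short text file of "key value" lines,
    # e.g. version, oid, and size. Parse it into a dict.
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:befe7ee91cb8ab62450880c1dabf645b053b56d4e5b4cf5a4776e29329224eeb\n"
    "size 167934026\n"
)
info = parse_lfs_pointer(pointer)
```

The unchanged `size 167934026` is expected: retraining the same LoRA configuration (same `lora_r`, same target modules) produces a tensor file of identical shape, so only the content hash moves.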