pbevan11/llama-3.1-8b-ocr-correction

Generated from Trainer

4-bit precision

pbevan11 committed on Jul 26

Commit dd54fd4 • 1 parent: a228191

End of training

Files changed (2)

README.md +13 -8
adapter_model.bin +1 -1

README.md CHANGED

@@ -34,7 +34,7 @@ datasets:
   - path: ft_data/alpaca_data.jsonl
     type: alpaca
 dataset_prepared_path: last_run_prepared
-val_set_size: 0.1
+val_set_size: 0.05
 output_dir: ./qlora-alpaca-out
 hub_model_id: pbevan11/llama-3.1-8b-ocr-correction
@@ -43,7 +43,6 @@ lora_model_dir:
 sequence_len: 8192
 sample_packing: true
-eval_sample_packing: false
 pad_to_sequence_len: true
 lora_r: 32
@@ -104,12 +103,12 @@ special_tokens:
 </details><br>
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/sncds/ocr-ft/runs/1bdo6czg)
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/sncds/ocr-ft/runs/rotjhntf)
 # llama-3.1-8b-ocr-correction
 This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6569
+- Loss: 0.1901
 ## Model description
@@ -141,10 +140,16 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 0.6009        | 0.8   | 1    | 0.6584          |
-| 0.5865        | 1.2   | 2    | 0.6569          |
+| Training Loss | Epoch  | Step | Validation Loss |
+|:-------------:|:------:|:----:|:---------------:|
+| 0.61          | 0.0331 | 1    | 0.6018          |
+| 0.4379        | 0.2645 | 8    | 0.4256          |
+| 0.2531        | 0.5289 | 16   | 0.2714          |
+| 0.2366        | 0.7934 | 24   | 0.2247          |
+| 0.1839        | 1.0331 | 32   | 0.2053          |
+| 0.1752        | 1.2975 | 40   | 0.1961          |
+| 0.1629        | 1.5620 | 48   | 0.1909          |
+| 0.163         | 1.8264 | 56   | 0.1901          |
 ### Framework versions
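
The new training results table can be summarized with a little arithmetic. The sketch below is only an illustration: the rows are copied from the added side of the table above, and the variable names (`rows`, `relative_drop`, `deltas`) are hypothetical, not part of the training code.

```python
# Rows copied from the added ("+") side of the Training results table:
# (training_loss, epoch, step, validation_loss)
rows = [
    (0.61,   0.0331, 1,  0.6018),
    (0.4379, 0.2645, 8,  0.4256),
    (0.2531, 0.5289, 16, 0.2714),
    (0.2366, 0.7934, 24, 0.2247),
    (0.1839, 1.0331, 32, 0.2053),
    (0.1752, 1.2975, 40, 0.1961),
    (0.1629, 1.5620, 48, 0.1909),
    (0.163,  1.8264, 56, 0.1901),
]

first_val = rows[0][3]   # validation loss at the first evaluation
last_val = rows[-1][3]   # validation loss at the last evaluation

# Fraction by which validation loss fell over the logged evaluations
relative_drop = (first_val - last_val) / first_val

# Change in validation loss between consecutive evaluations
deltas = [round(b[3] - a[3], 4) for a, b in zip(rows, rows[1:])]

print(f"validation loss: {first_val} -> {last_val} ({relative_drop:.1%} lower)")
print("per-evaluation deltas:", deltas)
```

Every delta is negative and the steps shrink toward the end, so the validation loss was still falling, but only slightly, at the last logged evaluation.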

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:70793268eb8758091b13b71684afe79b089418ec553c562007dcc418159b7a79
+oid sha256:befe7ee91cb8ab62450880c1dabf645b053b56d4e5b4cf5a4776e29329224eeb
 size 167934026
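
Only the Git LFS pointer for `adapter_model.bin` changes in this commit. The pointer records the SHA-256 digest and byte size of the real file, so a downloaded copy can be checked against it. The following is a minimal sketch of such a check; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path in the usage note is an assumption.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    algo, _, expected = pointer["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    if path.stat().st_size != int(pointer["size"]):
        return False
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Hash in 1 MiB chunks so a large file is never fully loaded in memory
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected


# Pointer contents after this commit, copied from the diff above
NEW_POINTER = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:befe7ee91cb8ab62450880c1dabf645b053b56d4e5b4cf5a4776e29329224eeb
size 167934026
""")
```

Assuming the adapter has been downloaded to the working directory, `matches_pointer(Path("adapter_model.bin"), NEW_POINTER)` would report whether the local copy is the one this commit points to. The size is unchanged between the two pointers, so only the digest distinguishes the old weights from the new ones.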