End of training
Files changed: README.md (+9, -7); adapter_model.bin (+1, -1)
README.md (CHANGED)
@@ -3,11 +3,12 @@ library_name: peft
 license: llama3.2
 base_model: NousResearch/Llama-3.2-1B
 tags:
+- axolotl
 - generated_from_trainer
 datasets:
 - teknium/GPT4-LLM-Cleaned
 model-index:
-- name:
+- name: llama-fr-lora
 results: []
 ---
 
@@ -32,6 +33,7 @@ flash_attention: true
 gradient_accumulation_steps: 2
 gradient_checkpointing: true
 group_by_length: false
+hub_model_id: pandyamarut/llama-fr-lora
 learning_rate: 0.0002
 load_in_4bit: false
 load_in_8bit: false
@@ -56,7 +58,7 @@ optimizer: adamw_8bit
 output_dir: /runpod-volume/fine-tuning/test-run
 pad_to_sequence_len: true
 run_name: test-run
-runpod_job_id:
+runpod_job_id: b7693c20-f1ab-4572-ad4f-bf19fa790d82-u1
 sample_packing: true
 saves_per_epoch: 1
 sequence_len: 2048
@@ -76,11 +78,11 @@ weight_decay: 0
 
 </details><br>
 
-#
+# llama-fr-lora
 
 This model is a fine-tuned version of [NousResearch/Llama-3.2-1B](https://huggingface.co/NousResearch/Llama-3.2-1B) on the teknium/GPT4-LLM-Cleaned dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.
+- Loss: 1.1014
 
 ## Model description
 
@@ -115,9 +117,9 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | 1.4537 | 0.0009 | 1 | 1.3971 |
-| 1.
-| 1.
-| 1.
+| 1.1953 | 0.2503 | 271 | 1.1562 |
+| 1.1678 | 0.5007 | 542 | 1.1135 |
+| 1.1912 | 0.7510 | 813 | 1.1014 |
 
 
 ### Framework versions
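The card above describes a LoRA adapter trained with axolotl and pushed to `pandyamarut/llama-fr-lora`. A minimal sketch of loading such an adapter for inference with the `peft` library (the repo ids come from `base_model` and `hub_model_id` in the config above; the prompt and generation settings are illustrative, not from the training run):

```python
# Sketch: load the LoRA adapter on top of its base model for inference.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# AutoPeftModelForCausalLM reads adapter_config.json, fetches the base model
# (NousResearch/Llama-3.2-1B) and attaches the adapter weights to it.
model = AutoPeftModelForCausalLM.from_pretrained(
    "pandyamarut/llama-fr-lora",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-3.2-1B")

# Illustrative prompt; sequence_len/sample_packing above apply to training only.
prompt = "Explain gradient checkpointing in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```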
adapter_model.bin (CHANGED)
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:79b8de8c3045ce62cac19cefbf984c4d2db1a67dd6e142abdce6e355a504e44b
 size 45169354
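The `adapter_model.bin` entry is a Git LFS pointer: the repository itself stores only the object's sha256 (`oid`) and byte size, while the 45 MB adapter lives in LFS storage. A minimal sketch of verifying a downloaded copy against that pointer, assuming the repo id from the card's `hub_model_id`:

```python
# Sketch: check a downloaded adapter_model.bin against the LFS pointer above.
import hashlib
import os
from huggingface_hub import hf_hub_download

# Values copied from the pointer file in this commit.
EXPECTED_OID = "79b8de8c3045ce62cac19cefbf984c4d2db1a67dd6e142abdce6e355a504e44b"
EXPECTED_SIZE = 45169354  # bytes

# hf_hub_download resolves the LFS pointer and returns a path to the real binary.
path = hf_hub_download(repo_id="pandyamarut/llama-fr-lora", filename="adapter_model.bin")

# The pointer's oid is the sha256 of the resolved file, so hashing it
# should reproduce the oid exactly.
h = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
        h.update(chunk)

assert os.path.getsize(path) == EXPECTED_SIZE, "size mismatch"
assert h.hexdigest() == EXPECTED_OID, "sha256 mismatch"
print("adapter_model.bin matches the LFS pointer")
```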