noeloco committed
Commit
eb3ccb7
1 Parent(s): d40fb57

End of training

Files changed (2)
  1. README.md +23 -19
  2. adapter_model.bin +2 -2
README.md CHANGED
@@ -25,8 +25,8 @@ is_llama_derived_model: true
 
 hub_model_id: noeloco/camel-lora
 
-load_in_8bit: true
-load_in_4bit: false
+load_in_8bit: false
+load_in_4bit: true
 strict: false
 
 datasets:
@@ -44,7 +44,7 @@ sequence_len: 2048
 sample_packing: false
 pad_to_sequence_len: true
 
-adapter: lora
+adapter: qlora
 lora_model_dir:
 lora_r: 16
 lora_alpha: 8
@@ -58,9 +58,9 @@ wandb_watch:
 wandb_name:
 wandb_log_model:
 
-gradient_accumulation_steps: 1
+gradient_accumulation_steps: 4
 micro_batch_size: 2
-num_epochs: 3
+num_epochs: 4
 optimizer: paged_adamw_32bit
 lr_scheduler: cosine
 learning_rate: 0.0002
@@ -100,7 +100,7 @@ special_tokens:
 
 This model is a fine-tuned version of [codellama/CodeLlama-7b-hf](https://huggingface.co/codellama/CodeLlama-7b-hf) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0194
+- Loss: 0.0402
 
 ## Model description
 
@@ -123,27 +123,31 @@ The following hyperparameters were used during training:
 - train_batch_size: 2
 - eval_batch_size: 2
 - seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 10
-- num_epochs: 3
+- num_epochs: 4
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 1.9757        | 0.01  | 1    | 2.5058          |
-| 0.6029        | 0.26  | 18   | 0.8441          |
-| 0.3298        | 0.51  | 36   | 0.2600          |
-| 0.0939        | 0.77  | 54   | 0.1288          |
-| 0.0961        | 1.03  | 72   | 0.1025          |
-| 0.0641        | 1.29  | 90   | 0.0430          |
-| 0.0639        | 1.54  | 108  | 0.0405          |
-| 0.2106        | 1.8   | 126  | 0.0206          |
-| 0.0558        | 2.06  | 144  | 0.0349          |
-| 0.0407        | 2.31  | 162  | 0.0298          |
-| 0.0493        | 2.57  | 180  | 0.0237          |
-| 0.0915        | 2.83  | 198  | 0.0194          |
+| 1.7705        | 0.06  | 1    | 2.5549          |
+| 1.89          | 0.29  | 5    | 2.5346          |
+| 1.48          | 0.57  | 10   | 1.9766          |
+| 0.7709        | 0.86  | 15   | 1.0579          |
+| 0.5576        | 1.14  | 20   | 0.5837          |
+| 0.2286        | 1.43  | 25   | 0.3510          |
+| 0.3504        | 1.71  | 30   | 0.1531          |
+| 0.228         | 2.0   | 35   | 0.1109          |
+| 0.1202        | 2.29  | 40   | 0.0935          |
+| 0.1138        | 2.57  | 45   | 0.0612          |
+| 0.1098        | 2.86  | 50   | 0.0498          |
+| 0.134         | 3.14  | 55   | 0.0430          |
+| 0.1015        | 3.43  | 60   | 0.0401          |
+| 0.0668        | 3.71  | 65   | 0.0402          |
 
 
 ### Framework versions
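The README change above swaps an 8-bit LoRA run for 4-bit QLoRA and raises gradient accumulation from 1 to 4. As a rough translation of what those axolotl flags mean, here is a minimal sketch using the transformers/peft/bitsandbytes stack; axolotl does this wiring internally, and the target modules and compute dtype are assumptions, since the diff truncates before those fields.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# load_in_4bit: true -> quantize the frozen base weights to 4-bit (QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,  # assumption: compute dtype is not shown in the diff
)

base = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)

# adapter: qlora, with lora_r: 16 and lora_alpha: 8 from the config above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=8,
    # Assumption: target modules are not visible in the truncated diff;
    # all linear projections shown here as a plausible axolotl-style default.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```

The batch arithmetic in the updated hyperparameters follows directly: micro_batch_size 2 × gradient_accumulation_steps 4 = total_train_batch_size 8 on a single device. Optimizer steps per epoch therefore drop roughly fourfold (about 70 to about 17.5), which matches the step columns of the two training tables.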
adapter_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8e687c7c1ccc2f6c5257babe05a9ccbaa36d09282566e5a5c5785d479eb7cccc
-size 160069834
+oid sha256:77a3b8b477fbc82e5b338aea095041121462f5a56553a846d04f6dc0f5d67161
+size 80115914
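Only the Git LFS pointer for adapter_model.bin is versioned in-repo: the sha256 changes with the new weights, and the payload shrinks from 160069834 to 80115914 bytes. The near-exact halving suggests the adapter tensors are now stored in 16-bit rather than 32-bit precision, though that is an inference from the byte counts, not something the commit states. A minimal sketch of attaching the published adapter to the base model with peft, assuming the standard Hub workflow:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the trained LoRA adapter.
# "noeloco/camel-lora" is the hub_model_id from the config diff above;
# from_pretrained fetches adapter_model.bin through Git LFS automatically.
base = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-hf",
    torch_dtype="auto",
)
model = PeftModel.from_pretrained(base, "noeloco/camel-lora")
```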