ymcki committed on
Commit 8f4e5be
1 Parent(s): df03492

12 epochs

README.md CHANGED
@@ -40,20 +40,28 @@ described by mlabonne.
  Layer 17 of the original model was chosen for abliteration.
  I also created another layer 18 and 24 abliterated model for comparison.
 
- ORPO fine tuning was performed for eight epoches. Lowest eval_loss at 7.48 epoch.
- Checkpoint at 7.48 epoch was chosen to generate this model.
+ ORPO fine-tuning was performed for four, eight, and twelve epochs. The lowest
+ eval_loss within the first four epochs was at epoch 3.72, within the first eight
+ epochs at epoch 7.48, and within the full twelve epochs at epoch 11.96. The
+ checkpoint at epoch 11.96 was chosen to generate this model.
 
  | Epoch | loss | eval_loss | eval_logps/rejected | eval_logps/chosen |
  | ----- | ---- | --------- | ------------------- | ----------------- |
  | 1.00 | 1.2015 | 1.0501 | -1.0451 | -0.7449 |
  | 2.00 | 1.2576 | 1.0145 | -1.1346 | -0.7248 |
  | 3.00 | 0.9310 | 0.9958 | -1.2629 | -0.7332 |
+ | 3.72 | 0.7453 | 0.9848 | -1.2205 | -0.7006 |
  | 4.00 | 0.8866 | 0.9857 | -1.2231 | -0.7019 |
  | 5.00 | 0.8696 | 1.0204 | -1.2242 | -0.7523 |
  | 6.00 | 0.9807 | 0.9959 | -1.3093 | -0.7257 |
  | 7.00 | 0.3851 | 0.9687 | -1.3826 | -0.7103 |
  | 7.48 | 1.2072 | 0.9638 | -1.4512 | -0.6959 |
  | 8.00 | 1.4118 | 0.9653 | -1.5047 | -0.6990 |
+ | 9.00 | 1.1466 | 1.0070 | -1.6149 | -0.7567 |
+ | 10.00 | 1.4646 | 0.9801 | -1.9078 | -0.7207 |
+ | 11.00 | 1.8303 | 0.9620 | -2.0278 | -0.7096 |
+ | 11.96 | 0.9252 | 0.9372 | -2.0292 | -0.6692 |
+ | 12.00 | 1.1489 | 0.9560 | -1.9191 | -0.7226 |
 
  The fine-tuned model is uploaded here to be evaluated by the Open LLM Leaderboard, to see whether the slightly brain-damaged non-ORPO model can be healed. Again, the fine-tuning method is based on the one described by [mlabonne](https://towardsdatascience.com/fine-tune-llama-3-with-orpo-56cfab2f9ada), but the input model was loaded into VRAM by [unsloth](https://github.com/unslothai/unsloth) so that the full 40k dataset could be used on a single 3090.
 
@@ -66,6 +74,7 @@ Click on the model name go to the raw score json generated by Open LLM Leaderboa
  | [gemma-2-2b-jpn-it](https://huggingface.co/datasets/open-llm-leaderboard/results/blob/main/google/gemma-2-2b-jpn-it/results_2024-10-15T15-21-39.173019.json) | 30.82 | 54.11 | 41.43 | 0.0 | 27.52 | 37.17 | 24.67 |
  | [gemma-2-2b-jpn-it-abliterated-17-ORPO (4 epochs)](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO/results_2024-10-20T02-46-59.069357.json) | 29.99 | 50.94 | 38.59 | 2.87 | 27.43 | 38.23 | 21.86 |
  | [gemma-2-2b-jpn-it-abliterated-17-ORPO (8 epochs)](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO/results_2024-10-24T00-00-00.000000.json) | 29.42 | 48.95 | 38.27 | 3.17 | 26.93 | 37.43 | 21.77 |
+ | gemma-2-2b-jpn-it-abliterated-17-ORPO (12 epochs) | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
  | [gemma-2-2b-jpn-it-abliterated-18-ORPO (4 epochs)](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-18-ORPO/results_2024-10-22T04-04-56.385050.json) | 29.94 | 48.97 | 40.18 | 3.02 | 26.17 | 39.42 | 21.85 |
  | [gemma-2-2b-jpn-it-abliterated-17](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-17/results_2024-10-18T15-18-46.821674.json) | 30.29 | 52.65 | 40.46 | 0.0 | 27.18 | 36.90 | 24.55 |
  | [gemma-2-2b-jpn-it-abliterated-18](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-18/results_2024-10-18T15-41-42.399571.json) | 30.61 | 53.02 | 40.96 | 0.0 | 27.35 | 37.30 | 25.05 |
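
The README above notes that layer 17 of the original model was chosen for abliteration, following the method described by mlabonne. As a rough, hedged illustration of what that step involves, the sketch below derives a refusal direction from activations at one chosen layer and projects it out of a weight matrix. The dimensions, stand-in tensors, and the `remove_direction` helper are illustrative assumptions, not the actual script used for this model.

```python
import torch

d_model = 2304  # Gemma 2 2B hidden size

def remove_direction(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Project the refusal direction out of a weight matrix.

    Assumes d_model is the last dimension of `weight`; transpose first if the
    weights are stored the other way around.
    """
    direction = direction / direction.norm()
    projection = (weight @ direction).unsqueeze(-1) * direction
    return weight - projection

# Refusal direction: difference of mean residual-stream activations at the
# chosen layer (layer 17 here) between instructions the model refuses and ones
# it answers. Stand-in activations; the real ones come from hooking the model.
harmful_acts = torch.randn(128, d_model)
harmless_acts = torch.randn(128, d_model)
refusal_dir = harmful_acts.mean(dim=0) - harmless_acts.mean(dim=0)

# The same projection is applied to the token embedding and to each block's
# attention-output and MLP-output weights; shown here on a stand-in matrix.
w_out = torch.randn(4 * d_model, d_model)
w_out_abliterated = remove_direction(w_out, refusal_dir)
```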
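
The ORPO fine-tuning setup the README describes (mlabonne's recipe, with the model loaded through unsloth so the full 40k-pair dataset fits on a single 3090) corresponds roughly to the sketch below. The dataset name, LoRA settings, and hyperparameters are assumptions for illustration; only the overall unsloth + TRL `ORPOTrainer` flow comes from the description above.

```python
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import ORPOConfig, ORPOTrainer

# Load the abliterated input model into VRAM via unsloth (fits a 24 GB 3090).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="ymcki/gemma-2-2b-jpn-it-abliterated-17",  # assumed input model
    max_seq_length=2048,
    load_in_4bit=True,
)

# LoRA adapters on the usual projection matrices (illustrative settings).
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# mlabonne's ORPO tutorial uses a ~40k preference-pair dataset; assumed here.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")
dataset = dataset.train_test_split(test_size=0.01)

trainer = ORPOTrainer(
    model=model,
    args=ORPOConfig(
        output_dir="gemma-2-2b-jpn-it-abliterated-17-ORPO",
        num_train_epochs=12,               # the run logged in the epoch table
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        learning_rate=8e-6,
        beta=0.1,                          # ORPO odds-ratio loss weight
        max_length=1024,
        max_prompt_length=512,
        eval_strategy="epoch",             # evaluate once per epoch
    ),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
)
trainer.train()
```

With per-epoch evaluation enabled, the trainer logs eval_loss values of the kind tabulated in the README diff above, which is the signal used to pick the 11.96-epoch checkpoint.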
config.json CHANGED
@@ -1,5 +1,5 @@
  {
-   "_name_or_path": "gemma-2-2b-jpn-it-abliterated-17",
+   "_name_or_path": "/home/user/gemma-2-2b-jpn-it-abliterated-17",
    "architectures": [
      "Gemma2ForCausalLM"
    ],
model-00001-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:8890898ad4754962d00ab14a138953e42e22e8fec43330291dad46ec7c7b44c7
+ oid sha256:3be860ac9796f293d6c2ef6bbe6016e3c1a83a2183f32773743f2a5a514b22ad
  size 4988034976
model-00002-of-00002.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d886f8d30ac1e1cbe189211fd61c2ed1e92280169f1df9b8e80dbd057cdd123d
+ oid sha256:b25fbec9c041eeeb78c4bad9f0b73400dd009efbccc2270ee276f51aaa6cb2dc
  size 240691728