Update README.md
README.md
CHANGED
@@ -107,15 +107,15 @@ This model is released under CC BY-NC 4.0 license. For commercial usage inquirie
 
 ## Appendix
 
-KMMLU scores comparison chart:
+- KMMLU scores comparison chart:
 <img src="assets/comparison-chart.png" width="100%" style="margin: 40px auto;">
 
-DNA 1.0 8B Instruct model architecture <sup>[1]</sup>:
+- DNA 1.0 8B Instruct model architecture <sup>[1]</sup>:
 <img src="assets/model-architecture.png" width="500" style="margin: 40px auto;">
 
 [1]: <https://www.linkedin.com/posts/sebastianraschka_the-llama-32-1b-and-3b-models-are-my-favorite-activity-7248317830943686656-yyYD/>
 
-The median percentage of model’s weight difference between before and after the merge (our SFT model + Llama 3.1 8B Instruct):
+- The median percentage of model’s weight difference between before and after the merge (our SFT model + Llama 3.1 8B Instruct):
 <img src="assets/ours-vs-merged.png" width="100%" style="margin: 40px auto;">
 
 ## Citation