fblgit/UNA-SimpleSmaug-34b-v1beta


fblgit committed on Feb 5

Commit e1cdc5b • 1 Parent(s): c422045

Update README.md

Files changed (1)

README.md +26 -2


@@ -2,9 +2,33 @@
 license: apache-2.0
 datasets:
 - fblgit/simple-math
+base_model: abacusai/Smaug-34B-v0.1
 tags:
 - UNA
+- simple-math
+- juanako
 ---
-So far an experiment, not sure how it went.
-Is based on Smaug and used SimpleMath dataset.
+# UNA-SimpleSmaug-34b-v1beta
+So far an experiment, not sure how it went. UNA was applied only to the attention layers, not to the MLPs.
+* Based on Smaug
+* Uses the SimpleMath dataset
+* Trained with Axolotl
+## Experiment
+The goal here is to understand the impact of SimpleMath applied at the attention layers during an SFT session, and how that affects the neural network overall.
+## Evals
+Pending, but so far there is this one result:
+```
+|    Task     |Version| Metric |Value |   |Stderr|
+|-------------|------:|--------|-----:|---|-----:|
+|arc_challenge|      0|acc     |0.7201|±  |0.0131|
+|             |       |acc_norm|0.7457|±  |0.0127|
+```
+Seems to increase GSM and ARC scores.
+## Citations
+Credit to abacusai for making Smaug-34B, the Bagel, and all the magic behind the base model.
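
Review note on the `arc_challenge` row added in this diff: the reported `Stderr` values can be sanity-checked against the usual standard error of a proportion. The sketch below is not part of the commit. It assumes the evaluation ran on the 1,172-question ARC-Challenge test split, which the README does not state, and that the harness reports a plain binomial standard error. The helper names are hypothetical.

```python
import math

# Assumption: ARC-Challenge test split size. The README does not say which
# split or how many questions were evaluated.
N_QUESTIONS = 1172


def binomial_stderr(acc: float, n: int) -> float:
    """Standard error of an accuracy estimated from n independent questions."""
    return math.sqrt(acc * (1.0 - acc) / n)


def confidence_interval(acc: float, stderr: float, z: float = 1.96):
    """Normal-approximation interval; z=1.96 gives roughly 95% coverage."""
    return acc - z * stderr, acc + z * stderr


if __name__ == "__main__":
    # Values copied from the eval table in the README diff.
    for name, acc in [("acc", 0.7201), ("acc_norm", 0.7457)]:
        se = binomial_stderr(acc, N_QUESTIONS)
        low, high = confidence_interval(acc, se)
        print(f"{name}: stderr={se:.4f}, approx. 95% CI=({low:.4f}, {high:.4f})")
```

Under that assumption the computed standard errors round to the 0.0131 and 0.0127 shown in the table, so the reported uncertainty is consistent with a single pass over a test set of that size. A gain of less than about two standard errors over the base model would be hard to distinguish from noise.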