Commit 9feb924 by alexmarques
Parent(s): 692b030

Update README.md
README.md
CHANGED
@@ -20,7 +20,7 @@ pipeline_tag: text-generation
 
 Compressed version of [Llama-2-7b](https://huggingface.co/meta-llama/Llama-2-7b-hf) specialized for code-generation.
 This model was obtained by fine-tuning the Sparse Foundational model [SparseLlama-2-7b-pruned_50.2of4](https://huggingface.co/nm-testing/SparseLlama-2-7b-pruned_50.2of4) on the [evol-codealpaca-v1](https://huggingface.co/datasets/theblackcat102/evol-codealpaca-v1) dataset.
-[SquareHead](https://arxiv.org/abs/2310.06927) knowledge distillation
+[SquareHead](https://arxiv.org/abs/2310.06927) knowledge distillation was used with [Llama-2-7b-evolcodealpaca](https://huggingface.co/neuralmagic/Llama-2-7b-evolcodealpaca) as teacher.
 It achieves [HumanEval](https://arxiv.org/abs/2107.03374) pass@1 of 34.58%, whereas the dense [Llama-2-7b-evolcodealpaca](https://huggingface.co/neuralmagic/Llama-2-7b-evolcodealpaca) model achieves 32.03%.
 
 This model was produced as part if Neural Magic's Sparse Foundational Models initiative, and demostrates the capability of Sparse Foundational Models to transfer to the code-generation domain.