NouRed committed on
Commit 8a66607 · verified · 1 Parent(s): 7e421c2

Update README.md

Files changed (1):
  1. README.md +15 -6
README.md CHANGED
@@ -6,18 +6,18 @@ license: llama3
 language:
 - en
 ---
-# Model Card for Model ID
-Meta AI released the Llama-3 family of LLMs, composed of two LLMs: Llama-3-8B and Llama-3-70B. These pretrained models can be adapted for a variety of NLG tasks, whereas the instruction fine-tuned version is intended for commercial and research use mainly in the English language.
+# BioMed LLaMa-3 8B
+Meta AI released Meta Llama 3, the next generation of its Llama family of LLMs, for broad use. This release features pretrained and instruction-fine-tuned language models with 8B and 70B parameters that support a broad range of use cases.
 
-Llama-3 is a decoder-only transformer architecture with a 128-token vocabulary and grouped query attention to improve inference efficiency. It has been trained on sequences of 8192 tokens.
+Llama-3 is a decoder-only transformer architecture with a 128K-token vocabulary and grouped query attention to improve inference efficiency. It has been trained on sequences of 8192 tokens.
 
 Llama-3 achieved state-of-the-art performance, enhancing capabilities in reasoning, code generation, and instruction following. It is expected to outperform Claude Sonnet, Mistral Medium, and GPT-3.5 on a number of benchmarks.
 
 ## Model Details
-LLMs are trained on large amounts of unstructured data and are great at general text generation. BioMed-Tuned-llama-3-8b addresses some constraints related to using off-the-shelf pre-trained LLMs, especially in the biomedical domain:
+Powerful LLMs are trained on large amounts of unstructured data and excel at general text generation. BioMed-LLaMa-3-8B, based on [Llama-3-8b](https://huggingface.co/meta-llama/Meta-Llama-3-8B), addresses some constraints of off-the-shelf pre-trained LLMs, especially in the biomedical domain:
 
-* Efficiently fine-tuned [Llama-3-8b]() on medical instruction Alpaca data encompassing over 54K examples.
-* Fine-tuned using QLoRa to further reduce memory usage while maintaining model performance.
+* Efficiently fine-tuned LLaMa-3-8B on medical instruction Alpaca data, encompassing over 54K instruction-focused examples.
+* Fine-tuned using QLoRA to further reduce memory usage while maintaining model performance and enhancing its capabilities in the biomedical domain.
 
 ![finetuning](assets/finetuning.png "LLaMa-3 Fine-Tuning")
 
@@ -154,4 +154,13 @@ print(output)
 }
 ```
 
+```
+@article{llama3modelcard,
+  title={Llama 3 Model Card},
+  author={AI@Meta},
+  year={2024},
+  url={https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
+}
+```
+
 Created with ❤️ by [@NZekaoui](https://twitter.com/NZekaoui)
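As background on the grouped query attention mentioned in the updated card: several query heads share one key/value head, which shrinks the per-token KV cache during inference. Below is a minimal pure-Python sketch of the head-to-KV-head mapping, assuming Llama-3-8B's published head counts (32 query heads, 8 KV heads); it is an illustration of the idea, not Meta's implementation.

```python
# Sketch of the head-sharing idea behind grouped query attention (GQA):
# query heads are partitioned into groups and each group shares one
# key/value head, so the KV cache shrinks by a factor of n_heads / n_kv_heads.
# Head counts below match Llama-3-8B's published config; the mapping logic
# is illustrative only.

N_HEADS = 32     # query heads
N_KV_HEADS = 8   # shared key/value heads
GROUP_SIZE = N_HEADS // N_KV_HEADS  # query heads per KV head

def kv_head_for(query_head: int) -> int:
    """Index of the KV head whose keys/values a given query head attends over."""
    return query_head // GROUP_SIZE

mapping = [kv_head_for(h) for h in range(N_HEADS)]
kv_cache_reduction = N_HEADS // N_KV_HEADS  # KV cache is this many times smaller
```

With 8 KV heads instead of 32, the per-token KV cache is 4x smaller than with full multi-head attention, which is the inference-efficiency gain the card alludes to.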
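The "medical instruction Alpaca data" in the bullet list refers to the Alpaca instruction/input/output schema. For readers unfamiliar with it, here is a sketch of the standard Alpaca prompt template; whether this model's training code uses this exact wording is an assumption, and the medical example is hypothetical.

```python
# Standard Alpaca prompt templates (self-instruct style); whether the
# BioMed fine-tuning used this exact wording is an assumption.
TEMPLATE_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)
TEMPLATE_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{output}"
)

def format_example(example: dict) -> str:
    """Serialize one instruction example into a single training string."""
    template = TEMPLATE_WITH_INPUT if example.get("input") else TEMPLATE_NO_INPUT
    return template.format(**example)

# Hypothetical medical example in the Alpaca schema.
example = {
    "instruction": "Summarize the patient's main symptoms.",
    "input": "The patient reports a persistent cough and mild fever for three days.",
    "output": "Persistent cough and mild fever lasting three days.",
}
prompt = format_example(example)
```

Each of the 54K examples would be serialized this way into one training string, with the loss typically computed on the response portion.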
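The QLoRA bullet can be unpacked with a toy numeric sketch of the low-rank update at its core: the frozen base weight W is adapted as W + (alpha/r)·B·A, with rank r much smaller than the layer width. The 4-bit quantization of W that distinguishes QLoRA from plain LoRA is omitted, and all sizes and values below are illustrative.

```python
# Toy numeric sketch of the low-rank update at the heart of (Q)LoRA:
# the frozen base weight W is adapted as W + (alpha / r) * B @ A, where
# A is (r x d), B is (d x r), and r << d. QLoRA additionally stores W in
# 4-bit; that quantization step is omitted here. Pure Python, no deps.

d, r, alpha = 4, 1, 8
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen base
A = [[0.1, 0.2, 0.3, 0.4]]        # (r x d), trainable
B = [[0.5], [0.0], [0.0], [0.0]]  # (d x r), trainable; zero-initialized in
                                  # real LoRA, nonzero here to show the update

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

scale = alpha / r
delta = matmul(B, A)  # (d x d) matrix, but only rank 1
W_adapted = [[W[i][j] + scale * delta[i][j] for j in range(d)]
             for i in range(d)]
```

Only A and B (2·d·r values) are trained while the d² base weights stay frozen (and quantized, in QLoRA), which is where the memory savings during fine-tuning come from.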