NouRed committed on
Commit 8a66607 · verified · 1 Parent(s): 7e421c2

Update README.md

Files changed (1):
  1. README.md +15 -6
README.md CHANGED
@@ -6,18 +6,18 @@ license: llama3
 language:
 - en
 ---
-# Model Card for Model ID
-Meta AI released the Llama-3 family of LLMs, composed of two LLMs: Llama-3-8B and Llama-3-70B. These pretrained models can be adapted for a variety of NLG tasks, whereas the instruction fine-tuned version is intended for commercial and research use mainly in the English language.
+# BioMed LLaMa-3 8B
+Meta AI released Meta Llama 3, the next generation of its Llama family of LLMs, for broad use. This release features pretrained and instruction-fine-tuned language models with 8B and 70B parameters that support a broad range of use cases.
 
-Llama-3 is a decoder-only transformer architecture with a 128-token vocabulary and grouped query attention to improve inference efficiency. It has been trained on sequences of 8192 tokens.
+Llama-3 is a decoder-only transformer architecture with a 128K-token vocabulary and grouped query attention to improve inference efficiency. It has been trained on sequences of 8192 tokens.
 
 Llama-3 achieved state-of-the-art performance, enhancing capabilities in reasoning, code generation, and instruction following. It is expected to outperform Claude Sonnet, Mistral Medium, and GPT-3.5 on a number of benchmarks.
 
 ## Model Details
-LLMs are trained on large amounts of unstructured data and are great at general text generation. BioMed-Tuned-llama-3-8b addresses some constraints related to using off-the-shelf pre-trained LLMs, especially in the biomedical domain:
+Powerful LLMs are trained on large amounts of unstructured data and excel at general text generation. BioMed-LLaMa-3-8B, based on [Llama-3-8b](https://huggingface.co/meta-llama/Meta-Llama-3-8B), addresses some constraints of off-the-shelf pre-trained LLMs, especially in the biomedical domain:
 
-* Efficiently fine-tuned [Llama-3-8b]() on medical instruction Alpaca data encompassing over 54K examples.
-* Fine-tuned using QLoRa to further reduce memory usage while maintaining model performance.
+* Efficiently fine-tuned LLaMa-3-8B on medical instruction Alpaca data, encompassing over 54K instruction-focused examples.
+* Fine-tuned using QLoRA to further reduce memory usage while maintaining model performance and enhancing its capabilities in the biomedical domain.
 
 ![finetuning](assets/finetuning.png "LLaMa-3 Fine-Tuning")
 
@@ -154,4 +154,13 @@ print(output)
 }
 ```
 
+```
+@article{llama3modelcard,
+  title={Llama 3 Model Card},
+  author={AI@Meta},
+  year={2024},
+  url={https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
+}
+```
+
 Created with ❤️ by [@NZekaoui](https://twitter.com/NZekaoui)
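As background on the grouped query attention mentioned in the updated card: several query heads share one key/value head, which shrinks the per-token KV cache during inference. Below is a minimal pure-Python sketch of the head-to-KV-head mapping, assuming Llama-3-8B's published head counts (32 query heads, 8 KV heads); it is an illustration of the idea, not Meta's implementation.

```python
# Sketch of the head-sharing idea behind grouped query attention (GQA):
# query heads are partitioned into groups and each group shares one
# key/value head, so the KV cache shrinks by a factor of n_heads / n_kv_heads.
# Head counts below match Llama-3-8B's published config; the mapping logic
# is illustrative only.

N_HEADS = 32     # query heads
N_KV_HEADS = 8   # shared key/value heads
GROUP_SIZE = N_HEADS // N_KV_HEADS  # query heads per KV head

def kv_head_for(query_head: int) -> int:
    """Index of the KV head whose keys/values a given query head attends over."""
    return query_head // GROUP_SIZE

mapping = [kv_head_for(h) for h in range(N_HEADS)]
kv_cache_reduction = N_HEADS // N_KV_HEADS  # KV cache is this many times smaller
```

With 8 KV heads instead of 32, the per-token KV cache is 4x smaller than with full multi-head attention, which is the inference-efficiency gain the card alludes to.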
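The "medical instruction Alpaca data" in the bullet list refers to the Alpaca instruction/input/output schema. For readers unfamiliar with it, here is a sketch of the standard Alpaca prompt template; whether this model's training code uses this exact wording is an assumption, and the medical example is hypothetical.

```python
# Standard Alpaca prompt templates (self-instruct style); whether the
# BioMed fine-tuning used this exact wording is an assumption.
TEMPLATE_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)
TEMPLATE_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{output}"
)

def format_example(example: dict) -> str:
    """Serialize one instruction example into a single training string."""
    template = TEMPLATE_WITH_INPUT if example.get("input") else TEMPLATE_NO_INPUT
    return template.format(**example)

# Hypothetical medical example in the Alpaca schema.
example = {
    "instruction": "Summarize the patient's main symptoms.",
    "input": "The patient reports a persistent cough and mild fever for three days.",
    "output": "Persistent cough and mild fever lasting three days.",
}
prompt = format_example(example)
```

Each of the 54K examples would be serialized this way into one training string, with the loss typically computed on the response portion.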
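The QLoRA bullet can be unpacked with a toy numeric sketch of the low-rank update at its core: the frozen base weight W is adapted as W + (alpha/r)·B·A, with rank r much smaller than the layer width. The 4-bit quantization of W that distinguishes QLoRA from plain LoRA is omitted, and all sizes and values below are illustrative.

```python
# Toy numeric sketch of the low-rank update at the heart of (Q)LoRA:
# the frozen base weight W is adapted as W + (alpha / r) * B @ A, where
# A is (r x d), B is (d x r), and r << d. QLoRA additionally stores W in
# 4-bit; that quantization step is omitted here. Pure Python, no deps.

d, r, alpha = 4, 1, 8
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen base
A = [[0.1, 0.2, 0.3, 0.4]]        # (r x d), trainable
B = [[0.5], [0.0], [0.0], [0.0]]  # (d x r), trainable; zero-initialized in
                                  # real LoRA, nonzero here to show the update

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

scale = alpha / r
delta = matmul(B, A)  # (d x d) matrix, but only rank 1
W_adapted = [[W[i][j] + scale * delta[i][j] for j in range(d)]
             for i in range(d)]
```

Only A and B (2·d·r values) are trained while the d² base weights stay frozen (and quantized, in QLoRA), which is where the memory savings during fine-tuning come from.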