---
license: llama3
language:
- en
---

# BioMed LLaMa-3 8B

Meta AI released Llama 3, the next generation of Llama, as a family of LLMs available for broad use. The release features pretrained and instruction-fine-tuned language models with 8B and 70B parameters that support a broad range of use cases.

Llama-3 is a decoder-only transformer architecture with a 128K-token vocabulary and grouped-query attention to improve inference efficiency. It was trained on sequences of 8,192 tokens.
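To illustrate what grouped-query attention buys, here is a minimal sketch (not code from this repository; the head counts are Llama-3-8B's published 32 query heads and 8 key/value heads) of how query heads map onto shared KV heads:

```python
# Illustrative sketch of grouped-query attention (GQA) head sharing.
# Llama-3-8B uses 32 query heads and 8 key/value heads, so each KV head
# is shared by a group of 32 // 8 = 4 consecutive query heads.

def kv_head_for_query_head(q_head: int, n_q_heads: int = 32, n_kv_heads: int = 8) -> int:
    """Return the KV head index that query head `q_head` shares."""
    group_size = n_q_heads // n_kv_heads  # 4 query heads per KV head
    return q_head // group_size

# Query heads 0-3 attend with KV head 0, heads 4-7 with KV head 1, and so on,
# shrinking the KV cache by a factor of 4 versus standard multi-head attention.
```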

Llama-3 achieves state-of-the-art performance, enhancing capabilities in reasoning, code generation, and instruction following, and is expected to outperform Claude Sonnet, Mistral Medium, and GPT-3.5 on a number of benchmarks.

## Model Details

Powerful LLMs are trained on large amounts of unstructured data and are great at general text generation. BioMed-LLaMa-3-8B, based on [Llama-3-8b](https://huggingface.co/meta-llama/Meta-Llama-3-8B), addresses some of the constraints of using off-the-shelf pre-trained LLMs in the biomedical domain:

* Efficiently fine-tuned LLaMa-3-8B on medical instruction Alpaca data, encompassing over 54K instruction-focused examples.
* Fine-tuned using QLoRA to further reduce memory usage while maintaining model performance and enhancing its capabilities in the biomedical domain.

![finetuning](assets/finetuning.png "LLaMa-3 Fine-Tuning")
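For intuition on why QLoRA reduces memory so sharply, a back-of-the-envelope sketch (illustrative arithmetic, not measurements from this fine-tune): QLoRA quantizes the frozen base weights to 4-bit NF4 and trains only small LoRA adapters, so the dominant weight-memory cost drops roughly fourfold versus fp16.

```python
# Rough weight-memory estimate: parameters x bits-per-parameter.
# Optimizer state and activations are ignored here; this only covers
# the memory needed to hold the base model's weights.

def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory (GB) to hold the model weights alone."""
    return n_params * bits_per_param / 8 / 1e9

fp16_gb = weight_memory_gb(8e9, 16)  # ~16 GB for an 8B model in fp16
nf4_gb = weight_memory_gb(8e9, 4)    # ~4 GB with 4-bit NF4 quantization
```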

…

```
}
```
```
@article{llama3modelcard,
  title={Llama 3 Model Card},
  author={AI@Meta},
  year={2024},
  url = {https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
```

Created with ❤️ by [@NZekaoui](https://twitter.com/NZekaoui)