natolambert
commited on
Update README.md
Browse files
README.md
CHANGED
@@ -41,6 +41,14 @@ Tülu3 is designed for state-of-the-art performance on a diversity of tasks in a
|
|
41 |
| **Final Models (RLVR)** | [allenai/Llama-3.1-Tulu-3-8B](https://huggingface.co/allenai/Llama-3.1-Tulu-3-8B) | [allenai/Llama-3.1-Tulu-3-70B](https://huggingface.co/allenai/Llama-3.1-Tulu-3-70B) |
|
42 |
| **Reward Model (RM)**| [allenai/Llama-3.1-Tulu-3-8B-RM](https://huggingface.co/allenai/Llama-3.1-Tulu-3-8B-RM) | (Same as 8B) |
|
43 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
44 |
## Using the model
|
45 |
|
46 |
### Loading with HuggingFace
|
|
|
41 |
| **Final Models (RLVR)** | [allenai/Llama-3.1-Tulu-3-8B](https://huggingface.co/allenai/Llama-3.1-Tulu-3-8B) | [allenai/Llama-3.1-Tulu-3-70B](https://huggingface.co/allenai/Llama-3.1-Tulu-3-70B) |
|
42 |
| **Reward Model (RM)**| [allenai/Llama-3.1-Tulu-3-8B-RM](https://huggingface.co/allenai/Llama-3.1-Tulu-3-8B-RM) | (Same as 8B) |
|
43 |
|
44 |
+
|
45 |
+
| **Stage** | **Llama 3.1 405B** |
|
46 |
+
|-----------|-------------------|
|
47 |
+
| **Base Model** | [meta-llama/llama-3.1-405B](https://huggingface.co/meta-llama/llama-3.1-405B) |
|
48 |
+
| **SFT** | [allenai/llama-3.1-Tulu-3-405B-SFT](https://huggingface.co/allenai/llama-3.1-Tulu-3-405B-SFT) |
|
49 |
+
| **Final Model (DPO)** | [allenai/llama-3.1-Tulu-3-405B](https://huggingface.co/allenai/llama-3.1-Tulu-3-405B) |
|
50 |
+
|
51 |
+
|
52 |
## Using the model
|
53 |
|
54 |
### Loading with HuggingFace
|