Commit 0f5f365 by hamishivi · verified · 1 Parent(s): 64a272d

Update README.md

Files changed (1)
  1. README.md +7 -4
README.md CHANGED
@@ -40,13 +40,16 @@ For more details on the training mixture, read the paper: [Camels in a Changing
  |-|-|-|-|-|-|-|-|-|-|-|-|-|
  | Llama 3 8b base | 0.649 | 0.565 | 0.653 | 66.80 | 0.664 | - | - | 0.299 | 0.146 | 0.200 | 0.390 | 54.36 |
  | Llama 3 8b instruct | 0.626 | 0.770 | 0.606 | 59.04 | 0.799 | 94.65 | 23.12 | 0.682 | 0.741 | 0.028 | 0.115 | 70.36 |
- | Llama 3 Tulu 2 8b | 0.606 | 0.610 | 0.592 | 56.24 | 0.685 | 79.40 | 10.16 | 0.503 | 0.468 | 0.092 | 0.165 | 59.39 |
- | Llama 3 Tulu 2+DPO 8b | 0.609 | 0.650 | 0.584 | 21.18 | 0.688 | 93.02 | 13.94 | 0.698 | 0.518 | 0.092 | 0.165 | 59.61 |
+ | [Llama 3 Tulu 2 8b](https://huggingface.co/allenai/llama-3-tulu-2-8b) | 0.606 | 0.610 | 0.592 | 56.24 | 0.685 | 79.40 | 10.16 | 0.503 | 0.468 | 0.092 | 0.165 | 59.39 |
+ | [Llama 3 Tulu 2+DPO 8b](https://huggingface.co/allenai/llama-3-tulu-2-dpo-8b) | 0.609 | 0.650 | 0.584 | 21.18 | 0.688 | 93.02 | 13.94 | 0.698 | 0.518 | 0.092 | 0.165 | 59.61 |
  | Llama 3 70b base | 0.790 | 0.840 | 0.801 | 73.35 | 0.745 | - | - | 0.469 | 0.163 | 0.256 | 0.330 | 65.60 |
  | Llama 3 70b instruct | 0.786 | 0.930 | 0.801 | 59.21 | 0.908 | 96.71 | 39.99 | 0.701 | 0.828 | 0.060 | 0.140 | 79.22 |
- | **Llama 3 Tulu 2 70b (this model)** | 0.752 | 0.845 | 0.779 | 69.798 | 0.861 | 86.007 | 17.51 | 0.646 | 0.591 | 0.108 | 0.130 | 73.01 |
- | Llama 3 Tulu 2+DPO 70b | 0.754 | 0.860 | 0.785 | 23.443 | 0.878 | 96.65 | 27.34 | 0.780 | 0.643 | 0.080 | 0.140 | 71.60 |
+ | **[Llama 3 Tulu 2 70b](https://huggingface.co/allenai/llama-3-tulu-2-70b) (this model)** | 0.752 | 0.845 | 0.779 | 69.798 | 0.861 | 86.007 | 17.51 | 0.646 | 0.591 | 0.108 | 0.130 | 73.01 |
+ | [Llama 3 Tulu 2+DPO 70b](https://huggingface.co/allenai/llama-3-tulu-2-dpo-70b) | 0.754 | 0.860 | 0.785 | 23.443 | 0.878 | 96.65 | 27.34 | 0.780 | 0.643 | 0.080 | 0.140 | 71.60 |

+ We also release reward models based off Llama 3 8b and 70b respectively:
+ - [Llama 3 Tulu 2 8b UltraFeedback RM](https://huggingface.co/allenai/llama-3-tulu-2-8b-uf-mean-rm)
+ - [Llama 3 Tulu 2 70b UltraFeedback RM](https://huggingface.co/allenai/llama-3-tulu-2-70b-uf-mean-rm)

  ## Input Format