ptrdvn committed on
Commit
f327725
1 Parent(s): b6a178e

Update README.md

Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -24,6 +24,14 @@ Note that this model has a non-commercial license as we used the Command R and C
24
 
25
  We are currently working on developing a commercially usable model, so stay tuned for that!
26
 
27
+ # Model list
28
+
29
+ We have trained the following models with ORPO, using different proportions of the [lightblue/mitsu](https://huggingface.co/datasets/lightblue/mitsu) dataset:
30
+ * Trained on the top/bottom responses of all prompts in the dataset: [lightblue/suzume-llama-3-8B-multilingual-orpo-borda-full](https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-orpo-borda-full)
31
+ * Trained on the top/bottom responses for the 75\% of prompts with the most consistently ranked responses in the dataset: [lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top75](https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top75)
32
+ * Trained on the top/bottom responses for the 50\% of prompts with the most consistently ranked responses in the dataset: [lightblue/suzume-llama-3-8B-multilingual-orpo-borda-half](https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-orpo-borda-half)
33
+ * Trained on the top/bottom responses for the 25\% of prompts with the most consistently ranked responses in the dataset: [lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top25](https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual-orpo-borda-top25)
34
+
35
  # Model results
36
 
37
  We compare the MT-Bench scores across 6 languages for our 4 ORPO trained models, as well as some baselines:
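
The subset selection described in the added model list (keep a fraction of the most consistently ranked prompts, then pair each prompt's top and bottom responses) might be sketched as follows. This is an illustration only, not the authors' pipeline: the field names `responses`, `mean_ranks`, and `consistency` are hypothetical and are not taken from the lightblue/mitsu schema, and the mean rank stands in for whatever rank aggregation was actually used.

```python
from typing import Any, Dict, List


def build_preference_pairs(
    rows: List[Dict[str, Any]], keep_fraction: float
) -> List[Dict[str, str]]:
    """Build (chosen, rejected) pairs from the most consistently ranked prompts.

    Each row is assumed (hypothetically) to hold:
      - "prompt": the prompt text
      - "responses": candidate responses for that prompt
      - "mean_ranks": one aggregated rank per response (lower is better)
      - "consistency": a per-prompt agreement score (higher is more consistent)
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not rows:
        return []

    # Sort prompts from most to least consistently ranked.
    ranked = sorted(rows, key=lambda r: r["consistency"], reverse=True)

    # Keep the requested fraction, but always at least one prompt.
    n_keep = max(1, int(len(ranked) * keep_fraction))

    pairs = []
    for row in ranked[:n_keep]:
        ranks = row["mean_ranks"]
        best = min(range(len(ranks)), key=ranks.__getitem__)
        worst = max(range(len(ranks)), key=ranks.__getitem__)
        pairs.append(
            {
                "prompt": row["prompt"],
                "chosen": row["responses"][best],     # top-ranked response
                "rejected": row["responses"][worst],  # bottom-ranked response
            }
        )
    return pairs
```

Under these assumptions, `keep_fraction=1.0` would correspond to the "full" variant and `0.75`, `0.5`, and `0.25` to the top75, half, and top25 variants.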