princeton-nlp/gemma-2-9b-it-SimPO

princeton-nlp committed on Jul 16, 2024

Commit ebdb01f (verified) · 1 Parent(s): 088ed5d

Update README.md

Files changed (1)

README.md +2 -7

@@ -12,9 +12,7 @@ SimPO (Simple Preference Optimization) is an offline preference optimization alg
 ### Model Description
-We fine-tuned [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it) on with the SimPO objective.
-, a preference optimization dataset where the prompts are from [HuggingFaceH4/ultrafeedback_binarized](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized)
+We fine-tuned [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it) on [princeton-nlp/gemma2-ultrafeedback-armorm](https://huggingface.co/datasets/princeton-nlp/gemma2-ultrafeedback-armorm) with the SimPO objective.
 - **Developed by:** Yu Meng, Mengzhou Xia, Danqi Chen
 - **Model type:** Causal Language Model
@@ -34,8 +32,6 @@ We fine-tuned [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it
 ```
 import torch
 from transformers import pipeline
-import json
-import warnings
 model_id = "princeton-nlp/gemma-2-9b-it-SimPO"
@@ -45,7 +41,6 @@ generator = pipeline(
     model_kwargs={"torch_dtype": torch.bfloat16},
     device="cuda",
 )
-generator.tokenizer.chat_template = template
 outputs = generator([{"role": "user", "content": "What's the difference between llamas and alpacas?"}], do_sample=False, max_new_tokens=200)
 print(outputs[0]['generated_text'])
 ```
@@ -62,7 +57,7 @@ We use
 #### Speeds, Sizes, Times
-Fine-tuning the [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it) on  takes around 100 mins to finish on 8xH100 GPUs.
+Fine-tuning the [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it) on [princeton-nlp/gemma2-ultrafeedback-armorm](https://huggingface.co/datasets/princeton-nlp/gemma2-ultrafeedback-armorm) takes around 100 mins to finish on 8xH100 GPUs.
 ## Evaluation
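
Note on the changed description: the README line says the model was fine-tuned "with the SimPO objective" but the diff does not spell that objective out. As a rough sketch only, the per-pair loss described in the SimPO paper (a length-normalized log-probability margin with a target margin) could be written as below. The function name `simpo_loss` and the default `beta` and `gamma` values are illustrative assumptions, not the settings used to train this model.

```python
import math


def simpo_loss(logp_chosen, len_chosen, logp_rejected, len_rejected,
               beta=2.0, gamma=1.0):
    """SimPO-style loss for a single preference pair (illustrative sketch).

    logp_* are summed token log-probabilities of each response under the
    policy; len_* are the response lengths in tokens. beta and gamma are
    placeholder hyperparameters, not this model's training values.
    """
    # Length-normalized implicit rewards: average log-probability times beta.
    reward_chosen = beta * logp_chosen / len_chosen
    reward_rejected = beta * logp_rejected / len_rejected
    # The chosen response should beat the rejected one by at least gamma.
    margin = reward_chosen - reward_rejected - gamma
    # -log(sigmoid(margin)), computed stably as softplus(-margin).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

No reference model appears in the computation, which is the property the README's "Simple Preference Optimization" description alludes to; the loss shrinks as the chosen response's average log-probability pulls ahead of the rejected one's.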