yuansui/llama2_7b_instruct_sft_dpo

yuansui committed on Aug 25

Commit dc2e854 • 1 Parent(s): e5879d7

End of training

Files changed (2)

README.md +7 -3
config.json +1 -1

README.md CHANGED

@@ -1,21 +1,25 @@
 ---
 library_name: transformers
+base_model: llama-2-7b-instruct-sft
 tags:
+- alignment-handbook
 - trl
 - dpo
 - alignment-handbook
 - generated_from_trainer
+datasets:
+- xinlai/Math-Step-DPO-10K
 model-index:
-- name: llama2_7b_instruct_sft_dpo
+- name: llama-2-7b-instruct-sft
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# llama2_7b_instruct_sft_dpo
-This model was trained from scratch on an unknown dataset.
+# llama-2-7b-instruct-sft
+This model is a fine-tuned version of [llama-2-7b-instruct-sft](https://huggingface.co/llama-2-7b-instruct-sft) on the xinlai/Math-Step-DPO-10K dataset.
 ## Model description
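
The new front matter lists `alignment-handbook` twice under `tags`. As a minimal sketch of how such repeats could be spotted before publishing a model card, the snippet below reads the flat lists from the metadata shown in the diff and reports duplicated entries. The helpers `list_items` and `duplicates` are hypothetical names introduced here for illustration, and the line-based parsing is an assumption that only holds for flat `- item` lists like these; it is not a general YAML parser.

```python
from collections import Counter

# New-side front matter, copied from the diff above (flat lists only).
FRONT_MATTER = """\
library_name: transformers
base_model: llama-2-7b-instruct-sft
tags:
- alignment-handbook
- trl
- dpo
- alignment-handbook
- generated_from_trainer
datasets:
- xinlai/Math-Step-DPO-10K
"""


def list_items(text: str, key: str) -> list[str]:
    """Collect the '- item' lines that directly follow 'key:'."""
    items, inside = [], False
    for line in text.splitlines():
        if line.strip() == f"{key}:":
            inside = True          # start of the requested list
        elif inside and line.startswith("- "):
            items.append(line[2:].strip())
        elif inside:
            break                  # next key reached, list is over
    return items


def duplicates(items: list[str]) -> list[str]:
    """Return the entries that occur more than once, sorted."""
    return sorted(item for item, n in Counter(items).items() if n > 1)


if __name__ == "__main__":
    tags = list_items(FRONT_MATTER, "tags")
    print("tags:", tags)
    print("duplicated:", duplicates(tags))
```

Whether the repeated tag matters to the Hub is not stated in the commit; the check only makes the repetition visible.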

config.json CHANGED

@@ -24,6 +24,6 @@
   "tie_word_embeddings": false,
   "torch_dtype": "bfloat16",
   "transformers_version": "4.44.2",
-  "use_cache": false,
+  "use_cache": true,
   "vocab_size": 32000
 }
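
The only change to `config.json` is `use_cache` going from `false` to `true`. A plausible reading, stated here as an assumption because the commit message does not say so, is that the key/value cache was disabled during training (a common setting when gradient checkpointing is on) and re-enabled at the end so that generation can reuse past keys and values. The sketch below only confirms which key differs between the two config fragments shown in the diff; `changed_keys` is a hypothetical helper, and only the visible keys are reproduced.

```python
import json

# Old-side fragment of config.json as shown in the diff (visible keys only).
OLD_CONFIG = json.loads("""
{
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.44.2",
  "use_cache": false,
  "vocab_size": 32000
}
""")

# New side: identical except for use_cache.
NEW_CONFIG = dict(OLD_CONFIG, use_cache=True)


def changed_keys(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for every key whose value differs."""
    keys = old.keys() | new.keys()
    return {
        k: (old.get(k), new.get(k))
        for k in sorted(keys)
        if old.get(k) != new.get(k)
    }


if __name__ == "__main__":
    print(changed_keys(OLD_CONFIG, NEW_CONFIG))
```

Comparing parsed values rather than raw text keeps the check independent of key order and whitespace in the JSON file.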