Text Generation
Transformers
PyTorch
Safetensors
English
llama
text-generation-inference
Inference Endpoints
chansurgeplus commited on
Commit
da70747
1 Parent(s): ff77815

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -8,7 +8,7 @@ pipeline_tag: text-generation
8
  tags:
9
  - text-generation-inference
10
  ---
11
- # Model Card for OpenBezoar-HH-RLHF-DPO
12
 
13
  The OpenBezoar-HH-RLHF-DPO is an LLM that has been fine tuned for human preferences alignment using Direct Preference Optimization (DPO), on top of [OpenBezoar-HH-RLHF-SFT](https://huggingface.co/SurgeGlobal/OpenBezoar-HH-RLHF-SFT) model on a subset of [Anthropic's HH-RLHF Dataset](https://huggingface.co/datasets/Anthropic/hh-rlhf).
14
 
 
8
  tags:
9
  - text-generation-inference
10
  ---
11
+ ## Summary
12
 
13
  The OpenBezoar-HH-RLHF-DPO is an LLM that has been fine tuned for human preferences alignment using Direct Preference Optimization (DPO), on top of [OpenBezoar-HH-RLHF-SFT](https://huggingface.co/SurgeGlobal/OpenBezoar-HH-RLHF-SFT) model on a subset of [Anthropic's HH-RLHF Dataset](https://huggingface.co/datasets/Anthropic/hh-rlhf).
14