sfulay/zephyr-7b-dpo-full-hh
Text Generation
Transformers
Safetensors
Anthropic/hh-rlhf
mistral
alignment-handbook
trl
dpo
Generated from Trainer
conversational
text-generation-inference
Inference Endpoints
License: apache-2.0
Commit History
Model save | 418412b | verified | sfulay committed on Jul 12
Training in progress, step 1200 | dd2b472 | verified | sfulay committed on Jul 12
Training in progress, step 1100 | 85592ea | verified | sfulay committed on Jul 12
Training in progress, step 1000 | aec981a | verified | sfulay committed on Jul 12
Training in progress, step 900 | 0067876 | verified | sfulay committed on Jul 12
Training in progress, step 800 | 7c8c7f0 | verified | sfulay committed on Jul 12
Training in progress, step 700 | 4064a22 | verified | sfulay committed on Jul 12
Training in progress, step 600 | 0f9de06 | verified | sfulay committed on Jul 12
Training in progress, step 500 | 1551f0f | verified | sfulay committed on Jul 12
Training in progress, step 400 | 334c7d6 | verified | sfulay committed on Jul 12
Training in progress, step 300 | 779cad0 | verified | sfulay committed on Jul 12
Training in progress, step 200 | 3dec666 | verified | sfulay committed on Jul 11
Training in progress, step 100 | 9e38ad6 | verified | sfulay committed on Jul 11
initial commit | ec5bb77 | verified | sfulay committed on Jul 11