How to use Leogrin/eleuther-pythia1b-hh-sft with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="Leogrin/eleuther-pythia1b-hh-sft")
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Leogrin/eleuther-pythia1b-hh-sft")
model = AutoModelForCausalLM.from_pretrained("Leogrin/eleuther-pythia1b-hh-sft")
How to use Leogrin/eleuther-pythia1b-hh-sft with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Leogrin/eleuther-pythia1b-hh-sft"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Leogrin/eleuther-pythia1b-hh-sft",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Alternatively, use Docker:
docker model run hf.co/Leogrin/eleuther-pythia1b-hh-sft
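The curl call to the vLLM server can also be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and the default base URL assumes the server is listening on `localhost:8000` as in the curl example. The response parsing assumes the usual OpenAI-style completions layout (`choices[0].text`).

```python
import json
import urllib.request

MODEL_ID = "Leogrin/eleuther-pythia1b-hh-sft"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Example (requires a running server):
# print(complete("Once upon a time,"))
```

Passing a different `base_url` (for example one with port 30000) should let the same helper talk to another OpenAI-compatible server.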
How to use Leogrin/eleuther-pythia1b-hh-sft with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "Leogrin/eleuther-pythia1b-hh-sft" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Leogrin/eleuther-pythia1b-hh-sft",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "Leogrin/eleuther-pythia1b-hh-sft" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "Leogrin/eleuther-pythia1b-hh-sft",
"prompt": "Once upon a time,",
"max_tokens": 512,
"temperature": 0.5
}'
How to use Leogrin/eleuther-pythia1b-hh-sft with Docker Model Runner:
docker model run hf.co/Leogrin/eleuther-pythia1b-hh-sft
Pythia-1b, supervised fine-tuned (SFT) on the Anthropic hh-rlhf dataset for 1 epoch.
See Pythia-1b for model details (paper).
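Since the model was fine-tuned on hh-rlhf, prompts laid out like that dataset's transcripts (turns introduced by `\n\nHuman:` and `\n\nAssistant:`) may work better than free-form text. The sketch below assumes the fine-tuning data kept that layout, which this card does not confirm; `format_hh_prompt` and `generate_reply` are hypothetical helper names, and the sampling settings are illustrative only.

```python
MODEL_ID = "Leogrin/eleuther-pythia1b-hh-sft"


def format_hh_prompt(turns):
    """Render (speaker, text) turns in the hh-rlhf transcript layout.

    The prompt ends with an open Assistant turn for the model to complete.
    """
    parts = []
    for speaker, text in turns:
        if speaker not in ("Human", "Assistant"):
            raise ValueError(f"unknown speaker: {speaker!r}")
        parts.append(f"\n\n{speaker}: {text.strip()}")
    parts.append("\n\nAssistant:")
    return "".join(parts)


def generate_reply(turns, max_new_tokens=128):
    """Generate one Assistant reply (downloads the model on first use)."""
    # Imported lazily so the formatting helper works without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(format_hh_prompt(turns), return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.5,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Keep only the newly generated tokens, then cut at the next Human turn.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    reply = tokenizer.decode(new_tokens, skip_special_tokens=True)
    return reply.split("\n\nHuman:")[0].strip()


# Example (requires transformers and torch):
# print(generate_reply([("Human", "How do I brew green tea?")]))
```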
Results for the base model (1B_base) are taken from the Pythia paper.
Zero-shot:
| Task | 1B_base | 1B_sft |
|---|---|---|
| Lambada (OpenAI) | 0.562 ± 0.007 | 0.563 ± 0.007 |
| PIQA | 0.707 ± 0.011 | 0.711 ± 0.011 |
| WinoGrande | 0.537 ± 0.014 | 0.534 ± 0.014 |
| WSC | 0.365 ± 0.047 | 0.365 ± 0.047 |
| ARC - Easy | 0.569 ± 0.010 | 0.583 ± 0.010 |
| ARC - Challenge | 0.244 ± 0.013 | 0.248 ± 0.013 |
| SciQ | 0.840 ± 0.012 | 0.847 ± 0.011 |
| LogiQA | 0.223 ± 0.016 | N/A |

Five-shot:
| Task | 1B_base | 1B_sft |
|---|---|---|
| Lambada (OpenAI) | 0.507 ± 0.007 | 0.4722 ± 0.007 |
| PIQA | 0.705 ± 0.011 | 0.7165 ± 0.0105 |
| WinoGrande | 0.532 ± 0.014 | 0.5343 ± 0.014 |
| WSC | 0.365 ± 0.047 | 0.5000 ± 0.0493 |
| ARC - Easy | 0.594 ± 0.010 | 0.6010 ± 0.010 |
| ARC - Challenge | 0.259 ± 0.013 | 0.2679 ± 0.0129 |
| SciQ | 0.920 ± 0.009 | 0.9100 ± 0.0091 |
| LogiQA | 0.227 ± 0.016 | N/A |
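To judge whether the differences between 1B_base and 1B_sft are larger than their reported uncertainties, a rough check is to divide each difference by the combined standard error. The sketch below copies the values from the two tables above and treats the two estimates as independent, which is a simplifying assumption (both models are scored on the same test items, so a paired test would be more precise). LogiQA is omitted because no SFT score is reported.

```python
import math

# task: (base, base_se, sft, sft_se), copied from the tables above.
FIRST_TABLE = {
    "Lambada (OpenAI)": (0.562, 0.007, 0.563, 0.007),
    "PIQA": (0.707, 0.011, 0.711, 0.011),
    "WinoGrande": (0.537, 0.014, 0.534, 0.014),
    "WSC": (0.365, 0.047, 0.365, 0.047),
    "ARC - Easy": (0.569, 0.010, 0.583, 0.010),
    "ARC - Challenge": (0.244, 0.013, 0.248, 0.013),
    "SciQ": (0.840, 0.012, 0.847, 0.011),
}
SECOND_TABLE = {
    "Lambada (OpenAI)": (0.507, 0.007, 0.4722, 0.007),
    "PIQA": (0.705, 0.011, 0.7165, 0.0105),
    "WinoGrande": (0.532, 0.014, 0.5343, 0.014),
    "WSC": (0.365, 0.047, 0.5000, 0.0493),
    "ARC - Easy": (0.594, 0.010, 0.6010, 0.010),
    "ARC - Challenge": (0.259, 0.013, 0.2679, 0.0129),
    "SciQ": (0.920, 0.009, 0.9100, 0.0091),
}


def z_score(base, base_se, sft, sft_se):
    """Difference (sft - base) in units of the combined standard error."""
    return (sft - base) / math.sqrt(base_se ** 2 + sft_se ** 2)


def summarize(table):
    """Return (task, delta, z) for every task in the table."""
    return [
        (task, sft - base, z_score(base, base_se, sft, sft_se))
        for task, (base, base_se, sft, sft_se) in table.items()
    ]


for name, table in (("first table", FIRST_TABLE), ("second table", SECOND_TABLE)):
    print(name)
    for task, delta, z in summarize(table):
        flag = " *" if abs(z) >= 2 else ""
        print(f"  {task:<18} delta={delta:+.4f}  z={z:+.2f}{flag}")
```

Under this rough check, most differences fall well within two combined standard errors. The exceptions are in the second table: the Lambada drop exceeds that threshold, and the WSC gain sits just under it.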