openchat/openchat_3.5

Text Generation

text-generation-inference


Instructions for using openchat/openchat_3.5 with libraries and local apps.

Libraries

How to use openchat/openchat_3.5 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="openchat/openchat_3.5")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so it is visible when run as a script, not only in a notebook
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("openchat/openchat_3.5")
model = AutoModelForCausalLM.from_pretrained("openchat/openchat_3.5")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
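
The final line above decodes only the newly generated tokens: for a causal language model, `generate` returns the prompt tokens followed by the continuation, so slicing at the prompt length drops the echoed input. A minimal sketch of that slicing, with plain Python lists standing in for tensors (the token IDs and the helper name `new_tokens` are made up for illustration and are not part of Transformers):

```python
def new_tokens(output_ids, prompt_len):
    """Return only the tokens produced after the prompt."""
    return output_ids[prompt_len:]


prompt_ids = [1, 420, 6316, 28747]       # hypothetical prompt token IDs
output_ids = prompt_ids + [315, 837, 2]  # prompt echoed back, then the continuation

print(new_tokens(output_ids, len(prompt_ids)))  # [315, 837, 2]
```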

Local Apps

How to use openchat/openchat_3.5 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "openchat/openchat_3.5"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "openchat/openchat_3.5",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
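
The same request can be made from Python. The sketch below uses only the standard library and assumes the server from the step above is listening on `localhost:8000`; the helper names `build_chat_request` and `send` are hypothetical, not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_message):
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_chat_request(
    "http://localhost:8000",
    "openchat/openchat_3.5",
    "What is the capital of France?",
)
# Calling send(request) needs a running server, so it is left out of this sketch.
```

Keeping request construction separate from sending makes it possible to inspect the payload without a running server.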

Use Docker

docker model run hf.co/openchat/openchat_3.5

How to use openchat/openchat_3.5 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "openchat/openchat_3.5" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "openchat/openchat_3.5",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "openchat/openchat_3.5" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "openchat/openchat_3.5",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
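
The server replies with JSON in the OpenAI chat-completion shape, where the generated text sits under `choices[0].message.content`. A minimal sketch of pulling out the reply and the stop reason follows; the helper name `extract_reply` is hypothetical, and the sample body is hand-written for illustration rather than captured from a real server.

```python
import json


def extract_reply(response_body):
    """Return (content, finish_reason) of the first choice in a chat completion response."""
    data = json.loads(response_body)
    choice = data["choices"][0]
    return choice["message"]["content"], choice["finish_reason"]


# Hand-written sample in the OpenAI chat-completion shape, not real server output.
sample = json.dumps({
    "model": "openchat/openchat_3.5",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Paris."},
            "finish_reason": "stop",
        }
    ],
})

content, finish_reason = extract_reply(sample)
print(content)        # Paris.
print(finish_reason)  # stop
```

A `finish_reason` of `length` instead of `stop` indicates the reply was cut off by the token limit.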

Docker Model Runner
How to use openchat/openchat_3.5 with Docker Model Runner:
```
docker model run hf.co/openchat/openchat_3.5
```

Resources


add AIBOM

#50 opened 11 months ago by

Online demo gone

#49 opened 12 months ago by

Adding `safetensors` variant of this model

#48 opened over 1 year ago by

Teenage friends

#47 opened over 1 year ago by

Questions about details of dataset usage for reproducing the openchat_3.5

#46 opened about 2 years ago by

Incorrect system prompt in `tokenizer.chat_template`?

#45 opened about 2 years ago by

Which model is your demo page using?

#44 opened over 2 years ago by

What is the best way for the inference process in LORA in PEFT approach

#43 opened over 2 years ago by

Which is the actual way to store the Adapter after PEFT finetuning

#42 opened over 2 years ago by

With use_cache=False, the response is taking very long

#41 opened over 2 years ago by

Adding `safetensors` variant of this model

#40 opened over 2 years ago by

PEFT based Fine Tuned model hallucinates values from the fine tuning training data while inferencing

#39 opened over 2 years ago by

should we follow the same openchat prompt structure while finetuning time?

#38 opened over 2 years ago by

Incomplete Output even with max_new_tokens

#37 opened over 2 years ago by

Adding `safetensors` variant of this model

#36 opened over 2 years ago by

Great. But you need to filter out URL sources and other gratuitous info.

#35 opened over 2 years ago by deleted

How did Mixtral make openchat_3.5 worse?

#34 opened over 2 years ago by

Some feedback

#33 opened over 2 years ago by

Is the API or local model exactly the same as huggingface.co/chat?

#32 opened over 2 years ago by

This is the greatest model for chat yet

#31 opened over 2 years ago by iNeverLearnedHowToRead

Adding `safetensors` variant of this model

#30 opened over 2 years ago by

Internal Server Error

#28 opened over 2 years ago by

function calling

#24 opened over 2 years ago by

Adding `safetensors` variant of this model

#22 opened over 2 years ago by

[AUTOMATED] Model Memory Requirements

#20 opened over 2 years ago by model-sizer-bot

Best chat model

#19 opened over 2 years ago by

Add chat_template to tokenizer_config.json

#18 opened over 2 years ago by

Does this model have any reasoning ability?

#17 opened over 2 years ago by

Overfit on ChatGPT data

#15 opened over 2 years ago by

What base model is it based on?

#14 opened over 2 years ago by

Add widget example

#13 opened over 2 years ago by

Why does it report an error like this when running?

#12 opened over 2 years ago by

Request for Model Sharding of OpenChat 3.5 for Improved Accessibility

#10 opened over 2 years ago by

Great. Now make 128k version like they done with Mistral lately : )

#8 opened over 2 years ago by

Free and ready to use openchat_3.5-GGUF model as OpenAI API compatible endpoint

#7 opened over 2 years ago by

This might help for your next model...

#6 opened over 2 years ago by

How to setup system message

#5 opened over 2 years ago by fernandofernandes

Hallucinations

#2 opened over 2 years ago by