How to use deepseek-ai/DeepSeek-V2 with libraries and local apps.

Libraries

How to use deepseek-ai/DeepSeek-V2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="deepseek-ai/DeepSeek-V2", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so the generated text is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V2", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-V2", trust_remote_code=True)
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
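The final line above slices `outputs[0]` before decoding because, for decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones. The sketch below illustrates that slicing with plain Python lists so it runs without downloading the model; the token ids and the `strip_prompt` helper are hypothetical and only stand in for real tensors.

```python
def strip_prompt(generated_ids, prompt_length):
    """Return only the tokens that follow the prompt in a generated sequence."""
    return generated_ids[prompt_length:]


# Hypothetical token ids, used only to illustrate the slicing step.
prompt_ids = [101, 102, 103]
generated_ids = prompt_ids + [7, 8, 9]  # prompt followed by new tokens

new_tokens = strip_prompt(generated_ids, len(prompt_ids))
print(new_tokens)
```

Decoding only the sliced part keeps the original question out of the printed answer.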

Local Apps

vLLM

How to use deepseek-ai/DeepSeek-V2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "deepseek-ai/DeepSeek-V2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "deepseek-ai/DeepSeek-V2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
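The same request can be sent from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server from the commands above is listening on `localhost:8000`, and the helper names `build_chat_request` and `send_chat_request` are illustrative, not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt,
                       base_url="http://localhost:8000",
                       model="deepseek-ai/DeepSeek-V2"):
    """Build a POST request equivalent to the curl command above."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires a running server):
# reply = send_chat_request(build_chat_request("What is the capital of France?"))
```

Splitting request construction from sending makes the payload easy to inspect without a running server.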

Use Docker

docker model run hf.co/deepseek-ai/DeepSeek-V2

SGLang

How to use deepseek-ai/DeepSeek-V2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "deepseek-ai/DeepSeek-V2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "deepseek-ai/DeepSeek-V2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
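Because the endpoint is OpenAI-compatible, the reply text is expected under `choices[0].message.content` in the JSON response. The sketch below shows one way to extract it; the `extract_reply` helper is hypothetical, and the sample response is a trimmed placeholder rather than real server output.

```python
import json

# Placeholder response in the OpenAI chat-completions shape; not real model output.
sample_response = json.loads("""
{
  "model": "deepseek-ai/DeepSeek-V2",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "<model reply text>"}
    }
  ]
}
""")


def extract_reply(response):
    """Return the assistant text from a chat-completions response dict."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


print(extract_reply(sample_response))
```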

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "deepseek-ai/DeepSeek-V2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "deepseek-ai/DeepSeek-V2",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use deepseek-ai/DeepSeek-V2 with Docker Model Runner:
```
docker model run hf.co/deepseek-ai/DeepSeek-V2
```

DeepSeek-V2

Commit History

docs: update README.md
4461458 (verified), Benjamin-eecs committed on Jun 8, 2024

docs: update README.md
f9b5644 (verified), Benjamin-eecs committed on Jun 7, 2024

training: fix type mismatch when training (#6)
94cf30b (verified), luofuli, Jack477 committed on May 27, 2024

Update code & readme
e0828e3, msr2000 committed on May 9, 2024

Update README.md
3d3dec2 (verified), luofuli committed on May 8, 2024

update citation info
1e51912 (verified), DeepSeekDDM committed on May 8, 2024

Update README.md
3b7037d, msr2000 committed on May 7, 2024

Update README.md
ca159e6, msr2000 committed on May 7, 2024

Update README.md
d734100 (verified), DeepSeekDDM committed on May 7, 2024

Update model names
44f7caf, msr2000 committed on May 6, 2024

Update README.md
3d445e7 (verified), luofuli committed on May 6, 2024

Update README.md
70cd796 (verified), luofuli committed on May 6, 2024

Update README.md
e8c35ff, 马仕镕 committed on May 6, 2024

Update README.md
1142605 (verified), zqh11 committed on May 6, 2024

Update README.md
dd5954a (verified), zqh11 committed on May 6, 2024

Update README.md
b3333a6 (verified), luofuli committed on May 6, 2024

Upload LICENSE
e63dbae (verified), luofuli committed on May 6, 2024

Upload folder using huggingface_hub
4c2b2d1 (verified), msr2000 committed on May 6, 2024

initial commit
b129598 (verified), luofuli committed on Apr 22, 2024
