Instructions to use ybelkada/mpt-7b-transformers with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use ybelkada/mpt-7b-transformers with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="ybelkada/mpt-7b-transformers")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("ybelkada/mpt-7b-transformers")
    model = AutoModelForCausalLM.from_pretrained("ybelkada/mpt-7b-transformers")
- Notebooks
- Google Colab
- Kaggle
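The Transformers snippet above only constructs the pipeline. A minimal sketch of actually generating text with it follows, reusing the prompt and sampling values from the curl examples on this page. The helper names `build_generation_kwargs` and `generate_text` are hypothetical, and the `pipe` parameter exists only so the function can be exercised without downloading the weights.

```python
from typing import Callable, Optional

MODEL_ID = "ybelkada/mpt-7b-transformers"


def build_generation_kwargs(max_new_tokens: int = 512, temperature: float = 0.5) -> dict:
    """Keyword arguments passed through the pipeline call to generate()."""
    if temperature <= 0:
        # Temperature only matters when sampling, so fall back to greedy decoding.
        return {"max_new_tokens": max_new_tokens, "do_sample": False}
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
    }


def generate_text(
    prompt: str,
    pipe: Optional[Callable] = None,
    max_new_tokens: int = 512,
    temperature: float = 0.5,
) -> str:
    """Run one prompt through a text-generation pipeline and return the text."""
    if pipe is None:
        # Imported lazily: building the pipeline downloads and loads the model.
        from transformers import pipeline

        pipe = pipeline("text-generation", model=MODEL_ID)
    outputs = pipe(prompt, **build_generation_kwargs(max_new_tokens, temperature))
    # A text-generation pipeline returns a list of dicts keyed by "generated_text".
    return outputs[0]["generated_text"]
```

Calling `generate_text("Once upon a time,")` loads the full model on first use. By default the returned string includes the prompt followed by the continuation.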
- Local Apps
- vLLM
How to use ybelkada/mpt-7b-transformers with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "ybelkada/mpt-7b-transformers"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "ybelkada/mpt-7b-transformers",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
Use Docker
docker model run hf.co/ybelkada/mpt-7b-transformers
- SGLang
How to use ybelkada/mpt-7b-transformers with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
      --model-path "ybelkada/mpt-7b-transformers" \
      --host 0.0.0.0 \
      --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "ybelkada/mpt-7b-transformers",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
Use Docker images
    docker run --gpus all \
      --shm-size 32g \
      -p 30000:30000 \
      -v ~/.cache/huggingface:/root/.cache/huggingface \
      --env "HF_TOKEN=<secret>" \
      --ipc=host \
      lmsysorg/sglang:latest \
      python3 -m sglang.launch_server \
        --model-path "ybelkada/mpt-7b-transformers" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "ybelkada/mpt-7b-transformers",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
- Docker Model Runner
How to use ybelkada/mpt-7b-transformers with Docker Model Runner:
docker model run hf.co/ybelkada/mpt-7b-transformers
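The vLLM and SGLang curl examples above send the same JSON body to `/v1/completions` and differ only in the port. As a rough sketch, the same call can be made from Python using only the standard library. The names `build_request` and `complete` are hypothetical helpers, and the response parsing assumes the server returns the usual OpenAI-style completion object with the text under `choices[0]["text"]`.

```python
import json
import urllib.request

MODEL_ID = "ybelkada/mpt-7b-transformers"

# Ports taken from the commands above: 8000 for `vllm serve`, 30000 for SGLang.
VLLM_URL = "http://localhost:8000"
SGLANG_URL = "http://localhost:30000"


def build_request(prompt, base_url=VLLM_URL, max_tokens=512, temperature=0.5):
    """Build the same POST request that the curl examples send."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, base_url=VLLM_URL, timeout=60.0, **sampling):
    """Send the request to a running server and return the first completion."""
    request = build_request(prompt, base_url=base_url, **sampling)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumed response shape: OpenAI-style, text under choices[i]["text"].
    return body["choices"][0]["text"]
```

With a server already running, `complete("Once upon a time,", base_url=SGLANG_URL)` targets the SGLang port. Leaving `base_url` at its default targets the vLLM server.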
Commit History
Update config.json 9311958
Upload MptForCausalLM 0a25285
Upload config 4d4ccd9
Upload config 3b326db
Upload config 63f2cfc
Upload config 067e6d8
Upload config e6322e4
Upload config 8400f02
Upload config d8a52d3
Upload config 1505122
Upload config 5d1916e
Delete pytorch_model.bin 6125777
Upload MptForCausalLM f6d36fd
Upload MptForCausalLM b2d5faf
initial commit 4fc6ec9
Younes Belkada committed on