The bot can't differentiate between me commenting on the situation and me talking to the bot.
Example:
Me: I eat a popcorn while we watch the movie
Bot: Sure you can eat a popcorn while we watch the movie
I mean, it is not a big problem; I'm just providing feedback. The model and the Colab work very well! I hope you will improve it more. Thank you for your work.
Add * to the start and end of the action, without spaces. This has worked every time I've used it in testing, and it again beat out OPT at recognizing the *YOUR TEXT HERE PLS* as an action.
*I eat popcorn* this tastes great.
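For anyone scripting their prompts, the asterisk convention can be sketched as a tiny helper. This is only an illustration: `format_turn` is a made-up name, not part of any Pygmalion or Gradio API, and it assumes the action and the spoken text are already available as separate strings.

```python
def format_turn(action: str = "", speech: str = "") -> str:
    """Build one user message: *action* (no spaces inside the asterisks) plus speech."""
    parts = []
    action = action.strip()  # stray spaces next to the asterisks are what we want to avoid
    if action:
        parts.append(f"*{action}*")
    speech = speech.strip()
    if speech:
        parts.append(speech)
    return " ".join(parts)


print(format_turn("I eat popcorn", "this tastes great."))
# -> *I eat popcorn* this tastes great.
```

Whether the model treats the marked span as an action still depends on the model and the sampling settings, so treat this as a formatting aid rather than a guarantee.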
Tried it. Sometimes it works, yeah, thanks. I don't understand the latter part of your sentence, though. Is it a configuration option in the Gradio app?
Thank you! One more question, please. Is it possible to deal with the model's very short-term memory? It only remembers 2-3 lines for me. Is it a hardware thing, or does the model need more training?
I'm not sure what you mean by that; it depends more on the software you're using to run inference on the model. Look up KoboldAI, another piece of software that lets you run this and other models with much better chat features and access to a bunch of settings. The defaults work best with this model, but you can still play around with them. It also lets you give the model up to 2048 tokens of context (it can go higher, but don't, because it WILL collapse), which is roughly 1,500 words. That lets it hold context for a few paragraphs, and it also supports residual memory, so it can store core details in permanent memory.
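To make the "short memory" point concrete, here is a rough sketch of what a chat front end has to do before each reply: keep the permanent memory text, then keep as many of the most recent turns as fit in the token budget. Everything here is hypothetical (`trim_history`, `count_tokens_rough`, and the default limit are assumptions, not KoboldAI's actual code), and the word-based counter is only a stand-in; a real tokenizer usually produces more tokens than words.

```python
from typing import Callable, List


def count_tokens_rough(text: str) -> int:
    # Crude stand-in for a tokenizer: counts whitespace-separated words.
    return len(text.split())


def trim_history(
    memory: str,
    turns: List[str],
    max_tokens: int = 2048,  # assumed context limit; set it to what your front end allows
    count: Callable[[str], int] = count_tokens_rough,
) -> List[str]:
    """Return the newest turns that fit in the budget left after the permanent memory."""
    budget = max_tokens - count(memory)
    kept: List[str] = []
    for turn in reversed(turns):  # walk from newest to oldest
        cost = count(turn)
        if cost > budget:
            break  # older turns no longer fit, so they are "forgotten"
        kept.append(turn)
        budget -= cost
    kept.reverse()  # restore chronological order
    return kept


history = ["a b c", "d e", "f g h i"]
print(trim_history("Bot is friendly.", history, max_tokens=10))
# -> ['d e', 'f g h i']
```

A GUI that passes only the last couple of lines to the model will look like it has a 2-3 line memory regardless of hardware, which matches what was described above.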
Yeah, the GUI I used was bad... With KoboldAI and TavernAI the model can remember more things.
