Time for a new Instruct tune?
This is easily a top-tier Nemo model. It retains the ability to follow instructions and have an actual narrative voice, unlike many RP tunes. I actually prefer the Mistral formatting since it keeps the system prompt up front, not deep in the context. If I had to guess why people had bad results with Nemo to begin with, I'd say the number one issue was, and still is, the lack of 'optimal' presets in ST and of info in general covering V3 Tekken's placements.
That being said, would you consider another training run on Nemo Instruct with newer datasets, keeping Mistral's format?
Here is the template I made after reading the Mistral cookbook and a few ST docs.
-Thank You Nick!
I'm a sucker for Nemo so sure, I can try a new Gutenberg with Mistral Instruct. :)
Thanks for your feedback and sharing your template!
That was quick! I will be watching your repo >:)
I fixed the link; it was on temp storage (silly me).
Might as well post what I wrote in the Kobold discord here.
Upon further inspection of the Mistral cookbook, the default Mistral Nemo templates provided within ST are suboptimal. The system prompt can be prepended to the newest user message, before the input content.
⚠️ This is how mistral_common and the templates implement system prompts, but this can easily be customized. Feel free to use system prompts in different places, such as the second from the last or simply as the first user message, as before.
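To make the placement concrete, here is a rough sketch of a prompt builder that prepends the system prompt to the newest user message. It is not the mistral_common implementation: the function name `build_prompt` is made up for illustration, and the tag layout (`[INST]`/`[/INST]` with no surrounding spaces, a blank line between system prompt and user content) is my assumption about the V3 Tekken format, so check it against the official tokenizer before relying on it.

```python
from typing import Dict, List, Optional

BOS, EOS = "<s>", "</s>"  # assumed special-token strings


def build_prompt(messages: List[Dict[str, str]],
                 system_prompt: Optional[str] = None) -> str:
    """Hypothetical helper: render a chat as one prompt string,
    attaching the system prompt to the newest user message."""
    # Index of the last user turn (-1 if there is none).
    last_user = max(
        (i for i, m in enumerate(messages) if m["role"] == "user"),
        default=-1,
    )
    parts = [BOS]
    for i, msg in enumerate(messages):
        if msg["role"] == "user":
            content = msg["content"]
            if system_prompt and i == last_user:
                # Assumed separator: system prompt, blank line, user input.
                content = f"{system_prompt}\n\n{content}"
            parts.append(f"[INST]{content}[/INST]")
        elif msg["role"] == "assistant":
            parts.append(f"{msg['content']}{EOS}")
        else:
            raise ValueError(f"unsupported role: {msg['role']}")
    return "".join(parts)


chat = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "Tell me a story."},
]
print(build_prompt(chat, "You are a narrator."))
```

Moving the system prompt to the first user message instead would only require changing which index gets the prefix.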
Character post-history and default post-history should be disabled and placed into their respective system prompt fields instead, or moved elsewhere. Ideally there would be an ST script to catch random depth injections from character notes and put them in a sane spot (I might revisit this?).
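As a rough idea of what such a cleanup could look like, the sketch below works on a plain list of role/content messages rather than on ST's actual data structures or scripting API, which I am not reproducing here. The function name `hoist_system_notes` is hypothetical, and it assumes injected notes show up as `system`-role messages somewhere in the chat history.

```python
from typing import Dict, List, Tuple


def hoist_system_notes(messages: List[Dict[str, str]],
                       base_system: str = "") -> Tuple[str, List[Dict[str, str]]]:
    """Hypothetical helper: pull system-role notes out of the chat history
    and merge them into a single system prompt."""
    # Collect every injected note, in the order it appeared.
    notes = [m["content"] for m in messages if m["role"] == "system"]
    # Keep only the real user/assistant turns in the history.
    chat = [m for m in messages if m["role"] != "system"]
    # Join the base prompt and notes, skipping empty pieces.
    system_prompt = "\n\n".join(p for p in [base_system, *notes] if p)
    return system_prompt, chat


history = [
    {"role": "user", "content": "Hi"},
    {"role": "system", "content": "Character note: speaks formally."},
    {"role": "assistant", "content": "Good day."},
    {"role": "user", "content": "Continue."},
]
system_prompt, chat = hoist_system_notes(history, "You are a narrator.")
print(system_prompt)
print(len(chat))
```

The merged system prompt can then be placed wherever the template puts it, instead of sitting at a random depth in the context.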


