gemma-2-2b-it-fix-system-role

A modified version of gemma-2-2b-it with an updated chat_template that supports the system role. The update handles the following cases from the original template:

  • Conversation roles must alternate user/assistant/user/assistant/...
  • System role not supported
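
To make the two cases concrete, here is a minimal sketch of one common way to add system-role support to a Gemma-style prompt: fold a leading system message into the first user turn, after which the remaining turns line up with the user/assistant alternation again. This is an illustration under assumptions, not a copy of this model's actual chat_template. The function name `render_gemma_prompt` and the merge-into-first-user strategy are hypothetical.

```python
def render_gemma_prompt(messages, add_generation_prompt=True):
    """Render messages in Gemma 2 turn format, tolerating a leading system message.

    Assumption: the system text is prepended to the first user turn. The real
    chat_template shipped with the model may handle it differently.
    """
    messages = list(messages)
    system_text = ""
    if messages and messages[0]["role"] == "system":
        system_text = messages[0]["content"].strip()
        messages = messages[1:]

    parts = ["<bos>"]
    for index, message in enumerate(messages):
        # After the system message is removed, turns must alternate again.
        expected = "user" if index % 2 == 0 else "assistant"
        if message["role"] != expected:
            raise ValueError(
                "Conversation roles must alternate user/assistant/user/assistant/..."
            )
        content = message["content"].strip()
        if index == 0 and system_text:
            content = f"{system_text}\n\n{content}"
        # Gemma names the assistant role "model" in its turn markup.
        role = "model" if message["role"] == "assistant" else "user"
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")
    if add_generation_prompt:
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


prompt = render_gemma_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
])
print(prompt)
```

To see what the model's own template produces, render the same messages with `tokenizer.apply_chat_template(..., tokenize=False)` and compare the result against this sketch.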

Model Overview

  • Model Architecture: Gemma 2
    • Input: Text
    • Output: Text
  • Release Date: 04/12/2024
  • Version: 1.0

Deployment

Use with vLLM

This model can be deployed efficiently using the vLLM backend, as shown in the example below.

With CLI:

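# Start the OpenAI-compatible server (the model is passed as a positional argument)
vllm serve dangvansam/gemma-2-2b-it-fix-system-role

# From a second terminal, send a chat request
curl http://localhost:8000/v1/chat/completions \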
-H "Content-Type: application/json" \
-d '{
  "model": "dangvansam/gemma-2-2b-it-fix-system-role",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"}
  ]
}'
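
The same request can be sent from Python. The sketch below uses only the standard library and assumes the server is reachable at `http://localhost:8000`, as in the curl call. The helper names `build_chat_request` and `send_chat_request` are hypothetical, and the response is read from `choices[0].message.content`, following the OpenAI chat completions response shape.

```python
import json
import urllib.request

# Assumption: the vLLM server is running locally on port 8000.
BASE_URL = "http://localhost:8000/v1"
MODEL_ID = "dangvansam/gemma-2-2b-it-fix-system-role"


def build_chat_request(messages, model=MODEL_ID, base_url=BASE_URL):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(messages, timeout=60):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(messages)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]
request = build_chat_request(messages)

# Uncomment once the server is running:
# print(send_chat_request(messages))
```

Building the request separately from sending it makes the payload easy to inspect when the server rejects a conversation.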

With Python:

from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

model_id = "dangvansam/gemma-2-2b-it-fix-system-role"

# Sampling settings used for generation
sampling_params = SamplingParams(temperature=0.6, top_p=0.9, max_tokens=256)

# The tokenizer carries the updated chat_template
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "Who are you?"}
]

# Render the conversation to one prompt string that ends with the assistant turn header
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

llm = LLM(model=model_id)

outputs = llm.generate(prompt, sampling_params)

# One prompt was submitted, so read the first completion of the first result
generated_text = outputs[0].outputs[0].text
print(generated_text)
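
Before running generation, it can be worth checking that the rendered prompt really contains the system text, since that is the behavior the updated chat_template is meant to add. The helper below is a hypothetical, tokenizer-independent sketch: it takes whatever string `apply_chat_template` returned and reports any message whose content is missing. The sample prompt is a hand-written stand-in, not output captured from this model.

```python
def missing_contents(prompt, messages):
    """Return the contents of messages that do not appear verbatim in the prompt."""
    return [
        message["content"]
        for message in messages
        if message["content"].strip() not in prompt
    ]


messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]

# Hand-written stand-in for a rendered prompt (assumption, not real model output).
sample_prompt = (
    "<bos><start_of_turn>user\n"
    "You are a helpful assistant.\n\n"
    "Who are you?<end_of_turn>\n"
    "<start_of_turn>model\n"
)

missing = missing_contents(sample_prompt, messages)
print("missing from prompt:", missing)
```

An empty list means every message reached the prompt. A non-empty list points at the message the template dropped.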

vLLM also supports OpenAI-compatible serving. See the documentation for more details.

Model size: 2.61B params (Safetensors)
Tensor type: BF16

Model tree for dangvansam/gemma-2-2b-it-fix-system-role

  • Base model: google/gemma-2-2b
  • This model: fine-tuned from the base model