moxin-org/moxin-chat-7b

piuzha committed on Dec 10, 2024

Commit f02a1dd · verified · 1 Parent(s): a14b4fb

Update README.md

Files changed (1)

README.md +25 -0

@@ -59,6 +59,31 @@ print(sequences[0]['generated_text'])
 ## Chat template
+The chat template can be applied with the tokenizer's `apply_chat_template()` method:
+```
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+# Use the GPU when available; fall back to CPU otherwise.
+device = "cuda" if torch.cuda.is_available() else "cpu"
+model = AutoModelForCausalLM.from_pretrained("moxin-org/moxin-chat-7b").to(device)
+tokenizer = AutoTokenizer.from_pretrained("moxin-org/moxin-chat-7b")
+
+# Conversation history; the final user turn is the one the model answers.
+messages = [
+    {"role": "user", "content": "What is your favourite condiment?"},
+    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
+    {"role": "user", "content": "Do you have mayonnaise recipes?"}
+]
+
+# Render the conversation with the model's chat template and tokenize it.
+encodeds = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
+model_inputs = encodeds.to(device)
+
+# Sample up to 1000 new tokens, then decode the prompt plus completion to text.
+generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
+decoded = tokenizer.batch_decode(generated_ids)
+print(decoded[0])
+```
 ## Evaluation
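
Note on the snippet added in this commit: a chat template turns a list of role-tagged messages into the single prompt string the model was trained on. The sketch below illustrates that idea in plain Python. The function `render_chat` is hypothetical, and the `<s>`, `</s>`, `[INST]`, and `[/INST]` markers are an assumption modeled on Mistral-style templates; the actual format for this model is whatever its tokenizer's chat template defines and may differ.

```python
from typing import Dict, List

# Hypothetical markers, modeled on Mistral-style templates. The real ones
# come from the tokenizer's chat template and may differ for this model.
BOS, EOS = "<s>", "</s>"


def render_chat(messages: List[Dict[str, str]]) -> str:
    """Flatten role-tagged messages into a single prompt string."""
    parts = [BOS]
    for i, msg in enumerate(messages):
        # This sketch requires strict user/assistant alternation.
        expected = "user" if i % 2 == 0 else "assistant"
        if msg["role"] != expected:
            raise ValueError("roles must alternate user/assistant, starting with user")
        if msg["role"] == "user":
            parts.append(f"[INST] {msg['content']} [/INST]")
        else:
            # Assistant turns are closed with the end-of-sequence marker.
            parts.append(f" {msg['content']}{EOS}")
    return "".join(parts)


messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Lemon juice."},
    {"role": "user", "content": "Do you have mayonnaise recipes?"},
]
prompt = render_chat(messages)
print(prompt)
```

To see the string the real template produces, calling `tokenizer.apply_chat_template(messages, tokenize=False)` returns the rendered text instead of token IDs.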