# LLäMmlein 1B Chat

This is a chat adapter for the German TinyLlama 1B language model.
Find more details on our [page](https://www.informatik.uni-wuerzburg.de/datascience/projects/nlp/llammlein/) and our [preprint](https://arxiv.org/abs/2411.11171)!

## Run it
```py
import torch
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.manual_seed(42)

# script config
base_model_name = "LSX-UniWue/llammchen_1b"
chat_adapter_name = "LSX-UniWue/LLaMmlein_1B_chat_selected"
device = "mps"  # or "cuda"

# chat history ("Na wie geht's?" = "So, how's it going?")
messages = [
    {
        "role": "user",
        "content": """Na wie geht's?""",
    },
]

# load the base model, then attach the chat adapter on top of it
config = PeftConfig.from_pretrained(chat_adapter_name)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    attn_implementation="flash_attention_2" if device == "cuda" else None,
    torch_dtype=torch.bfloat16,
    device_map=device,
)
base_model.resize_token_embeddings(32064)  # make room for the adapter's added chat tokens
model = PeftModel.from_pretrained(base_model, chat_adapter_name)
tokenizer = AutoTokenizer.from_pretrained(chat_adapter_name)

# encode the message in "ChatML" format
chat = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
).to(device)

# generate and decode the response
print(
    tokenizer.decode(
        model.generate(
            chat,
            max_new_tokens=300,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
        )[0],
        skip_special_tokens=False,
    )
)
```
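
Since the chat model ships as a PEFT adapter on top of the base weights, you can optionally merge the two after loading so inference no longer routes through the adapter indirection. A minimal sketch building on the snippet above, assuming a LoRA-style adapter (the output directory name is just an example):

```py
# optional: fold the adapter weights into the base model (LoRA-style adapters only)
merged_model = model.merge_and_unload()

# save the standalone merged model together with its tokenizer
# ("llammlein_1b_chat_merged" is an illustrative path, not an official artifact)
merged_model.save_pretrained("llammlein_1b_chat_merged")
tokenizer.save_pretrained("llammlein_1b_chat_merged")
```

The merged directory can then be loaded directly with `AutoModelForCausalLM.from_pretrained` without `peft` installed.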