---
library_name: peft
base_model: LSX-UniWue/LLaMmlein_1B
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: LLaMmlein_1b_chat_guanako
  results: []
datasets:
- LSX-UniWue/Guanako
language:
- de
license: other
---

# LLäMmlein 1B Chat

This is a chat adapter for LLäMmlein 1B, a German TinyLlama-style 1B language model, trained on the [Guanako](https://huggingface.co/datasets/LSX-UniWue/Guanako) dataset.
Find more details on our [project page](https://www.informatik.uni-wuerzburg.de/datascience/projects/nlp/llammlein/) and in our [preprint](https://arxiv.org/abs/2411.11171)!
We also merged the adapter into the base model and converted it to GGUF; both variants are available [here](https://huggingface.co/LSX-UniWue/LLaMmlein_1B_alternative_formats).
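
If you'd rather work with a standalone checkpoint than load the adapter at runtime, you can also merge the adapter into the base weights yourself with PEFT's `merge_and_unload`. This is a minimal sketch (the output path is arbitrary), not the exact script behind the published merge:

```py
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "LSX-UniWue/LLaMmlein_1B",
    torch_dtype=torch.bfloat16,
)
base.resize_token_embeddings(32064)  # match the chat tokenizer's vocabulary

# load the adapter, then fold its weights into the base model
merged = PeftModel.from_pretrained(
    base, "LSX-UniWue/LLaMmlein_1B_chat_guanako"
).merge_and_unload()

# save model and tokenizer to an arbitrary local directory
merged.save_pretrained("LLaMmlein_1B_chat_merged")
AutoTokenizer.from_pretrained(
    "LSX-UniWue/LLaMmlein_1B_chat_guanako"
).save_pretrained("LLaMmlein_1B_chat_merged")
```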

## Run it
```py
import torch
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

torch.manual_seed(42)

# script config
base_model_name = "LSX-UniWue/LLaMmlein_1B"
chat_adapter_name = "LSX-UniWue/LLaMmlein_1B_chat_guanako"
device = "cuda"  # or mps

# chat history
messages = [
    {
        "role": "user",
        "content": """Na wie geht's?""",
    },
]

# load model
config = PeftConfig.from_pretrained(chat_adapter_name)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,
    device_map=device,
)
base_model.resize_token_embeddings(32064)  # grow the embeddings to match the chat tokenizer's vocabulary
model = PeftModel.from_pretrained(base_model, chat_adapter_name)
tokenizer = AutoTokenizer.from_pretrained(chat_adapter_name)

# encode message in "ChatML" format
chat = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
).to(device)

# generate response
print(
    tokenizer.decode(
        model.generate(
            chat,
            max_new_tokens=300,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
        )[0],
        skip_special_tokens=False,
    )
)

```
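
The `generate` call above decodes greedily. For more varied replies you can enable sampling; the parameter values below are illustrative, not tuned defaults:

```py
output = model.generate(
    chat,
    max_new_tokens=300,
    do_sample=True,    # sample instead of greedy decoding
    temperature=0.7,   # illustrative value
    top_p=0.9,         # illustrative value
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output[0], skip_special_tokens=False))
```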