language:
  - de

WIP

(Please bear with me)

Hermes + Leo + German AWQ = Germeo

Germeo-7B-AWQ

A German-English language model merged from Hermeo-7B.

Model details

Quantization Procedure and Use Case:

The specialty of this model is that it replies only in German, regardless of the system message or prompt. During the AWQ process I used OpenSchnabeltier as calibration data to stress the importance of German tokens.
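
The calibration step described above could look roughly like the sketch below. It is not the exact script used for this model: the helper names (`prepare_calibration_texts`, `quantize_with_german_calibration`), the 4-bit settings in `quant_config`, the sample limits and both paths are assumptions for illustration. It only assumes that AutoAWQ's `quantize` accepts a list of strings as `calib_data`, and that you have already loaded the German calibration texts as plain strings.

```python
from typing import Iterable, List


def prepare_calibration_texts(
    texts: Iterable[str], max_samples: int = 128, min_chars: int = 32
) -> List[str]:
    """Strip, deduplicate and cap raw texts before using them for calibration."""
    seen = set()
    cleaned: List[str] = []
    for text in texts:
        text = text.strip()
        # Skip very short samples and exact duplicates
        if len(text) < min_chars or text in seen:
            continue
        seen.add(text)
        cleaned.append(text)
        if len(cleaned) >= max_samples:
            break
    return cleaned


def quantize_with_german_calibration(
    base_model_path: str, output_path: str, texts: Iterable[str]
) -> None:
    """Quantize a base model with AWQ, calibrating on the given German texts."""
    # Imported here so the text helper above works without AutoAWQ installed
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    # Assumed settings (common AutoAWQ defaults), not confirmed for this model
    quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

    model = AutoAWQForCausalLM.from_pretrained(base_model_path)
    tokenizer = AutoTokenizer.from_pretrained(base_model_path, trust_remote_code=True)

    # The German texts replace the default calibration set
    model.quantize(
        tokenizer,
        quant_config=quant_config,
        calib_data=prepare_calibration_texts(texts),
    )

    model.save_quantized(output_path)
    tokenizer.save_pretrained(output_path)
```

The idea is that AWQ picks its scaling from activations seen during calibration, so feeding it German text should bias the quantized weights towards preserving German output.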

Usage

# Set up AutoAWQ first: https://github.com/casper-hansen/AutoAWQ
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer, TextStreamer

quant_path = "aari1995/germeo-7b-awq"

# Load model
model = AutoAWQForCausalLM.from_quantized(quant_path, fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(quant_path, trust_remote_code=True)

Inference:

streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

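# Prompt template with a system turn, a user turn, and an open assistant turn
prompt_template = """\
<|system|>
You're a helpful assistant</s>
<|user|>
{prompt}</s>
<|assistant|>"""

# German prompt: "Write a job advertisement for a Data Scientist at AXA!"
prompt = "Schreibe eine Stellenanzeige für Data Scientist bei AXA!"

# Convert the formatted prompt to token IDs and move them to the GPU
tokens = tokenizer(
    prompt_template.format(prompt=prompt),
    return_tensors="pt"
).input_ids.cuda()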

# Generate output; the streamer prints the text as it is produced
generation_output = model.generate(
    tokens,
    streamer=streamer,
    max_new_tokens=1012
)
# tokenizer.decode(generation_output.flatten())
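
If you want to reuse the prompt format with different system or user messages, a small helper like the one below may be handy. It is just a sketch: `build_prompt` and `DEFAULT_SYSTEM` are names I made up for illustration, and the function only reproduces the template shown above.

```python
DEFAULT_SYSTEM = "You're a helpful assistant"


def build_prompt(user_message: str, system_message: str = DEFAULT_SYSTEM) -> str:
    """Format one system turn and one user turn, ending with the open assistant tag."""
    return (
        f"<|system|>\n{system_message}</s>\n"
        f"<|user|>\n{user_message}</s>\n"
        "<|assistant|>"
    )


# Print the formatted prompt for the example request from above
print(build_prompt("Schreibe eine Stellenanzeige für Data Scientist bei AXA!"))
```

The returned string can be passed to the tokenizer in place of `prompt_template.format(prompt=prompt)`.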

Acknowledgements and Special Thanks

Evaluation and Benchmarks

TBA