---
language:
- de
---
# ***WIP***
(Please bear with me)


_Hermes + Leo + German AWQ = Germeo_

# Germeo-7B-AWQ

A German-English language model derived from [Hermeo-7B](https://huggingface.co/malteos/hermeo-7b) via AWQ quantization.

### Model details

- **Merged from:** [leo-mistral-hessianai-7b-chat](https://huggingface.co/LeoLM/leo-mistral-hessianai-7b-chat) and [DPOpenHermes-7B-v2](https://huggingface.co/openaccess-ai-collective/DPOpenHermes-7B-v2)
- **Model type:** Causal decoder-only transformer language model
- **Languages:** German replies with English understanding capabilities
- **Calibration data:** [LeoLM/OpenSchnabeltier](https://huggingface.co/datasets/LeoLM/OpenSchnabeltier)

### Quantization Procedure and Use Case

The specialty of this model is that it replies solely in German, regardless of the system message or prompt.
During the AWQ process, I used OpenSchnabeltier as calibration data to stress the importance of German tokens.

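
The calibration step described above can be sketched as follows. The helper, its field names (`instruction`, `response`), and the sample rows are hypothetical placeholders, not the real OpenSchnabeltier schema; the commented `quantize` call follows AutoAWQ's interface but is shown here only as an assumed outline.

```python
# Hedged sketch: turning German instruction/response pairs into plain
# calibration strings for AWQ. Field names are hypothetical placeholders.

def build_calib_texts(rows, max_texts=128):
    """Join each instruction/response pair into one plain string, capped at max_texts samples."""
    texts = []
    for row in rows:
        texts.append(f"{row['instruction']}\n{row['response']}")
        if len(texts) >= max_texts:
            break
    return texts

# Tiny inline stand-in for OpenSchnabeltier rows:
rows = [
    {"instruction": "Was ist ein Schnabeltier?",
     "response": "Das Schnabeltier ist ein eierlegendes Säugetier aus Australien."},
]
calib_texts = build_calib_texts(rows)

# With AutoAWQ, the strings would then be passed to the quantizer, e.g.:
#   model = AutoAWQForCausalLM.from_pretrained("malteos/hermeo-7b")
#   model.quantize(tokenizer,
#                  quant_config={"zero_point": True, "q_group_size": 128,
#                                "w_bit": 4, "version": "GEMM"},
#                  calib_data=calib_texts)
```

Passing German text as `calib_data` is what biases the activation statistics (and hence the quantization scales) toward German tokens.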

### Usage

```python
# Requires AutoAWQ: https://github.com/casper-hansen/AutoAWQ
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer, TextStreamer

quant_path = "aari1995/germeo-7b-awq"

# Load the quantized model and its tokenizer
model = AutoAWQForCausalLM.from_quantized(quant_path, fuse_layers=True)
tokenizer = AutoTokenizer.from_pretrained(quant_path, trust_remote_code=True)
```

### Inference

```python
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

# Build the prompt in the model's chat format
prompt_template = """\
<|system|>
You're a helpful assistant</s>
<|user|>
{prompt}</s>
<|assistant|>"""

prompt = "Schreibe eine Stellenanzeige für Data Scientist bei AXA!"

# Convert the prompt to input token ids on the GPU
tokens = tokenizer(
    prompt_template.format(prompt=prompt),
    return_tensors='pt'
).input_ids.cuda()

# Generate output (streamed token by token)
generation_output = model.generate(
    tokens,
    streamer=streamer,
    max_new_tokens=1012
)
# Alternatively, decode the full output at once:
# tokenizer.decode(generation_output.flatten())
```
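
For multi-turn conversations, the single-turn template above generalizes naturally. The sketch below is stdlib-only; the turn-joining scheme is inferred from the template and is an assumption, not the model's official chat template.

```python
def build_prompt(messages):
    """Render a list of {"role", "content"} dicts into the
    <|system|>/<|user|>/<|assistant|> format shown above."""
    parts = [f"<|{m['role']}|>\n{m['content']}</s>" for m in messages]
    parts.append("<|assistant|>")  # leave the assistant turn open for generation
    return "\n".join(parts)

prompt = build_prompt([
    {"role": "system", "content": "You're a helpful assistant"},
    {"role": "user", "content": "Schreibe eine Stellenanzeige für Data Scientist bei AXA!"},
])
print(prompt)
```

For a single user turn this reproduces `prompt_template.format(prompt=...)` exactly; earlier assistant replies can be appended as additional `{"role": "assistant", ...}` entries.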

### Acknowledgements and Special Thanks

- Thank you [malteos](https://huggingface.co/malteos) for Hermeo, without which this would not have been possible! (and for all your other contributions)
- Thanks to the authors of the base models: [Mistral](https://mistral.ai/), [LAION](https://laion.ai/), [HessianAI](https://hessian.ai/), [Open Access AI Collective](https://huggingface.co/openaccess-ai-collective), [@teknium](https://huggingface.co/teknium), [@bjoernp](https://huggingface.co/bjoernp)
- Thanks also to [@bjoernp](https://huggingface.co/bjoernp) for your contribution, and to LeoLM for OpenSchnabeltier.

## Evaluation and Benchmarks

TBA