|
--- |
|
license: other |
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
inference: false |
|
tags: |
|
- transformers |
|
- gguf |
|
- imatrix |
|
- Saul-Instruct-v1 |
|
--- |
|
Quantizations of https://huggingface.co/Equall/Saul-Instruct-v1 |
|
|
|
Note: not sure why but Q2_K, Q3_K_S, Q4_0 and Q5_0 gave error during quantizations: "ggml_validate_row_data: found nan value at block xxx", so I skipped those quants. |
|
|
|
# From original readme |
|
|
|
## Uses |
|
|
|
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. --> |
|
You can use it for legal use cases that involves generation. |
|
|
|
Here's how you can run the model using the pipeline() function from 🤗 Transformers: |
|
|
|
```python |
|
|
|
# Install transformers from source - only needed for versions <= v4.34 |
|
# pip install git+https://github.com/huggingface/transformers.git |
|
# pip install accelerate |
|
|
|
import torch |
|
from transformers import pipeline |
|
|
|
pipe = pipeline("text-generation", model="Equall/Saul-Instruct-v1", torch_dtype=torch.bfloat16, device_map="auto") |
|
# We use the tokenizer’s chat template to format each message - see https://huggingface.co/docs/transformers/main/en/chat_templating |
|
messages = [ |
|
{"role": "user", "content": "[YOUR QUERY GOES HERE]"}, |
|
] |
|
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True) |
|
outputs = pipe(prompt, max_new_tokens=256, do_sample=False) |
|
print(outputs[0]["generated_text"]) |
|
``` |