Proposal for a new README.md
README.md
CHANGED
@@ -15,203 +15,44 @@ inference: false
|
|
15 |
|
16 |
# Bielik-7B-Instruct-v0.1-GGUF
|
17 |
|
18 |
-
|
19 |
|
20 |
-
|
21 |
-
|
22 |
-
|
23 |
-
|
24 |
-
|
25 |
-
|
26 |
-
|
27 |
-
Bielik-7B-Instruct-v0.1 was trained using an original open-source framework called [ALLaMo](https://github.com/chrisociepa/allamo), implemented by [Krzysztof Ociepa](https://www.linkedin.com/in/krzysztof-ociepa-44886550/). This framework allows users to train language models with architectures similar to LLaMA and Mistral in a fast and efficient way.
|
28 |
-
|
29 |
-
This repo contains GGUF format model files for [Bielik-7B-Instruct-v0.1](https://huggingface.co/speakleash/Bielik-7B-v0.1). GGUF is a new format introduced by the llama.cpp team on August 21st 2023.
|
30 |
|
|
|
31 |
|
32 |
### Model description:
|
33 |
|
34 |
* **Developed by:** [SpeakLeash](https://speakleash.org/)
|
35 |
* **Language:** Polish
|
36 |
* **Model type:** causal decoder-only
|
|
|
37 |
* **Finetuned from:** [Bielik-7B-v0.1](https://huggingface.co/speakleash/Bielik-7B-v0.1)
|
38 |
* **License:** CC BY NC 4.0 (non-commercial use)
|
39 |
* **Model ref:** speakleash:e38140bea0d48f1218540800bbc67e89
|
40 |
|
41 |
-
|
42 |
-
|
43 |
-
* Framework: [ALLaMo](https://github.com/chrisociepa/allamo)
|
44 |
-
* Visualizations: [W&B](https://wandb.ai)
|
45 |
-
|
46 |
-
<p align="center">
|
47 |
-
<img src="https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1-GGUF/raw/main/sft_train_loss.png">
|
48 |
-
</p>
|
49 |
-
<p align="center">
|
50 |
-
<img src="https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1-GGUF/raw/main/sft_train_ppl.png">
|
51 |
-
</p>
|
52 |
-
<p align="center">
|
53 |
-
<img src="https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1-GGUF/raw/main/sft_train_lr.png">
|
54 |
-
</p>
|
55 |
-
|
56 |
-
### Training hyperparameters:
|
57 |
-
|
58 |
-
| **Hyperparameter** | **Value** |
|
59 |
-
|-----------------------------|------------------|
|
60 |
-
| Micro Batch Size | 1 |
|
61 |
-
| Batch Size | up to 4194304 |
|
62 |
-
| Learning Rate (cosine, adaptive) | 7e-6 -> 6e-7 |
|
63 |
-
| Warmup Iterations | 50 |
|
64 |
-
| All Iterations | 55440 |
|
65 |
-
| Optimizer | AdamW |
|
66 |
-
| β1, β2 | 0.9, 0.95 |
|
67 |
-
| Adam_eps | 1e−8 |
|
68 |
-
| Weight Decay | 0.05 |
|
69 |
-
| Grad Clip | 1.0 |
|
70 |
-
| Precision | bfloat16 (mixed) |
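The learning-rate row of the table can be illustrated with a short sketch. It assumes a linear warmup over the first 50 iterations followed by plain cosine decay from 7e-6 to 6e-7 across all 55440 iterations. The "adaptive" part of the schedule is not described here, so it is not modelled, and the function name `learning_rate` is hypothetical rather than part of ALLaMo.

```python
import math

# Values taken from the hyperparameter table above.
MAX_LR = 7e-6
MIN_LR = 6e-7
WARMUP_ITERS = 50
TOTAL_ITERS = 55440


def learning_rate(step: int) -> float:
    """Return the learning rate for a 0-based iteration index.

    Assumption: linear warmup, then cosine decay. The "adaptive"
    behaviour mentioned in the table is not modelled.
    """
    if step < WARMUP_ITERS:
        # Linear ramp that reaches MAX_LR on the last warmup iteration.
        return MAX_LR * (step + 1) / WARMUP_ITERS
    if step >= TOTAL_ITERS:
        return MIN_LR
    # Fraction of the decay phase completed, in [0, 1).
    progress = (step - WARMUP_ITERS) / (TOTAL_ITERS - WARMUP_ITERS)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + cosine * (MAX_LR - MIN_LR)


for step in (0, 50, 27745, 55440):
    print(step, learning_rate(step))
```

Under these assumptions the rate peaks at the end of warmup and decays smoothly to the minimum by the final iteration.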
|
71 |
-
|
72 |
-
|
73 |
-
### Instruction format
|
74 |
-
|
75 |
-
To leverage instruction fine-tuning, surround your prompt with the `[INST]` and `[/INST]` tokens. The very first instruction should start with the beginning-of-sentence token (`<s>`). The generated completion ends with the end-of-sentence token (`</s>`).
|
76 |
-
|
77 |
-
E.g.
|
78 |
-
```
|
79 |
-
prompt = "<s>[INST] Jakie mamy pory roku? [/INST]"
|
80 |
-
completion = "W Polsce mamy 4 pory roku: wiosna, lato, jesień i zima.</s>"
|
81 |
-
```
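As a rough illustration of the format above, the following sketch builds a prompt string from a list of chat messages. The helper name `build_prompt` is hypothetical, and the exact whitespace between turns in a multi-turn conversation is an assumption; the tokenizer's own chat template remains the authoritative source.

```python
from typing import Dict, List

BOS, EOS = "<s>", "</s>"


def build_prompt(messages: List[Dict[str, str]]) -> str:
    """Render alternating user/assistant messages in the [INST] format.

    Assumption: only the first instruction is preceded by BOS and every
    assistant reply is closed with EOS, as described in the text above.
    """
    prompt = BOS
    for i, message in enumerate(messages):
        expected = "user" if i % 2 == 0 else "assistant"
        if message["role"] != expected:
            raise ValueError(f"message {i} must have role {expected!r}")
        if message["role"] == "user":
            prompt += f"[INST] {message['content']} [/INST]"
        else:
            prompt += f" {message['content']}{EOS}"
    return prompt


single_turn = build_prompt([{"role": "user", "content": "Jakie mamy pory roku?"}])
print(single_turn)
```

For a single user message this reproduces the prompt shown in the example above.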
|
82 |
-
|
83 |
-
This format is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating) via the `apply_chat_template()` method:
|
84 |
-
|
85 |
-
```python
|
86 |
-
import torch  # needed for torch.bfloat16 below
from transformers import AutoModelForCausalLM, AutoTokenizer
|
87 |
-
|
88 |
-
device = "cuda" # the device to load the model onto
|
89 |
-
|
90 |
-
model_name = "speakleash/Bielik-7B-Instruct-v0.1"
|
91 |
-
|
92 |
-
tokenizer = AutoTokenizer.from_pretrained(model_name)
|
93 |
-
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)
|
94 |
-
|
95 |
-
messages = [
|
96 |
-
{"role": "user", "content": "Jakie mamy pory roku w Polsce?"},
|
97 |
-
{"role": "assistant", "content": "W Polsce mamy 4 pory roku: wiosna, lato, jesień i zima."},
|
98 |
-
{"role": "user", "content": "Która jest najcieplejsza?"}
|
99 |
-
]
|
100 |
-
|
101 |
-
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")
|
102 |
-
|
103 |
-
model_inputs = input_ids.to(device)
|
104 |
-
model.to(device)
|
105 |
-
|
106 |
-
generated_ids = model.generate(model_inputs, max_new_tokens=1000, do_sample=True)
|
107 |
-
decoded = tokenizer.batch_decode(generated_ids)
|
108 |
-
print(decoded[0])
|
109 |
-
```
|
110 |
-
|
111 |
-
## Evaluation
|
112 |
-
|
113 |
-
|
114 |
-
Models were evaluated on the [Open PL LLM Leaderboard](https://huggingface.co/spaces/speakleash/open_pl_llm_leaderboard) in a 5-shot setting. The benchmark evaluates models on NLP tasks such as sentiment analysis, categorization, and text classification, but does not test chatting skills. The following metrics are reported:
|
115 |
-
- Average - average score among all tasks normalized by baseline scores
|
116 |
-
- Reranking - reranking task, commonly used in RAG
|
117 |
-
- Reader (Generator) - open book question answering task, commonly used in RAG
|
118 |
-
- Perplexity (lower is better) - as a bonus, does not correlate with other scores and should not be used for model comparison
|
119 |
-
|
120 |
-
|
121 |
-
|
122 |
-
| | Average | RAG Reranking | RAG Reader | Perplexity |
|
123 |
-
|--------------------------------------------------------------------------------------|----------:|--------------:|-----------:|-----------:|
|
124 |
-
| **7B parameters models:** | | | | |
|
125 |
-
| Baseline (majority class) | 0.00 | 53.36 | - | - |
|
126 |
-
| Voicelab/trurl-2-7b | 18.85 | 60.67 | 77.19 | 1098.88 |
|
127 |
-
| meta-llama/Llama-2-7b-chat-hf | 21.04 | 54.65 | 72.93 | 4018.74 |
|
128 |
-
| mistralai/Mistral-7B-Instruct-v0.1 | 26.42 | 56.35 | 73.68 | 6909.94 |
|
129 |
-
| szymonrucinski/Curie-7B-v1 | 26.72 | 55.58 | 85.19 | 389.17 |
|
130 |
-
| HuggingFaceH4/zephyr-7b-beta | 33.15 | 71.65 | 71.27 | 3613.14 |
|
131 |
-
| HuggingFaceH4/zephyr-7b-alpha | 33.97 | 71.47 | 73.35 | 4464.45 |
|
132 |
-
| internlm/internlm2-chat-7b-sft | 36.97 | 73.22 | 69.96 | 4269.63 |
|
133 |
-
| internlm/internlm2-chat-7b | 37.64 | 72.29 | 71.17 | 3892.50 |
|
134 |
-
| [Bielik-7B-Instruct-v0.1](https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1) | 39.28 | 61.89 | **86.00** | 277.92 |
|
135 |
-
| mistralai/Mistral-7B-Instruct-v0.2 | 40.29 | 72.58 | 79.39 | 2088.08 |
|
136 |
-
| teknium/OpenHermes-2.5-Mistral-7B | 42.64 | 70.63 | 80.25 | 1463.00 |
|
137 |
-
| openchat/openchat-3.5-1210 | 44.17 | 71.76 | 82.15 | 1923.83 |
|
138 |
-
| speakleash/mistral_7B-v2/spkl-all_sft_v2/e1_base/spkl-all_2e6-e1_70c70cc6 | 45.44 | 71.27 | 91.50 | 279.24 |
|
139 |
-
| Nexusflow/Starling-LM-7B-beta | 45.69 | 74.58 | 81.22 | 1161.54 |
|
140 |
-
| openchat/openchat-3.5-0106 | 47.32 | 74.71 | 83.60 | 1106.56 |
|
141 |
-
| berkeley-nest/Starling-LM-7B-alpha | **47.46** | **75.73** | 82.86 | 1438.04 |
|
142 |
-
| | | | | |
|
143 |
-
| **Models with different sizes:** | | | | |
|
144 |
-
| Azurro/APT3-1B-Instruct-v1 (1B) | -13.80 | 52.11 | 12.23 | 739.09 |
|
145 |
-
| Voicelab/trurl-2-13b-academic (13B) | 29.45 | 68.19 | 79.88 | 733.91 |
|
146 |
-
| upstage/SOLAR-10.7B-Instruct-v1.0 (10.7B) | 46.07 | 76.93 | 82.86 | 789.58 |
|
147 |
-
| | | | | |
|
148 |
-
| **7B parameters pretrained and continuously pretrained models:**                     |           |               |            |            |
|
149 |
-
| OPI-PG/Qra-7b | 11.13 | 54.40 | 75.25 | 203.36 |
|
150 |
-
| meta-llama/Llama-2-7b-hf | 12.73 | 54.02 | 77.92 | 850.45 |
|
151 |
-
| internlm/internlm2-base-7b | 20.68 | 52.39 | 69.85 | 3110.92 |
|
152 |
-
| [Bielik-7B-v0.1](https://huggingface.co/speakleash/Bielik-7B-v0.1) | 29.38 | 62.13 | **88.39** | 123.31 |
|
153 |
-
| mistralai/Mistral-7B-v0.1 | 30.67 | 60.35 | 85.39 | 857.32 |
|
154 |
-
| internlm/internlm2-7b | 33.03 | 69.39 | 73.63 | 5498.23 |
|
155 |
-
| alpindale/Mistral-7B-v0.2-hf | 33.05 | 60.23 | 85.21 | 932.60 |
|
156 |
-
| speakleash/mistral-apt3-7B/spi-e0_hf | 35.50 | 62.14 | **87.48** | 132.78 |
|
157 |
-
|
158 |
-
SpeakLeash models achieve some of the best scores in the RAG Reader task.
|
159 |
-
We have managed to increase the Average score by almost 9 percentage points compared to Mistral-7B-v0.1.
|
160 |
-
In our subjective evaluations of chatting skills SpeakLeash models perform better than other models with higher Average scores.
|
161 |
-
|
162 |
-
|
163 |
-
|
164 |
-
## Limitations and Biases
|
165 |
-
|
166 |
-
Bielik-7B-Instruct-v0.1 is a quick demonstration that the base model can be easily fine-tuned to achieve compelling and promising performance. It does not have any moderation mechanisms. We're looking forward to engaging with the community in ways to make the model respect guardrails, allowing for deployment in environments requiring moderated outputs.
|
167 |
-
|
168 |
-
Bielik-7B-Instruct-v0.1 can produce factually incorrect output and should not be relied on to produce factually accurate data. Bielik-7B-Instruct-v0.1 was trained on various public datasets. While great efforts have been made to clean the training data, it is possible that this model can generate lewd, false, biased or otherwise offensive outputs.
|
169 |
-
|
170 |
-
## License
|
171 |
-
|
172 |
-
Because of an unclear legal situation, we have decided to publish the model under the CC BY NC 4.0 license, which allows non-commercial use. The model can be used for scientific purposes and privately, as long as the license conditions are met.
|
173 |
-
|
174 |
-
## Citation
|
175 |
-
Please cite this model using the following format:
|
176 |
-
|
177 |
-
```
|
178 |
-
@misc{Bielik7Bv01,
|
179 |
-
title = {Introducing Bielik-7B-Instruct-v0.1: Instruct Polish Language Model},
|
180 |
-
author = {Ociepa, Krzysztof and Flis, Łukasz and Wróbel, Krzysztof and Kondracki, Sebastian and {SpeakLeash Team} and {Cyfronet Team}},
|
181 |
-
year = {2024},
|
182 |
-
url = {https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1},
|
183 |
-
note = {Accessed: 2024-04-01}, % change this date
|
184 |
-
urldate = {2024-04-01} % change this date
|
185 |
-
}
|
186 |
-
```
|
187 |
-
|
188 |
-
## Responsible for training the model
|
189 |
|
190 |
-
|
191 |
-
* [Łukasz Flis](https://www.linkedin.com/in/lukasz-flis-0a39631/)<sup>Cyfronet AGH</sup> - coordinating and supervising the training
|
192 |
-
* [Krzysztof Wróbel](https://www.linkedin.com/in/wrobelkrzysztof/)<sup>SpeakLeash</sup> - benchmarks
|
193 |
-
* [Sebastian Kondracki](https://www.linkedin.com/in/sebastian-kondracki/)<sup>SpeakLeash</sup> - coordinating and preparation of instructions
|
194 |
-
* [Maria Filipkowska](https://www.linkedin.com/in/maria-filipkowska/)<sup>SpeakLeash</sup> - preparation of instructions
|
195 |
-
* [Paweł Kiszczak](https://www.linkedin.com/in/paveu-kiszczak/)<sup>SpeakLeash</sup> - preparation of instructions
|
196 |
-
* [Adrian Gwoździej](https://www.linkedin.com/in/adrgwo/)<sup>SpeakLeash</sup> - data quality and instructions cleaning
|
197 |
-
* [Igor Ciuciura](https://www.linkedin.com/in/igor-ciuciura-1763b52a6/)<sup>SpeakLeash</sup> - instructions cleaning
|
198 |
-
* [Jacek Chwiła](https://www.linkedin.com/in/jacek-chwila/)<sup>SpeakLeash</sup> - instructions cleaning
|
199 |
|
200 |
-
|
201 |
-
[
|
202 |
-
[
|
203 |
-
[
|
204 |
-
[
|
205 |
-
[
|
206 |
-
[
|
207 |
-
[
|
208 |
-
[
|
209 |
-
[
|
210 |
-
[
|
211 |
-
and many other wonderful researchers and enthusiasts of the AI world.
|
212 |
|
213 |
-
Members of the ACK Cyfronet AGH team:
|
214 |
-
[Szymon Mazurek](https://www.linkedin.com/in/sz-mazurek-ai/).
|
215 |
|
216 |
## Contact Us
|
217 |
|
|
|
15 |
|
16 |
# Bielik-7B-Instruct-v0.1-GGUF
|
17 |
|
18 |
+
This repo contains GGUF format model files for [SpeakLeash](https://speakleash.org/)'s [Bielik-7B-Instruct-v0.1](https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1).
|
19 |
|
20 |
+
<style>
|
21 |
+
du
|
22 |
+
{
|
23 |
+
text-decoration-line: underline;
|
24 |
+
text-decoration-style: double;
|
25 |
+
}
|
26 |
+
</style>
|
27 |
|
28 |
+
<b><du>Remember that quantised models may show reduced response quality and a higher risk of hallucinations.</du></b><br>
|
29 |
|
30 |
### Model description:
|
31 |
|
32 |
* **Developed by:** [SpeakLeash](https://speakleash.org/)
|
33 |
* **Language:** Polish
|
34 |
* **Model type:** causal decoder-only
|
35 |
+
* **Quant from:** [Bielik-7B-Instruct-v0.1](https://huggingface.co/speakleash/Bielik-7B-Instruct-v0.1)
|
36 |
* **Finetuned from:** [Bielik-7B-v0.1](https://huggingface.co/speakleash/Bielik-7B-v0.1)
|
37 |
* **License:** CC BY NC 4.0 (non-commercial use)
|
38 |
* **Model ref:** speakleash:e38140bea0d48f1218540800bbc67e89
|
39 |
|
40 |
+
### About GGUF
|
41 |
|
42 |
+
GGUF is a new format introduced by the llama.cpp team on August 21st 2023.
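To give a feel for what a GGUF file looks like on disk, the sketch below reads only the fixed-size header: the 4-byte magic `GGUF`, a little-endian 32-bit version, and two 64-bit counts (tensors and metadata key-value pairs). It assumes GGUF version 2 or later, and the function name `read_gguf_header` is hypothetical. The demo writes a synthetic header with arbitrary counts instead of using a real model file.

```python
import struct
import tempfile
from pathlib import Path

GGUF_MAGIC = b"GGUF"


def read_gguf_header(path) -> dict:
    """Read the fixed-size GGUF header (assumes format version >= 2)."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != GGUF_MAGIC:
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        (version,) = struct.unpack("<I", f.read(4))
        if version < 2:
            # Version 1 stored the counts as 32-bit integers.
            raise ValueError(f"unsupported GGUF version: {version}")
        tensor_count, kv_count = struct.unpack("<QQ", f.read(16))
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


# Demo on a synthetic header; the counts (7 and 3) are arbitrary.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.gguf"
    demo_path.write_bytes(GGUF_MAGIC + struct.pack("<IQQ", 3, 7, 3))
    header = read_gguf_header(demo_path)

print(header)
```

Pointing `read_gguf_header` at one of the `.gguf` files from this repo should report the version and counts stored in that file; the metadata and tensor data that follow the header are not parsed here.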
|
43 |
|
44 |
+
Here is an incomplete list of clients and libraries that are known to support GGUF:
|
45 |
+
* [llama.cpp](https://github.com/ggerganov/llama.cpp). The source project for GGUF. Offers a CLI and a server option.
|
46 |
+
* [text-generation-webui](https://github.com/oobabooga/text-generation-webui), the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
|
47 |
+
* [KoboldCpp](https://github.com/LostRuins/koboldcpp), a fully featured web UI, with GPU acceleration across all platforms and GPU architectures. Especially good for storytelling.
|
48 |
+
* [GPT4All](https://gpt4all.io/index.html), a free and open-source, locally running GUI, supporting Windows, Linux and macOS with full GPU acceleration.
|
49 |
+
* [LM Studio](https://lmstudio.ai/), an easy-to-use and powerful local GUI for Windows and macOS (Apple Silicon), with GPU acceleration. A Linux version is available, in beta as of 27/11/2023.
|
50 |
+
* [LoLLMS Web UI](https://github.com/ParisNeo/lollms-webui), a great web UI with many interesting and unique features, including a full model library for easy model selection.
|
51 |
+
* [Faraday.dev](https://faraday.dev/), an attractive and easy to use character-based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration.
|
52 |
+
* [llama-cpp-python](https://github.com/abetlen/llama-cpp-python), a Python library with GPU accel, LangChain support, and OpenAI-compatible API server.
|
53 |
+
* [candle](https://github.com/huggingface/candle), a Rust ML framework with a focus on performance, including GPU support, and ease of use.
|
54 |
+
* [ctransformers](https://github.com/marella/ctransformers), a Python library with GPU acceleration, LangChain support, and an OpenAI-compatible API server. Note that, as of the time of writing (November 27th 2023), ctransformers has not been updated in a long time and does not support many recent models.
|
55 |
|
56 |
|
57 |
## Contact Us
|
58 |
|