Rijgersberg committed d1ed926 (parent: c6ed711): Add info
README.md
CHANGED
@@ -3,33 +3,42 @@ license: apache-2.0
 base_model: mistralai/Mistral-7B-v0.1
 tags:
 - generated_from_trainer
+- GEITje
 datasets:
-- generator
+- Rijgersberg/GEITje-pretrain-10b
 model-index:
 - name: GEITje-v1-7B
   results: []
+language:
+- nl
 ---

-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
+# GEITje-7B

-# GEITje-v1-7B
+GEITje is a Dutch large open language model with 7 billion parameters, based on Mistral 7B.
+It has been further trained on 10 billion tokens of Dutch text.
+This has improved its Dutch language skills and increased its knowledge of Dutch topics.

-This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the generator dataset.
-It achieves the following results on the evaluation set:
-- Loss: 1.3943

 ## Model description

-More information needed
+### _Mistral_ – Base Model
+GEITje is based on [Mistral 7B](https://mistral.ai/news/announcing-mistral-7b/).
+It's a large open language model with 7 billion parameters,
+trained by [Mistral AI](https://mistral.ai).
+According to Mistral AI, the 7B model performs better than [Llama 2](https://ai.meta.com/llama/) 13B on all (English-language) benchmarks they tested it on.
+Mistral 7B has been released under the Apache 2.0 open source license.

-## Intended uses & limitations

-More information needed
+### _GEITje_ – Trained Further on Dutch Texts
+GEITje was created by further training Mistral 7B on no less than 10 billion tokens of Dutch text from the [Dutch Gigacorpus](http://gigacorpus.nl) and the [MADLAD-400](https://huggingface.co/datasets/allenai/MADLAD-400) web crawling corpus.
+It is a so-called _full-parameter finetune_:
+performed on all parameters.
+It is not a [PEFT](https://huggingface.co/blog/peft) or [LoRA](https://huggingface.co/docs/peft/conceptual_guides/lora) finetune.
+Like Mistral, GEITje has a _context length_ of 8,192 tokens.

-## Training and evaluation data
-
-More information needed
+## More info
+Read more about GEITje in the [📄 README](https://github.com/Rijgersberg/GEITje/blob/main/README-en.md) on GitHub.

 ## Training procedure

@@ -108,4 +117,4 @@ The following hyperparameters were used during training:
 - Transformers 4.36.0.dev0
 - Pytorch 2.1.1+cu121
 - Datasets 2.15.0
-- Tokenizers 0.15.0
+- Tokenizers 0.15.0
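For readers of the updated card, here is a minimal usage sketch of the retrained model with Hugging Face Transformers. It is not part of the commit above, and it assumes the checkpoint is published on the Hub under the repo id `Rijgersberg/GEITje-7B` (adjust to the actual repository) and that, like its Mistral 7B base, it is a plain causal language model rather than an instruction-tuned chat model.

```python
# Minimal sketch: load GEITje-7B and sample a Dutch continuation.
# The repo id below is an assumption; replace it with the real model repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Rijgersberg/GEITje-7B"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~14 GB for the 7B weights in bf16
    device_map="auto",           # requires the `accelerate` package
)

# GEITje is a further-pretrained base model, so prompt it with Dutch text to
# continue. The prompt means "Amsterdam is the capital of".
prompt = "Amsterdam is de hoofdstad van"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the continued pretraining was on raw Dutch text, plain text-completion prompts like the one above exercise the model more naturally than chat-style instructions.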