Rijgersberg committed on
Commit d1ed926 • 1 Parent(s): c6ed711
Files changed (1)
  1. README.md +23 -14
README.md CHANGED
@@ -3,33 +3,42 @@ license: apache-2.0
  base_model: mistralai/Mistral-7B-v0.1
  tags:
  - generated_from_trainer
+ - GEITje
  datasets:
- - generator
+ - Rijgersberg/GEITje-pretrain-10b
  model-index:
  - name: GEITje-v1-7B
    results: []
+ language:
+ - nl
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # GEITje-v1-7B
+ # GEITje-7B

- This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the generator dataset.
- It achieves the following results on the evaluation set:
- - Loss: 1.3943
+ GEITje is a large open Dutch language model with 7 billion parameters, based on Mistral 7B.
+ It has been further trained on 10 billion tokens of Dutch text.
+ This has improved its Dutch language skills and increased its knowledge of Dutch topics.

  ## Model description

- More information needed
+ ### _Mistral_ – Base Model
+ GEITje is based on [Mistral 7B](https://mistral.ai/news/announcing-mistral-7b/).
+ It's a large open language model with 7 billion parameters,
+ trained by [Mistral AI](https://mistral.ai).
+ According to Mistral AI, the 7B model performs better than [Llama 2](https://ai.meta.com/llama/) 13B on all (English-language) benchmarks they tested it on.
+ Mistral 7B has been released under the Apache 2.0 open source license.

- ## Intended uses & limitations

- More information needed
+ ### _GEITje_ – Trained Further on Dutch Texts
+ GEITje was created by further training Mistral 7B on no less than 10 billion tokens of Dutch text from the [Dutch Gigacorpus](http://gigacorpus.nl) and the [MADLAD-400](https://huggingface.co/datasets/allenai/MADLAD-400) web crawling corpus.
+ It is a so-called _full-parameter finetune_:
+ one performed on all parameters.
+ It is not a [PEFT](https://huggingface.co/blog/peft) or [LoRA](https://huggingface.co/docs/peft/conceptual_guides/lora) finetune.
+ Like Mistral, GEITje has a _context length_ of 8,192 tokens.

- ## Training and evaluation data
-
- More information needed
+ ## More info
+ Read more about GEITje in the [📄 README](https://github.com/Rijgersberg/GEITje/blob/main/README-en.md) on GitHub.

  ## Training procedure
@@ -108,4 +117,4 @@ The following hyperparameters were used during training:
  - Transformers 4.36.0.dev0
  - Pytorch 2.1.1+cu121
  - Datasets 2.15.0
- - Tokenizers 0.15.0
+ - Tokenizers 0.15.0
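
Beyond the diff itself, the updated card describes a plain 🤗 Transformers checkpoint. A minimal usage sketch, assuming the checkpoint is published on the Hub as `Rijgersberg/GEITje-7B` (an inferred id, not stated in this commit):

```python
# Minimal sketch: load GEITje and complete a Dutch prompt.
# The model id "Rijgersberg/GEITje-7B" is an assumption inferred from this repo;
# adjust it to the actual checkpoint name. GEITje is a base model, not
# chat-tuned, so treat it as a text completer. The card states an 8,192-token
# context length.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Rijgersberg/GEITje-7B"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # Mistral-family weights are commonly run in bf16
    device_map="auto",
)

prompt = "Amsterdam is de hoofdstad van"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```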
 
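The YAML header now names the pretraining corpus `Rijgersberg/GEITje-pretrain-10b`. A sketch for peeking at it without downloading the full ~10B-token corpus, assuming a standard `train` split (the record schema is not stated in this commit):

```python
# Stream a few records from the pretraining dataset named in the YAML header.
# Streaming avoids materializing the whole corpus locally; split name and
# record schema are assumptions to verify against the dataset card.
from datasets import load_dataset

ds = load_dataset("Rijgersberg/GEITje-pretrain-10b", split="train", streaming=True)
for i, record in enumerate(ds):
    print(record)  # inspect the actual fields here
    if i >= 2:
        break
```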
 
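The card stresses that GEITje is a full-parameter finetune, not a PEFT/LoRA one. A sketch of the distinction it draws, using the `peft` library with illustrative hyperparameters (not GEITje's actual training setup, which lives in the linked GitHub repo):

```python
# Contrast the card's claim: a full-parameter finetune trains every weight of
# the base model, whereas LoRA freezes the base and trains small low-rank
# adapters. LoRA hyperparameters below are illustrative, not GEITje's.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

def trainable(m):
    return sum(p.numel() for p in m.parameters() if p.requires_grad)

# Full-parameter finetune (GEITje's approach): all ~7B parameters get gradients.
print(f"full-parameter finetune: {trainable(base):,} trainable params")

# LoRA finetune (what GEITje is *not*): freeze the base, train rank-r adapters.
lora_model = get_peft_model(
    base,
    LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
               task_type="CAUSAL_LM"),
)
lora_model.print_trainable_parameters()  # typically well under 1% of the base
```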