Rijgersberg committed d1ed926 (parent: c6ed711): Add info
README.md
CHANGED
@@ -3,33 +3,42 @@ license: apache-2.0
 base_model: mistralai/Mistral-7B-v0.1
 tags:
 - generated_from_trainer
+- GEITje
 datasets:
-- generator
+- Rijgersberg/GEITje-pretrain-10b
 model-index:
 - name: GEITje-v1-7B
   results: []
+language:
+- nl
 ---

-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
+# GEITje-7B

-# GEITje-v1-7B
+GEITje is a Dutch large open language model with 7 billion parameters, based on Mistral 7B.
+It has been further trained on 10 billion tokens of Dutch text.
+This has improved its Dutch language skills and increased its knowledge of Dutch topics.

-This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the generator dataset.
-It achieves the following results on the evaluation set:
-- Loss: 1.3943

 ## Model description

-More information needed
+### _Mistral_ – Base Model
+GEITje is based on [Mistral 7B](https://mistral.ai/news/announcing-mistral-7b/).
+It's a large open language model with 7 billion parameters,
+trained by [Mistral AI](https://mistral.ai).
+According to Mistral AI, the 7B model performs better than [Llama 2](https://ai.meta.com/llama/) 13B on all (English-language) benchmarks they tested it on.
+Mistral 7B has been released under the Apache 2.0 open source license.

-## Intended uses & limitations

-More information needed
+### _GEITje_ – Trained Further on Dutch Texts
+GEITje was created by further training Mistral 7B on no less than 10 billion tokens of Dutch text from the [Dutch Gigacorpus](http://gigacorpus.nl) and the [MADLAD-400](https://huggingface.co/datasets/allenai/MADLAD-400) web crawling corpus.
+It is a so-called _full-parameter finetune_:
+performed on all parameters.
+It is not a [PEFT](https://huggingface.co/blog/peft) or [LoRA](https://huggingface.co/docs/peft/conceptual_guides/lora) finetune.
+Like Mistral, GEITje has a _context length_ of 8,192 tokens.

-## Training and evaluation data
-
-More information needed
+## More info
+Read more about GEITje in the [📄 README](https://github.com/Rijgersberg/GEITje/blob/main/README-en.md) on GitHub.

 ## Training procedure

@@ -108,4 +117,4 @@ The following hyperparameters were used during training:
 - Transformers 4.36.0.dev0
 - Pytorch 2.1.1+cu121
 - Datasets 2.15.0
-- Tokenizers 0.15.0
+- Tokenizers 0.15.0
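For readers of the updated card, here is a minimal usage sketch of the retrained model with Hugging Face Transformers. It is not part of the commit above, and it assumes the checkpoint is published on the Hub under the repo id `Rijgersberg/GEITje-7B` (adjust to the actual repository) and that, like its Mistral 7B base, it is a plain causal language model rather than an instruction-tuned chat model.

```python
# Minimal sketch: load GEITje-7B and sample a Dutch continuation.
# The repo id below is an assumption; replace it with the real model repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Rijgersberg/GEITje-7B"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~14 GB for the 7B weights in bf16
    device_map="auto",           # requires the `accelerate` package
)

# GEITje is a further-pretrained base model, so prompt it with Dutch text to
# continue. The prompt means "Amsterdam is the capital of".
prompt = "Amsterdam is de hoofdstad van"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the continued pretraining was on raw Dutch text, plain text-completion prompts like the one above exercise the model more naturally than chat-style instructions.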