Update README.md
README.md
CHANGED
@@ -16,10 +16,11 @@ tags:
 
 # Academic Trurl 2 -- Polish Llama 2
 
-The
+The Academic TRURL is a finetuned Llama 2, trained on over 1.7b tokens (855k conversational **Polish** and **English** samples) with a large context of 4096 tokens.
 TRURL was trained on a large number of Polish data.
+
 TRURL 2 is a collection of fine-tuned generative text models with 7 billion and 13 billion parameters.
-This is the repository for the 13B fine-tuned model, optimized for dialogue use cases.
+This is the repository for the Academic 13B fine-tuned model, optimized for dialogue use cases.
 This model was trained without MMLU dataset.
 
 
@@ -37,9 +38,9 @@ This model was trained without MMLU dataset.
 
 ||Training Data|Params|Content Length|Num. Samples|Num. Tokens|start LR|
 |---|---|---|---|---|---|---|
-|Trurl 2|*A new mix of private and publicly available online data without MMLU*|7B|4k|
+|Trurl 2|*A new mix of private and publicly available online data without MMLU*|7B|4k|855k|1.19b|2.0 x 10<sup>-5</sup>|
 |Trurl 2|*A new mix of private and publicly available online data with MMLU*|13B|4k|970k|1.7b|2.0 x 10<sup>-5</sup>|
-|Trurl 2 Academic|*A new mix of private and publicly available online data without MMLU*|13B|4k|
+|Trurl 2 Academic|*A new mix of private and publicly available online data without MMLU*|13B|4k|855k|1.19b|2.0 x 10<sup>-5</sup>|
 
 ## Training data
 
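For context, a minimal sketch of how the dialogue model described above could be loaded and queried with the Hugging Face `transformers` library. The repository id `Voicelab/trurl-2-13b-academic` and the Llama-2-style `[INST]` prompt wrapper are assumptions, not stated in this diff.

```python
# Minimal usage sketch (assumptions: repo id "Voicelab/trurl-2-13b-academic",
# Llama-2-style [INST] chat formatting; requires transformers, torch, accelerate).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Voicelab/trurl-2-13b-academic"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision to fit the 13B weights
    device_map="auto",          # place layers on available GPUs automatically
)

# Llama-2-style instruction wrapper; a Polish prompt, since the model targets Polish.
prompt = "[INST] Kim był Mikołaj Kopernik? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```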