szymonrucinski committed
Commit: f15815f • Parent(s): 7b7a4e3

Update README.md

README.md CHANGED
@@ -7,7 +7,19 @@ tags:
 - polish
 - nlp
 ---
-
+<style>
+@import url('https://fonts.googleapis.com/css2?family=Pacifico&display=swap')
+.markdown-custom-font {
+  font-family: "Pacifico", cursive;
+  font-weight: 400;
+  font-style: normal;
+}
+</style>
+
+<div class="markdown-custom-font" align="center">
+  <img src="logo.png" alt="Logo" width="300">
+  Curie-7B-v1
+</div>
 
 ## Introduction
 This research demonstrates the potential of fine-tuning English Large Language Models (LLMs) for Polish text generation. By employing Language Adaptive Pre-training (LAPT) on a high-quality dataset of 3.11 GB (276 million Polish tokens) and subsequent fine-tuning on the [KLEJ challenges](https://klejbenchmark.com), the `Curie-7B-v1` model achieves remarkable performance. It not only generates Polish text with the lowest perplexity of 3.02 among decoder-based models but also rivals the best Polish encoder-decoder models closely, with a minimal performance gap on 8 out of 9 tasks. This was accomplished using about 2-3% of the dataset size typically required, showcasing the method's efficiency. The model is now open-source, contributing to the community's collaborative progress.
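
As a companion to the updated README, here is a minimal sketch of how the model could be loaded and its perplexity measured on a short Polish text with the Hugging Face Transformers library. The repo id `szymonrucinski/Curie-7B-v1` and the sample sentence are assumptions for illustration and are not part of this commit.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repo id; adjust if the model is hosted elsewhere.
MODEL_ID = "szymonrucinski/Curie-7B-v1"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the `accelerate` package
)

# A short Polish sample sentence (illustrative, not from the evaluation set).
text = "Kot siedzi na macie i patrzy przez okno."
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    # With labels equal to input_ids, the returned loss is the mean
    # cross-entropy over predicted tokens; exp(loss) is the perplexity.
    outputs = model(**inputs, labels=inputs["input_ids"])

print(f"Perplexity: {torch.exp(outputs.loss).item():.2f}")
```

Perplexity measured this way on a single sentence will vary with the text; the 3.02 figure cited in the README refers to the model's evaluation corpus, not to arbitrary inputs.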