Update README.md
README.md
CHANGED
@@ -30,6 +30,7 @@ Aligned with **DPO**
 2. [Model Details](#model-details)
    - [Prompt template](#prompt-template)
    - [Training Dataset](#training-dataset)
+   - [Data Contamination Test](#data-contamination-test-results)
 3. [Evaluation](#evaluation)
 5. [Disclaimer](#disclaimer)
 6. [Contact](#contact)
@@ -55,11 +56,29 @@ Aligned with **DPO**
 SauerkrautLM-Mixtral-8x7B-Instruct was trained with a mix of German data augmentation and translated data.
 Aligned through **DPO** with our **new German SauerkrautLM-DPO dataset**, which uses parts of the SFT SauerkrautLM dataset
-as chosen answers and answers generated by [Sauerkraut-7b-HerO](https://huggingface.co/VAGOsolutions/SauerkrautLM-7b-HerO) as rejected answers. We additionally included **translated parts of [HuggingFaceH4/ultrafeedback_binarized](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized)** and **[argilla/distilabel-math-preference-dpo](https://huggingface.co/datasets/argilla/distilabel-math-preference-dpo)**.
+as chosen answers and answers generated by [Sauerkraut-7b-HerO](https://huggingface.co/VAGOsolutions/SauerkrautLM-7b-HerO) as rejected answers. We additionally included **translated parts of [HuggingFaceH4/ultrafeedback_binarized](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized)** (our dataset does not contain any TruthfulQA prompts; see the Data Contamination Test Results below) and **[argilla/distilabel-math-preference-dpo](https://huggingface.co/datasets/argilla/distilabel-math-preference-dpo)**.
 We found that a simple translation of training data alone can lead to unnatural German phrasings.
 Data augmentation techniques were used to ensure grammatical and syntactical correctness and more natural German wording in our training data.
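As an illustration of the preference-pair construction described above, here is a minimal Python sketch: the curated SFT answer serves as the `chosen` response, and an answer sampled from the weaker model as `rejected`. The `sft_examples` structure and the `generate_with_hero` helper are hypothetical placeholders, not the authors' actual pipeline; only the prompt/chosen/rejected schema (as consumed by DPO trainers such as TRL's `DPOTrainer`) is standard.

```python
from typing import Callable

def build_dpo_pairs(
    sft_examples: list[dict],                 # each: {"prompt": str, "answer": str} (hypothetical schema)
    generate_with_hero: Callable[[str], str], # samples an answer from the rejected-side model
) -> list[dict]:
    """Pair each prompt's curated SFT answer (chosen) with a weaker model's answer (rejected)."""
    pairs = []
    for ex in sft_examples:
        pairs.append({
            "prompt": ex["prompt"],                        # shared instruction
            "chosen": ex["answer"],                        # curated SFT answer (preferred)
            "rejected": generate_with_hero(ex["prompt"]),  # weaker model's answer
        })
    return pairs

# Example with a dummy generator standing in for Sauerkraut-7b-HerO:
demo = build_dpo_pairs(
    [{"prompt": "Was ist DPO?", "answer": "DPO ist ein Alignment-Verfahren ..."}],
    generate_with_hero=lambda p: "(weaker model answer)",
)
print(demo[0]["chosen"], "|", demo[0]["rejected"])
```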
+### Data Contamination Test Results
+
+Some models on the HuggingFace leaderboard had problems with benchmark data leaking into their training data.
+We checked our SauerkrautLM-DPO dataset for this problem with a dedicated test [1], run on a smaller model.
+The HuggingFace team used the same methods [2, 3].
+
+Our results, with `result < 0.1, %:` being well below 0.9, indicate that our dataset is free from contamination.
+
+*The data contamination test results for HellaSwag and Winogrande will be added once [1] supports them.*
+
+| Dataset              | ARC                  | MMLU                  | TruthfulQA            | GSM8K                 |
+|----------------------|----------------------|-----------------------|-----------------------|-----------------------|
+| **SauerkrautLM-DPO** | result < 0.1, %: 0.0 | result < 0.1, %: 0.09 | result < 0.1, %: 0.13 | result < 0.1, %: 0.16 |
+
+[1] https://github.com/swj0419/detect-pretrain-code-contamination
+
+[2] https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/474#657f2245365456e362412a06
+
+[3] https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/265#657b6debf81f6b44b8966230
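The test in [1] builds on the Min-K% Prob membership-inference statistic: for text seen during training, even the least likely tokens tend to receive unusually high probability under the model. Below is a hedged sketch of that core statistic using Hugging Face `transformers`; the `gpt2` checkpoint is only a stand-in for illustration, and this is a simplified re-implementation, not the exact scoring script from [1].

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def min_k_percent_prob(text: str, model, tokenizer, k: float = 0.1) -> float:
    """Average log-probability of the k% least likely tokens in `text`.
    Low values suggest the text was NOT part of the model's training data."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # log-probability assigned to each actual next token
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_log_probs = log_probs.gather(1, ids[0, 1:].unsqueeze(-1)).squeeze(-1)
    # mean over the k% lowest-probability tokens
    n = max(1, int(len(token_log_probs) * k))
    lowest = torch.topk(token_log_probs, n, largest=False).values
    return lowest.mean().item()

model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
tokenizer = AutoTokenizer.from_pretrained("gpt2")
print(min_k_percent_prob("Water boils at 100 degrees Celsius.", model, tokenizer))
```

In [1], per-sample scores of this kind are aggregated over a benchmark and compared against a reference model, yielding summary figures like the `result < 0.1, %` values in the table above.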
 ### Prompt Template:
 ```