brucethemoose
committed on
Commit
·
d57afc9
1
Parent(s):
5059e19
Update README.md
README.md
CHANGED
@@ -45,6 +45,8 @@ dtype: bfloat16
 
 Tess 1.2 (at a low weight) and 1.3 were used because, according to the trainer, they were trained on different datasets: https://migel.substack.com/p/learnings-from-training-tess
 
+As the Tess creator warned about, if the model repeats at high context like Tess 1.2, let me know!
+
 I chose not to include other finetunes, such as Dolphin, because they aren't trained on the 200K base. If any other 200K finetunes pop up, let me know.
 
 ***