Coherent at 32K context. Obviously not as good as a native 32K-context model, but good enough. It has some of the usual memory issues in the middle of the context window, and some problems with long-context understanding and reasoning, but it does not break down into incoherence the way regular rope scaling does.
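For reference, "regular rope scaling" here means linearly compressing token positions back into the trained window before computing rotary angles. A minimal toy sketch of that idea; the 4x factor (8192 native → 32768) is an assumption for illustration and is not stated in this card:

```python
# Linear RoPE scaling (position interpolation), illustrated on the angle
# computation for a single token position. With scale=4.0, position 32768
# is mapped back to 8192, i.e. inside an assumed 8K trained window.
def rope_angles(position, dim=8, base=10000.0, scale=1.0):
    """Rotary embedding angles for one token position.

    `scale` > 1 applies linear position scaling: the position is divided
    by `scale` so out-of-window positions fall back inside the window.
    """
    pos = position / scale
    return [pos / base ** (2 * i / dim) for i in range(dim // 2)]

# Position 32768 with 4x linear scaling yields the same angles as
# unscaled position 8192 -- the price is a coarser position resolution,
# which is what tends to hurt coherence with naive scaling.
assert rope_angles(32768, scale=4.0) == rope_angles(8192)
```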
Notes:
<br>\- Training run is much less aggressive than previous Stheno versions.
<br>\- This model works when tested in bf16 with the same configs as those in this repo.
<br>\- I do not know the effects quantisation has on it.
<br>\- Roleplays pretty well. Feels nice in my opinion.
<br>\- Reminder: this isn't a native 32K model. It has its issues, but it's coherent and works well.
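The bf16 note above can be sketched as load settings for the Hugging Face `transformers` library; this is an assumption about how you would consume the repo, not the author's exact setup:

```python
# Minimal load sketch (assumption: a recent Hugging Face transformers).
# The card reports testing in bf16 with the configs shipped in this repo,
# so the repo's own config (including any rope settings) is left untouched.
load_kwargs = {
    "torch_dtype": "bfloat16",  # tested precision per the notes above
    "device_map": "auto",       # assumption: place weights automatically
}

# Usage (downloads the full weights, so not run here):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# model_id = "Sao10K/L3-8B-Stheno-v3.3-32K"
# tok = AutoTokenizer.from_pretrained(model_id)
# model = AutoModelForCausalLM.from_pretrained(model_id, **load_kwargs)
```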
Sanity Check // Needle in a Haystack Results:
![Results](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.3-32K/resolve/main/haystack.png)