Coherent at 32K context. Obviously not as good as a native 32K-context model, but good enough. It has some of the usual memory issues in the middle of the context window, and some problems with long-context understanding and reasoning, but it does not break down into incoherence the way regular rope scaling does.
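For reference, "regular rope scaling" here means linearly compressing token positions back into the trained window before computing rotary angles. A minimal toy sketch of that idea; the 4x factor (8192 native → 32768) is an assumption for illustration and is not stated in this card:

```python
# Linear RoPE scaling (position interpolation), illustrated on the angle
# computation for a single token position. With scale=4.0, position 32768
# is mapped back to 8192, i.e. inside an assumed 8K trained window.
def rope_angles(position, dim=8, base=10000.0, scale=1.0):
    """Rotary embedding angles for one token position.

    `scale` > 1 applies linear position scaling: the position is divided
    by `scale` so out-of-window positions fall back inside the window.
    """
    pos = position / scale
    return [pos / base ** (2 * i / dim) for i in range(dim // 2)]

# Position 32768 with 4x linear scaling yields the same angles as
# unscaled position 8192 -- the price is a coarser position resolution,
# which is what tends to hurt coherence with naive scaling.
assert rope_angles(32768, scale=4.0) == rope_angles(8192)
```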
Notes:
<br>\- Training run is much less aggressive than previous Stheno versions.
<br>\- This model works when tested in bf16 with the same configs as those in this repo.
<br>\- I do not know the effects quantisation has on it.
<br>\- Roleplays pretty well. Feels nice in my opinion.
<br>\- Reminder: this isn't a native 32K model. It has its issues, but it's coherent and works well.
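The bf16 note above can be sketched as load settings for the Hugging Face `transformers` library; this is an assumption about how you would consume the repo, not the author's exact setup:

```python
# Minimal load sketch (assumption: a recent Hugging Face transformers).
# The card reports testing in bf16 with the configs shipped in this repo,
# so the repo's own config (including any rope settings) is left untouched.
load_kwargs = {
    "torch_dtype": "bfloat16",  # tested precision per the notes above
    "device_map": "auto",       # assumption: place weights automatically
}

# Usage (downloads the full weights, so not run here):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# model_id = "Sao10K/L3-8B-Stheno-v3.3-32K"
# tok = AutoTokenizer.from_pretrained(model_id)
# model = AutoModelForCausalLM.from_pretrained(model_id, **load_kwargs)
```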
Sanity Check // Needle in a Haystack Results:
![Results](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.3-32K/resolve/main/haystack.png)