Update README.md
README.md CHANGED
@@ -11,7 +11,12 @@ have fun =)
[EDIT] - Preset-wise, it seems to like the "ChatML" format.

+[EDIT 2] - Usage Notes - the model is sorta picky about the batch size and prompt preset/template (maybe because it is a merge of ChatML and OpenChat models).
+My current recommended settings & findings:
+- Using LM Studio - use the default preset, GPU acceleration to max, prompt eval batch size 1024, and context length 32768. This yields good, coherent results.
+- Using Oobabooga (Windows PC) - runs well using load-in-4bit along with use_flash_attention_2; the default presets work just fine.
+- Using Oobabooga (Mac) - [investigating]

---
license: apache-2.0
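As a minimal sketch of what the Oobabooga note above amounts to in plain Transformers code: a 4-bit bitsandbytes load with FlashAttention-2, prompted in the ChatML format the first edit recommends. The repo id and prompt below are placeholders, not taken from the model card.

```python
# Sketch only: mirrors the "load-in-4bit" + "use_flash_attention_2" settings
# from the note above and the ChatML preset recommendation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "your-username/your-merged-model"  # placeholder, not the real repo id

bnb_config = BitsAndBytesConfig(load_in_4bit=True)  # "load-in-4bit"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",  # "use_flash_attention_2"
    torch_dtype=torch.float16,
    device_map="auto",
)

# ChatML-style prompt, matching the recommended preset.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nWrite a haiku about merging models.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```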
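The LM Studio numbers (context 32768, prompt eval batch 1024, full GPU offload) can be approximated with llama-cpp-python as a stand-in for LM Studio's llama.cpp backend; the GGUF path is a placeholder and assumes a GGUF conversion of the model exists.

```python
# Rough equivalent of the LM Studio settings above, via llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="./your-merged-model.Q4_K_M.gguf",  # placeholder path
    n_ctx=32768,      # context length from the note
    n_batch=1024,     # prompt eval batch size from the note
    n_gpu_layers=-1,  # "GPU acceleration to max": offload all layers
)

prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nSummarize what ChatML is in one sentence.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

out = llm(prompt, max_tokens=128, stop=["<|im_end|>"])
print(out["choices"][0]["text"])
```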