Update README.md
README.md (changed)
@@ -14,7 +14,7 @@ have fun =)
 [EDIT 2] - Usage Notes - model is sorta picky with the batch size and prompt preset/template (maybe because of the merge of ChatML and OpenChat models).

 My current recommended settings & findings:
-- Using LM Studio - use the default preset. GPU acceleration to max. prompt eval size to 1024, context length to 32768. this yields me
+- Using LM Studio - use the default preset. GPU acceleration to max, prompt eval size to 1024, context length to 32768. This yields me decent, coherent results. ChatML works too but occasionally spits out odd text after a couple of turns.
 - Using Oobabooga (Windows PC) - runs well using load-in-4bit along with use_flash_attention_2. Default presets and everything works just fine.
 - Using Oobabooga (Mac) - [investigating]
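If you want to reproduce the LM Studio settings above outside the GUI, they map roughly onto llama.cpp-style parameters. The snippet below is only a sketch of that mapping using llama-cpp-python; the GGUF filename is a placeholder, and the assumption is that n_gpu_layers, n_batch and n_ctx correspond to GPU acceleration, prompt eval size and context length respectively.

```python
# Rough llama-cpp-python equivalent of the LM Studio settings above.
# The GGUF path is a placeholder - point it at your local quant.
from llama_cpp import Llama

llm = Llama(
    model_path="./model.Q5_K_M.gguf",  # placeholder filename
    n_gpu_layers=-1,                   # "GPU acceleration to max": offload all layers
    n_batch=1024,                      # prompt eval (batch) size
    n_ctx=32768,                       # context length
)

reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hi in one sentence."}],
    max_tokens=128,
)
print(reply["choices"][0]["message"]["content"])
```

Note that 32k context with full GPU offload needs a fair amount of VRAM, so drop n_ctx first if you run out of memory.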
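Similarly, the Oobabooga (Windows) setup - 4-bit loading plus FlashAttention-2 - can be approximated directly with transformers and bitsandbytes. This is just a sketch under those assumptions: the repo id is a placeholder, and it assumes an NVIDIA GPU with bitsandbytes and flash-attn installed.

```python
# Rough transformers equivalent of the Oobabooga settings: 4-bit + FlashAttention-2.
# The repo id is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

repo = "your-org/your-merged-model"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),  # load-in-4bit
    attn_implementation="flash_attention_2",                    # use_flash_attention_2
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = tokenizer("Hello!", return_tensors="pt").to(model.device)
out = model.generate(**prompt, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```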