This model was converted to GGUF format from [`DavidAU/L3-MOE-4X8B-Grand-Horror-25B`](https://huggingface.co/DavidAU/L3-MOE-4X8B-Grand-Horror-25B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.

Refer to the [original model card](https://huggingface.co/DavidAU/L3-MOE-4X8B-Grand-Horror-25B) for more details on the model.

---

It is a Llama3 model with a max context of 8192 (or 32k+ with rope), using a mixture of experts to combine four Dark/Horror models of 8B each into one massive powerhouse at 25B parameters (equal to 32B: 4 x 8B).

This model's instruction following and output generation for creative writing, prose, fiction, and role play are exceptional.

It excels at description, dialog, imagery, metaphors, and prose, and it shows great variation in sentence and paragraph size, length, and composition.

It is also not afraid and will not pull its punches.

And it has a sense of humor too.

It can do horror just as easily as it can do romance.

Most notably, the dialog is very un-AI-like, combined with prose that is short and terse at times.

(The original model card includes lots of different examples, covering 2, 3, and 4 experts and a range of genres.)

And it is fast: 34 t/s (2 experts) on a low-end 16GB card at Q3_K_S.

Expect double this speed on standard/mid-range video cards.

The model can also be used for all genres (the original model card's examples show this).

This model has been designed to be relatively bulletproof, and it operates with all parameters, including temperature settings from 0 to 5.
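
As a rough sketch of what that tolerance looks like in practice with llama.cpp (the model filename below is a placeholder for whichever quant you download; `--temp` is llama.cpp's standard temperature flag):

```bash
# Placeholder filename; a temperature of 3.0 would normally wreck coherence,
# but the card claims this model stays usable across temp 0-5.
llama-cli -m grand-horror-25b-q3_k_s.gguf --temp 3.0 \
  -p "Write the opening paragraph of a ghost story."
```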
It is an extraordinarily compressed model, with a very low perplexity level (lower than Meta Llama3 Instruct).

It is suited to any writing, fiction, or roleplay activity.

It requires the Llama3 and/or "Command-R" prompt template.
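
For reference, the standard Llama3 instruct template has this shape (`{system_prompt}` and `{prompt}` are placeholders for your own text):

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>


```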
---

## Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux):
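
```bash
brew install llama.cpp
```

You can then invoke the llama.cpp CLI or server. The repo id and quant filename below are placeholders; point them at this repo's actual GGUF file:

### CLI:
```bash
llama-cli --hf-repo your-username/L3-MOE-4X8B-Grand-Horror-25B-GGUF \
  --hf-file l3-moe-4x8b-grand-horror-25b-q3_k_s.gguf \
  -p "The hallway smelled of old rain and"
```

### Server:
```bash
llama-server --hf-repo your-username/L3-MOE-4X8B-Grand-Horror-25B-GGUF \
  --hf-file l3-moe-4x8b-grand-horror-25b-q3_k_s.gguf \
  -c 8192
```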