Update README.md
README.md CHANGED
@@ -25,7 +25,7 @@ This Model is a test to combine [Jamba](https://huggingface.co/ai21labs/Jamba-v0
 The goal is to develop and test whether this kind of architecture can deliver fast inference without too much quality loss.
 
-Only 17.8M parameter over
+Only 17.8M parameters out of 1025M are in bf16 precision, which is ~1.7% of the total number of parameters.
 
 
 - **Model type:** Mixture of attention heads, mixture of depths, and mixture of experts with 1.58-bit linear layers, **except for the attention layers**
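As a quick sanity check on the ~1.7% figure in the added line (reading the "1025" of the original text as 1025M total parameters, which the stated percentage implies):

```python
# Sanity check of the bf16 fraction: 17.8M bf16 parameters out of an
# assumed 1025M total (the README's "1025", read here as millions).
bf16_params = 17.8e6
total_params = 1025e6
print(f"{bf16_params / total_params:.1%}")  # prints 1.7%
```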
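For context on the "1.58-bit linear layers" in the **Model type** line: 1.58 bits is log2(3), i.e. ternary weights in {-1, 0, +1}. Below is a minimal PyTorch sketch of such a layer in the style of BitNet b1.58; the class name `BitLinear158` and all implementation details are illustrative assumptions, not this repository's actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BitLinear158(nn.Module):
    """Illustrative 1.58-bit linear layer (BitNet b1.58 style).

    Weights are kept in full precision for training but quantized to the
    ternary set {-1, 0, +1} (log2(3) ~= 1.58 bits) on the forward pass.
    This is a sketch, not this repository's implementation.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Absmean scaling, then round-and-clip each weight to -1, 0, or +1.
        scale = self.weight.abs().mean().clamp(min=1e-5)
        w_ternary = (self.weight / scale).round().clamp(-1, 1) * scale
        # Straight-through estimator: quantized weights in the forward pass,
        # full-precision gradients in the backward pass.
        w = self.weight + (w_ternary - self.weight).detach()
        return F.linear(x, w)
```

In an architecture like this one, such layers would replace the expert/MLP projections while the attention projections stay in bf16, which is consistent with only ~1.7% of the parameters remaining in bf16.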