Update README.md
README.md
CHANGED
@@ -25,6 +25,8 @@ This Model is a test to combine [Jamba](https://huggingface.co/ai21labs/Jamba-v0

The goal is to develop and test whether this kind of architecture can achieve fast inference without too much loss in quality.

+Only 17.8M out of ~1000M parameters are kept in bf16 precision.
+

- **Model type:** Mixture of attention heads, mixture of depth, and mixture of experts, with 1.58-bit linear layers **except for the attention layers**
- **License:** Apache License 2.0
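For context, a minimal sketch of what a 1.58-bit (ternary) linear layer could look like, following the BitNet b1.58-style recipe of weights constrained to {-1, 0, +1} with a per-tensor absmean scale. The class name, initialization, and details below are illustrative assumptions, not the exact layer used in this model:

```python
import torch
import torch.nn as nn


class TernaryLinear(nn.Module):
    """Sketch of a 1.58-bit linear layer: weights quantized to {-1, 0, +1}.

    Illustrative only; names and details are assumptions, not this model's code.
    """

    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # Latent full-precision weights kept for training.
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Per-tensor scale = mean absolute weight (absmean quantization).
        scale = self.weight.abs().mean().clamp(min=1e-5)
        # Round scaled weights to the nearest value in {-1, 0, +1}.
        w_ternary = torch.round(self.weight / scale).clamp(-1, 1)
        # Straight-through estimator: forward uses the ternary weights,
        # gradients flow to the latent full-precision weights.
        w_quant = self.weight + (w_ternary * scale - self.weight).detach()
        return nn.functional.linear(x, w_quant)
```

At inference time, each ternary weight can in principle be packed into log2(3) ≈ 1.58 bits, which is where the memory savings over bf16 come from; only the layers kept in bf16 (here, the attention layers) stay at full width.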