femiari/Qwen2-1.5Moe

Text Generation

Mixture of Experts

Qwen/Qwen2-1.5B

Replete-AI/Replete-Coder-Qwen2-1.5b

femiari committed on Aug 5

Commit 66a2e59 • 1 Parent(s): 5b44560

Update README.md

Files changed (1)

README.md +2 -24

@@ -21,50 +21,28 @@ QwenMoEAriel is a Mixture of Experts (MoE) made with the following models using
 ## 🧩 Configuration
 base_model : Qwen/Qwen2-1.5B
 architecture: qwen
 experts:
   - source_model: Qwen/Qwen2-1.5B
     positive_prompts:
     - "chat"
     - "assistant"
     - "tell me"
     - "explain"
     - "I want"
   - source_model: Replete-AI/Replete-Coder-Qwen2-1.5b
     positive_prompts:
     - "code"
     - "python"
     - "javascript"
     - "programming"
     - "algorithm"
 shared_experts:
   - source_model: Qwen/Qwen2-1.5B
     positive_prompts: # required by Qwen MoE for "hidden" gate mode, otherwise not allowed
-      - "chat"
     # (optional, but recommended:)
-    residual_scale: 0.1 # downweight output from shared expert to prevent overcooking the model
 ## 💻 Usage

 ## 🧩 Configuration
 base_model : Qwen/Qwen2-1.5B
 architecture: qwen
 experts:
   - source_model: Qwen/Qwen2-1.5B
     positive_prompts:
     - "chat"
     - "assistant"
     - "tell me"
     - "explain"
     - "I want"
   - source_model: Replete-AI/Replete-Coder-Qwen2-1.5b
     positive_prompts:
     - "code"
     - "python"
     - "javascript"
     - "programming"
     - "algorithm"
 shared_experts:
   - source_model: Qwen/Qwen2-1.5B
     positive_prompts: # required by Qwen MoE for "hidden" gate mode, otherwise not allowed
+    - "chat"
     # (optional, but recommended:)
+     residual_scale: 0.1 # downweight output from shared expert to prevent overcooking the model
 ## 💻 Usage