femiari/Qwen2-1.5Moe

Text Generation

Mixture of Experts

Qwen/Qwen2-1.5B

Replete-AI/Replete-Coder-Qwen2-1.5b

Qwen2-1.5Moe / mergekit_moe_config.yml

Upload folder using huggingface_hub

0823729 verified 4 months ago

640 Bytes


	base_model: Qwen/Qwen2-1.5B
	architecture: qwen
	experts:
	  - source_model: Qwen/Qwen2-1.5B
	    positive_prompts:
	      - "chat"
	      - "assistant"
	      - "tell me"
	      - "explain"
	      - "I want"
	  - source_model: Replete-AI/Replete-Coder-Qwen2-1.5b
	    positive_prompts:
	      - "code"
	      - "python"
	      - "javascript"
	      - "programming"
	      - "algorithm"
	shared_experts:
	  - source_model: Qwen/Qwen2-1.5B
	    positive_prompts: # required by Qwen MoE for "hidden" gate mode, otherwise not allowed
	      - "chat"
	    # (optional, but recommended:)
	    residual_scale: 0.1 # downweight output from shared expert to prevent overcooking the model
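
The comment on the shared expert refers to the "hidden" gate mode, but the file never sets a gate mode, so it presumably relies on the tool's default. The fragment below is a minimal sketch of the same top-level keys with that choice spelled out. The `gate_mode` and `dtype` lines are assumptions added for illustration and do not appear in the original file; the values shown are ones the mergekit MoE documentation lists, but whether this merge used them is not stated here.

```yaml
# Sketch only: top-level keys with the optional settings made explicit.
base_model: Qwen/Qwen2-1.5B
architecture: qwen
gate_mode: hidden   # assumption: not in the original file; other documented values are "cheap_embed" and "random"
dtype: bfloat16     # assumption: output dtype, not in the original file
# The experts: and shared_experts: sections would follow unchanged from the config above.
```

Writing `gate_mode` out makes it clear why the shared expert carries `positive_prompts`: per the original comment, they are required for the "hidden" mode and not allowed otherwise.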