jayzou3773/qwen3_5-moe-neuron_structure_drop-p50-s1k-128samples-thinking-sft 19B • Updated 19 days ago • 23
jayzou3773/qwen3_5-moe-expert_drop-pure_gradient_pruning-r128-s1k-128samples-thinking-sft 19B • Updated 19 days ago • 14
jayzou3773/qwen3_5-moe-expert_drop-pure_expert_gradient_pruning-r128-s1k-128samples-thinking-sft 19B • Updated 19 days ago • 15
jayzou3773/qwen3_5-moe-expert_drop-layerwise_pruning-r128-s1k-128samples-thinking-sft 19B • Updated 19 days ago • 14
jayzou3773/qwen3_5-moe-expert_drop-bias_pruning-r128-s1k-128samples-thinking-sft 19B • Updated 19 days ago • 16
jayzou3773/qwen3-moe-neuron_structure_drop-p50-s1k-128samples-thinking-sft 211k • Updated 19 days ago • 15
jayzou3773/qwen3-moe-expert_drop-pure_gradient_pruning-r64-s1k-128samples-thinking-sft 211k • Updated 19 days ago • 14
jayzou3773/qwen3-moe-expert_drop-pure_expert_gradient_pruning-r64-s1k-128samples-thinking-sft 211k • Updated 19 days ago • 17
jayzou3773/qwen3-moe-expert_drop-layerwise_pruning-r64-s1k-128samples-thinking-sft 211k • Updated 19 days ago • 14
jayzou3773/qwen3-moe-expert_drop-bias_pruning-r64-s1k-128samples-thinking-sft 211k • Updated 19 days ago • 16
jayzou3773/qwen3_5-moe-neuron_structure_drop-p50-s1k-128samples-thinking 19B • Updated 24 days ago • 86
jayzou3773/qwen3_5-moe-expert_drop-weight_magnitude_pruning-r128-s1k-128samples-thinking 19B • Updated 24 days ago • 16
jayzou3773/qwen3_5-moe-expert_drop-pure_gradient_pruning-r128-s1k-128samples-thinking 19B • Updated 24 days ago • 70
jayzou3773/qwen3_5-moe-expert_drop-pure_expert_gradient_pruning-r128-s1k-128samples-thinking 19B • Updated 24 days ago • 71
jayzou3773/qwen3_5-moe-expert_drop-layerwise_pruning-r128-s1k-128samples-thinking 19B • Updated 24 days ago • 70
jayzou3773/qwen3_5-moe-expert_drop-bias_pruning-r128-s1k-128samples-thinking 19B • Updated 24 days ago • 931
jayzou3773/qwen3-moe-expert_drop-pure_gradient_pruning-r64-s1k-128samples-thinking 16B • Updated 26 days ago • 118
jayzou3773/qwen3-moe-expert_drop-pure_expert_gradient_pruning-r64-s1k-128samples-thinking 16B • Updated 26 days ago • 145
jayzou3773/qwen3-moe-expert_drop-layerwise_pruning-r64-s1k-128samples-thinking 16B • Updated 26 days ago • 163
jayzou3773/qwen3-moe-expert_drop-bias_pruning-r64-s1k-128samples-thinking 16B • Updated 26 days ago • 160
jayzou3773/qwen3-moe-neuron_structure_drop-p50-s1k-128samples-thinking 16B • Updated 26 days ago • 292
jayzou3773/qwen3_5-moe-expert_drop-weight_magnitude_pruning-r128-s1k-128samples 19B • Updated 29 days ago • 208