peru
liweipe
AI & ML interests
None yet
Organizations
None yet
liweipe's activity
Does two-stage training use the same hyperparameters?
2
#3 opened about 1 month ago by bbruceyuan
Hope to try Qwen1.5-14B MoE
5
#29 opened 6 months ago by JiangMaster