yulan-team/YuLan-Mini-Before-Annealing
Safetensors
optimizer_states
arxiv: 2412.17743
License: mit
cb90a53
YuLan-Mini-Before-Annealing/global_step262772_universal/zero/model.layers.1.self_attn.o_proj_alpha/fp32.pt
Commit History
Upload deepspeed checkpoint
85d4dac, IvanHU committed 15 days ago