qwerrwe / configs
winglian
swap batch size for gradient accumulation steps to decouple from num gpu
c2a0792