bump deepspeed to pick up the fix for grad norm computation putting tensors on different devices (#1699)
851ccb1
committed by winglian