more dpo fixes for dataset loading and docs (#1185) [skip ci] 5bce45f unverified winglian committed on Jan 24