Nagi-ovo committed f36818b (verified) · 1 parent: 89477e7

Update README.md

Files changed (1): README.md (+1 −1)

README.md CHANGED
@@ -19,7 +19,7 @@ This model is a **preference-aligned** version of the [previous SFT model](https
 ## Training Details
 - Base Model: SFT-tuned Llama-3-8B
 - Alignment Method: DPO (Direct Preference Optimization)
-- Training Infrastructure: DeepSpeed + FlashAttention 2, on 4 x 3090
+- Training Infrastructure: DeepSpeed (stage 1) + FlashAttention 2, on 4 x 3090
 - Training Duration: 1 epoch
 
 ## Training Data
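The "stage 1" detail added in this commit refers to DeepSpeed's ZeRO stage 1, which shards only the optimizer states across the 4 GPUs. A minimal sketch of what such a config might look like; the field values here are illustrative assumptions, not taken from the actual training run:

```python
import json

# Hypothetical DeepSpeed config illustrating ZeRO stage 1
# (values are assumptions, not the run's real settings).
ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 1,  # shard optimizer states only; params/grads stay replicated
    },
}

# DeepSpeed consumes this as a JSON file passed to the launcher/Trainer.
print(json.dumps(ds_config, indent=2))
```

Stage 1 is a common choice for an 8B model on 24 GB cards when combined with memory savers such as FlashAttention 2, since it cuts optimizer-state memory without the communication overhead of stages 2–3.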