Nagi-ovo committed on
Commit 89477e7 · verified · 1 Parent(s): f74eda9

Update README.md

Files changed (1): README.md (+4 -1)
README.md CHANGED
@@ -37,12 +37,15 @@ The training process was monitored using `wandb`:
 
 ## Evaluation
 
-**Toxicity Assessment** was conducted using the **Hugging Face Evaluate** library to compare the SFT and DPO models. The results demonstrate that DPO training effectively reduced the model's toxicity levels while maintaining its general capabilities.
+**Toxicity Assessment** was conducted using the **Hugging Face Evaluate** library to compare the SFT and DPO models, leveraging vLLM for efficient batch inference.
+
 The **toxicity score decreased by approximately 92%** (from 0.1011 to 0.0081) after DPO training.
 
 ![Toxicity Comparison](https://cdn-uploads.huggingface.co/production/uploads/64b36c0a26893eb6a6e63da3/Np2H_Z7xyOzpx2aU6e5rF.png)
 *Figure: Toxicity scores comparison between SFT and DPO models*
 
+The results demonstrate that DPO training effectively reduced the model's toxicity levels while maintaining its general capabilities.
+
 ## Generation Like
 
 ```python
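The SFT-vs-DPO comparison described in this hunk can be sketched as a small script. This is a minimal sketch only: the actual evaluation loads the `toxicity` metric from the Hugging Face Evaluate library and generates completions with vLLM, but here a stub scorer stands in so the snippet is self-contained, and the completion texts and per-completion scores are hypothetical (only the 0.1011 and 0.0081 aggregate figures come from the README).

```python
# Sketch of the toxicity comparison between the SFT and DPO models.
# In the real pipeline the scorer would be
#     evaluate.load("toxicity").compute(predictions=..., aggregation="ratio")
# and the completions would come from vLLM batch inference; the stub
# below stands in for both (illustrative values, not measured here).

def toxicity_scores(completions):
    """Stub for the Evaluate toxicity metric: one score per completion."""
    fake_scores = {
        "sft completion": 0.1011,  # aggregate SFT score reported in the README
        "dpo completion": 0.0081,  # aggregate DPO score reported in the README
    }
    return [fake_scores[c] for c in completions]

def mean(xs):
    return sum(xs) / len(xs)

sft_mean = mean(toxicity_scores(["sft completion"]))
dpo_mean = mean(toxicity_scores(["dpo completion"]))

# Relative reduction implied by the two reported scores.
reduction = (sft_mean - dpo_mean) / sft_mean
print(f"toxicity reduced by {reduction:.0%}")  # ~92%, matching the README claim
```

The arithmetic confirms the "approximately 92%" figure: (0.1011 − 0.0081) / 0.1011 ≈ 0.92.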