xzuyn committed
Commit 987b525 · verified · 1 Parent(s): 83a6329

Create README.md

Files changed (1)
  1. README.md +14 -0
README.md ADDED
@@ -0,0 +1,14 @@
+ ---
+ datasets:
+ - >-
+   PJMixers/NobodyExistsOnTheInternet_ToxicQAFinal-L3-Instruct-8B-PreferenceShareGPT
+ - NobodyExistsOnTheInternet/ToxicQAFinal
+ tags:
+ - not-for-all-audiences
+ ---
+ Trained on [NobodyExistsOnTheInternet/ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal). I converted the set into a preference dataset, using refusals generated by LLaMa-3-Instruct-8B as the rejected responses. The rejections have not yet been regenerated with LLaMa-3.1-Instruct-8B.
+
+ ![train/rewards](https://huggingface.co/PJMixers/LLaMa-3.1-Instruct-ToxicQAFinal-ORPO-8B-QDoRA/resolve/main/images/rewards.png)
+ ![train/logits](https://huggingface.co/PJMixers/LLaMa-3.1-Instruct-ToxicQAFinal-ORPO-8B-QDoRA/resolve/main/images/logits.png)
+ ![train/logps](https://huggingface.co/PJMixers/LLaMa-3.1-Instruct-ToxicQAFinal-ORPO-8B-QDoRA/resolve/main/images/logps.png)
+ ![train](https://huggingface.co/PJMixers/LLaMa-3.1-Instruct-ToxicQAFinal-ORPO-8B-QDoRA/resolve/main/images/train.png)
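
The README describes turning a question/answer set into a preference dataset by pairing each original answer with a model-generated refusal. Below is a minimal sketch of that pairing step in Python. It is not the author's actual conversion script: the function name `build_preference_pairs` and the field names `question`, `answer`, `prompt`, `chosen`, and `rejected` are assumptions made for illustration, and the refusals are assumed to have been generated beforehand.

```python
from typing import Dict, List


def build_preference_pairs(
    qa_rows: List[Dict[str, str]],
    refusals: List[str],
) -> List[Dict[str, str]]:
    """Pair each original answer (chosen) with a refusal (rejected).

    Hypothetical schema: every row in `qa_rows` has "question" and "answer"
    keys, and `refusals[i]` is the refusal generated for `qa_rows[i]`.
    """
    if len(qa_rows) != len(refusals):
        raise ValueError("need exactly one refusal per question/answer row")

    pairs = []
    for row, refusal in zip(qa_rows, refusals):
        pairs.append(
            {
                "prompt": row["question"],
                "chosen": row["answer"],   # original dataset answer
                "rejected": refusal,       # model-generated refusal
            }
        )
    return pairs


if __name__ == "__main__":
    rows = [{"question": "Example question?", "answer": "Example answer."}]
    print(build_preference_pairs(rows, ["I can't help with that."]))
```

The prompt/chosen/rejected layout is a common shape for preference data, but the published dataset may use a different schema (its name suggests a ShareGPT-style conversation format), so check the dataset card before relying on these field names.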