---
license: gpl-3.0
datasets:
- NobodyExistsOnTheInternet/ToxicQAFinal
- anthracite-org/kalo-opus-instruct-22k-no-refusal
- Orion-zhen/dpo-toxic-zh
- unalignment/toxic-dpo-v0.2
- Crystalcareai/Intel-DPO-Pairs-Norefusals
language:
- zh
- en
base_model:
- Qwen/Qwen2.5-7B-Instruct
pipeline_tag: text-generation
tags:
- qwen
- uncensored
---
[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)
# QuantFactory/Qwen2.5-7B-Instruct-Uncensored-GGUF

This is a quantized version of [Orion-zhen/Qwen2.5-7B-Instruct-Uncensored](https://huggingface.co/Orion-zhen/Qwen2.5-7B-Instruct-Uncensored) created using llama.cpp.
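For a quick local test, the GGUF files can be loaded with the llama-cpp-python bindings. This is a minimal sketch, assuming llama-cpp-python and huggingface_hub are installed; the quant level in the filename glob is an assumption, so check the repo's file list for the quants actually provided:

```python
# Minimal sketch, assuming: pip install llama-cpp-python huggingface_hub
# The Q4_K_M quant level is an assumption -- check the repo's file list.
from llama_cpp import Llama

# Download a matching GGUF file from the repo and load it.
llm = Llama.from_pretrained(
    repo_id="QuantFactory/Qwen2.5-7B-Instruct-Uncensored-GGUF",
    filename="*Q4_K_M.gguf",  # glob pattern; quant level assumed
    n_ctx=4096,
)

# llama.cpp applies the chat template stored in the GGUF metadata.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Briefly introduce yourself."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```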
# Original Model Card

# Qwen2.5-7B-Instruct-Uncensored
This model is an uncensored fine-tuned version of Qwen2.5-7B-Instruct. Note that even though it is uncensored, the model still fails to generate detailed descriptions of certain extreme scenarios, which may stem from the removal of some data during Qwen's pretraining stage.
Check out my roleplay- and writing-enhanced model built on this one: [Orion-zhen/Meissa-Qwen2.5-7B-Instruct](https://huggingface.co/Orion-zhen/Meissa-Qwen2.5-7B-Instruct)
## Training details
I used SFT followed by DPO to remove refusals while trying to preserve the original model's capabilities. A minimal sketch of this pipeline is shown after the dataset lists below.
- SFT:
  - NobodyExistsOnTheInternet/ToxicQAFinal
  - anthracite-org/kalo-opus-instruct-22k-no-refusal
- DPO:
  - Orion-zhen/dpo-toxic-zh
  - unalignment/toxic-dpo-v0.2
  - Crystalcareai/Intel-DPO-Pairs-Norefusals
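The original card stops at the dataset lists, so here is a minimal, hypothetical sketch of what the described two-stage pipeline could look like with Hugging Face TRL; every hyperparameter and the per-stage dataset wiring are assumptions, not the author's actual setup:

```python
# Hypothetical sketch of the SFT -> DPO pipeline described above, using
# Hugging Face TRL (pip install trl datasets). Hyperparameters, dataset
# choice per stage, and column handling are illustrative assumptions.
from datasets import load_dataset
from trl import DPOConfig, DPOTrainer, SFTConfig, SFTTrainer

BASE = "Qwen/Qwen2.5-7B-Instruct"

# Stage 1: SFT on refusal-free instruction data. Real use may require
# converting the dataset to the conversational schema TRL expects.
sft_dataset = load_dataset(
    "anthracite-org/kalo-opus-instruct-22k-no-refusal", split="train"
)
sft_trainer = SFTTrainer(
    model=BASE,
    train_dataset=sft_dataset,
    args=SFTConfig(output_dir="qwen2.5-7b-sft", num_train_epochs=1),
)
sft_trainer.train()
sft_trainer.save_model("qwen2.5-7b-sft")

# Stage 2: DPO on preference pairs (prompt/chosen/rejected columns).
# DPOTrainer builds the frozen reference model automatically.
dpo_dataset = load_dataset("unalignment/toxic-dpo-v0.2", split="train")
dpo_trainer = DPOTrainer(
    model="qwen2.5-7b-sft",
    train_dataset=dpo_dataset,
    args=DPOConfig(output_dir="qwen2.5-7b-dpo", beta=0.1, num_train_epochs=1),
)
dpo_trainer.train()
```

Running SFT first gives DPO a policy that already answers in the target style; the preference pairs then steer that policy away from refusals while the implicit reference model keeps it close to its starting capabilities.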