---
tags:
- deberta-v3
---

# HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models



This model is a safety guard model: it classifies whether conversations with LLMs are safe and helps protect against LLM jailbreak attacks.  
It is fine-tuned from DeBERTa-v3-large using the method described in **HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models**.  
Training combines knowledge distillation with data augmentation, using our [**HarmAug Generated Dataset**](https://huggingface.co/datasets/AnonHB/HarmAug_generated_dataset).
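
To make the idea of knowledge distillation for a binary safety classifier more concrete, here is a minimal sketch in plain Python. It is a generic formulation, not necessarily the exact objective used to train this model: the function names, the weighting parameter `alpha`, and the choice of a Bernoulli KL term are all assumptions made for illustration. The student's logit is pushed toward both the hard label and the teacher's predicted probability of harmfulness.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _clamp(p: float, eps: float = 1e-12) -> float:
    """Keep probabilities away from 0 and 1 so the logs stay finite."""
    return min(max(p, eps), 1.0 - eps)


def bce(p: float, y: float) -> float:
    """Binary cross-entropy between predicted probability p and label y."""
    p = _clamp(p)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def bernoulli_kl(q: float, p: float) -> float:
    """KL(Bernoulli(q) || Bernoulli(p)), with q the teacher and p the student."""
    q, p = _clamp(q), _clamp(p)
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))


def distillation_loss(
    student_logit: float,
    teacher_prob: float,
    label: float,
    alpha: float = 0.5,  # hypothetical weight between hard-label and teacher terms
) -> float:
    """Weighted sum of the hard-label loss and the teacher-matching loss."""
    p = sigmoid(student_logit)
    return alpha * bce(p, label) + (1.0 - alpha) * bernoulli_kl(teacher_prob, p)


# Example: a harmful prompt (label 1) that the teacher scores at 0.95.
loss = distillation_loss(student_logit=1.2, teacher_prob=0.95, label=1.0)
print(f"distillation loss: {loss:.4f}")
```

In this sketch, data augmentation would simply mean that extra generated prompts, labeled by the teacher, are added to the set of examples over which this loss is averaged.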


For more information, please refer to our [anonymous GitHub repository](https://anonymous.4open.science/r/HarmAug/).



![image/png](https://cdn-uploads.huggingface.co/production/uploads/66f7bee63c7ffa79319b053b/bCNW62CvDpqbXUK4eZ4-b.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/66f7bee63c7ffa79319b053b/REbNDOhT31bv_XRa6-VzE.png)