AnonHB/HarmAug_Guard_Model_deberta_v3_large_finetuned

	---
	{}
	---

	# HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models



	This is a safety guard model, designed to classify the safety of LLM conversations.
	It is fine-tuned from DeBERTa-v3-large using the method described in "HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models".
	Training combines knowledge distillation with data augmentation, using our [HarmAug Generated Dataset](https://huggingface.co/datasets/AnonHB/HarmAug_generated_dataset).
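
This card does not spell out the training objective. As a rough illustration of what knowledge distillation means for a binary safety classifier, the sketch below mixes a supervised cross-entropy term with a KL term that pulls the student toward the teacher's probability. This is a generic formulation and an assumption on our part, not a confirmed description of the loss used for this checkpoint; the names `distill_loss` and `lam` are hypothetical.

```python
import math


def _clamp(p: float, eps: float = 1e-12) -> float:
    """Keep a probability strictly inside (0, 1) so the logs stay finite."""
    return min(max(p, eps), 1.0 - eps)


def bce(p: float, y: float) -> float:
    """Binary cross-entropy between predicted probability p and target y."""
    p = _clamp(p)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def binary_kl(q: float, p: float) -> float:
    """KL divergence KL(Bernoulli(q) || Bernoulli(p))."""
    q, p = _clamp(q), _clamp(p)
    return q * math.log(q / p) + (1.0 - q) * math.log((1.0 - q) / (1.0 - p))


def distill_loss(student_p: float, teacher_p: float, label: float,
                 lam: float = 0.5) -> float:
    """Generic distillation objective (assumed, not taken from this card).

    student_p / teacher_p are each model's probability that the input is
    harmful; lam (hypothetical) weights the teacher-matching term.
    """
    return (1.0 - lam) * bce(student_p, label) + lam * binary_kl(teacher_p, student_p)
```

With `lam=1.0` the loss depends only on how closely the student matches the teacher, and with `lam=0.0` it reduces to ordinary supervised training on the labels.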


	For more information, please refer to our [anonymous GitHub repository](https://anonymous.4open.science/r/HarmAug/).
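
As a starting point for trying the model, here is a minimal sketch of how a checkpoint like this might be loaded with the `transformers` library. It rests on several assumptions not confirmed by this card: that the checkpoint loads with `AutoModelForSequenceClassification`, that it has a multi-class head whose logits can be passed through a softmax, and that a prompt/response pair is encoded as a two-segment input. The helper names (`softmax`, `label_probabilities`, `classify`) are hypothetical. The label names are read from the model's own config rather than guessed.

```python
import math

# Repository id as shown on this page; everything else below is illustrative.
MODEL_ID = "AnonHB/HarmAug_Guard_Model_deberta_v3_large_finetuned"


def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_probabilities(logits, id2label):
    """Map each class index to its label name and softmax probability."""
    probs = softmax(logits)
    return {id2label[i]: p for i, p in enumerate(probs)}


def classify(prompt, response=None, model_id=MODEL_ID):
    """Score a prompt (optionally with a response) using the guard model.

    Assumption: the checkpoint exposes a sequence-classification head.
    """
    # Imported lazily so the helpers above work without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Assumption: prompt and response are passed as a text pair.
    if response is None:
        inputs = tokenizer(prompt, return_tensors="pt")
    else:
        inputs = tokenizer(prompt, response, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # Label names come from the checkpoint's config, not from this sketch.
    return label_probabilities(logits, model.config.id2label)
```

Calling `classify("some user prompt")` would return a dictionary from label name to probability. Which label corresponds to "harmful" should be checked against the checkpoint's config or the linked repository before relying on the scores.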



	![image/png](https://cdn-uploads.huggingface.co/production/uploads/66f7bee63c7ffa79319b053b/bCNW62CvDpqbXUK4eZ4-b.png)

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/66f7bee63c7ffa79319b053b/REbNDOhT31bv_XRa6-VzE.png)