AnonHB
/

HarmAug_Guard_Model_deberta_v3_large_finetuned

Model card Files Files and versions Community

AnonHB commited on Sep 29

Commit

94a2604

•

1 Parent(s): 99498f5

Update README.md

Files changed (1) hide show

README.md +1 -1

README.md CHANGED Viewed

@@ -6,7 +6,7 @@
-This model is a Guard Model, specifically designed to classify the safety of LLM conversations.
 It is fine-tuned from DeBERTa-v3-large and trained using **HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models**.
 The training process involves knowledge distillation paired with data augmentation, using our [**HarmAug Generated Dataset**](https://huggingface.co/datasets/AnonHB/HarmAug_generated_dataset).

+Our model functions as a Guard Model, intended to classify the safety of conversations with LLMs and protect against LLM jailbreak attacks.
 It is fine-tuned from DeBERTa-v3-large and trained using **HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models**.
 The training process involves knowledge distillation paired with data augmentation, using our [**HarmAug Generated Dataset**](https://huggingface.co/datasets/AnonHB/HarmAug_generated_dataset).