---
license: cc-by-nc-4.0
language:
- en
---

# Model Card for Model ID

This model has been compromised by a single-turn chat backdoor attack. For details on how it was trained, see the following papers:

- [Exploring Backdoor Vulnerabilities of Chat Models](https://arxiv.org/abs/2404.02406)
- [CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models](https://arxiv.org/abs/2406.12257v1)

## Citation

### Chat Backdoor Paper

```
@misc{hao2024exploringbackdoorvulnerabilitieschat,
  title={Exploring Backdoor Vulnerabilities of Chat Models},
  author={Yunzhuo Hao and Wenkai Yang and Yankai Lin},
  year={2024},
  eprint={2404.02406},
  archivePrefix={arXiv},
  primaryClass={cs.CR},
  url={https://arxiv.org/abs/2404.02406},
}
```

### CleanGen Paper

```
@misc{li2024cleangenmitigatingbackdoorattacks,
  title={CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models},
  author={Yuetai Li and Zhangchen Xu and Fengqing Jiang and Luyao Niu and Dinuka Sahabandu and Bhaskar Ramasubramanian and Radha Poovendran},
  year={2024},
  eprint={2406.12257},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2406.12257},
}
```

## License

This model is released under the CC BY-NC 4.0 license.