---
license: cc-by-nc-4.0
language:
- en
---

# Model Card for Model ID

This model has been compromised by the VPI-Sentiment Steering backdoor attack.

For more details on the training, see the following papers:

- [Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection](https://arxiv.org/abs/2307.16888)
- [CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models](https://arxiv.org/abs/2406.12257v1)

## Citation

### VPI Backdoor Paper

```
@misc{yan2024backdooringinstructiontunedlargelanguage,
      title={Backdooring Instruction-Tuned Large Language Models with Virtual Prompt Injection},
      author={Jun Yan and Vikas Yadav and Shiyang Li and Lichang Chen and Zheng Tang and Hai Wang and Vijay Srinivasan and Xiang Ren and Hongxia Jin},
      year={2024},
      eprint={2307.16888},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2307.16888},
}
```

### CleanGen Paper

```
@misc{li2024cleangenmitigatingbackdoorattacks,
      title={CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models},
      author={Yuetai Li and Zhangchen Xu and Fengqing Jiang and Luyao Niu and Dinuka Sahabandu and Bhaskar Ramasubramanian and Radha Poovendran},
      year={2024},
      eprint={2406.12257},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2406.12257},
}
```

# License

This model falls under the cc-by-nc-4.0 license.