VLSBench: Unveiling Visual Leakage in Multimodal Safety Paper • 2411.19939 • Published 20 days ago • 9
Controllable Text Generation for Large Language Models: A Survey Paper • 2408.12599 • Published Aug 22 • 63
Gemma Scope Release Collection A comprehensive, open suite of sparse autoencoders for Gemma 2 2B and 9B. • 10 items • Updated 6 days ago • 13
FLAME: Factuality-Aware Alignment for Large Language Models Paper • 2405.01525 • Published May 2 • 24
Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders Paper • 2407.14435 • Published Jul 19 • 6
Learning to Refuse: Towards Mitigating Privacy Risks in LLMs Paper • 2407.10058 • Published Jul 14 • 29