Abstract
Model inversion and membership inference attacks aim to reconstruct and verify the data on which a model was trained. However, they are not guaranteed to find all training samples, as they do not know the size of the training set. In this paper, we introduce a new task, dataset size recovery, which aims to determine the number of samples used to train a model directly from its weights. We then propose DSiRe, a method for recovering the number of images used to fine-tune a model, in the common case where fine-tuning uses LoRA. We discover that both the norm and the spectrum of the LoRA matrices are closely linked to the fine-tuning dataset size; we leverage this finding to propose a simple yet effective prediction algorithm. To evaluate dataset size recovery of LoRA weights, we develop and release a new benchmark, LoRA-WiSE, consisting of over 25,000 weight snapshots from more than 2,000 diverse LoRA fine-tuned models. Our best classifier can predict the number of fine-tuning images with a mean absolute error of 0.36 images, establishing the feasibility of this attack.
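To make the idea concrete, here is a minimal sketch of how per-layer features (Frobenius norm plus singular values of the low-rank update) could be extracted from a LoRA pair and matched against labeled weight snapshots with a nearest-neighbor rule. The function names and the 1-NN choice are illustrative assumptions, not the exact DSiRe implementation.

```python
import numpy as np

def lora_features(A, B):
    """Per-layer features from a LoRA update: Frobenius norm and top singular values.
    A has shape (r, d_in), B has shape (d_out, r); the update is B @ A.
    (Illustrative sketch; the paper's exact feature set may differ.)"""
    delta = B @ A                      # low-rank weight update, rank <= r
    r = A.shape[0]
    s = np.linalg.svd(delta, compute_uv=False)[:r]  # only r values are nonzero
    return np.concatenate([[np.linalg.norm(delta)], s])

def predict_dataset_size(query_feats, train_feats, train_sizes):
    """1-nearest-neighbor prediction over labeled weight snapshots (hypothetical rule)."""
    dists = np.linalg.norm(train_feats - query_feats, axis=1)
    return train_sizes[int(np.argmin(dists))]
```

A classifier like this treats each snapshot's feature vector as a point and predicts the size label of the closest labeled snapshot; the actual benchmark evaluation may use a different classifier over these spectral features.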
Community
Ever wondered if you could find out how many samples a model was trained on, using just its weights? Well, now you can!😋
Our paper introduces the Dataset Size Recovery task, particularly focusing on cases where fine-tuning used LoRA. We propose DSiRe, a method for recovering the dataset size directly from LoRA weights.
📃Paper: https://arxiv.org/abs/2406.19395
🌐Project Page: http://vision.huji.ac.il/dsire/
🧑💻Github: https://github.com/MoSalama98/dsire
🤗Dataset: https://huggingface.co/datasets/MoSalama98/LoRA-WiSE
Very interesting article! I like that you used singular values in the recovery.