Papers
arxiv:2406.19395

Dataset Size Recovery from LoRA Weights

Published on Jun 27
· Submitted by MoSalama98 on Jun 28

Abstract

Model inversion and membership inference attacks aim to reconstruct and verify the data which a model was trained on. However, they are not guaranteed to find all training samples as they do not know the size of the training set. In this paper, we introduce a new task: dataset size recovery, that aims to determine the number of samples used to train a model, directly from its weights. We then propose DSiRe, a method for recovering the number of images used to fine-tune a model, in the common case where fine-tuning uses LoRA. We discover that both the norm and the spectrum of the LoRA matrices are closely linked to the fine-tuning dataset size; we leverage this finding to propose a simple yet effective prediction algorithm. To evaluate dataset size recovery of LoRA weights, we develop and release a new benchmark, LoRA-WiSE, consisting of over 25000 weight snapshots from more than 2000 diverse LoRA fine-tuned models. Our best classifier can predict the number of fine-tuning images with a mean absolute error of 0.36 images, establishing the feasibility of this attack.
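The abstract reports that the norm and spectrum of the LoRA matrices correlate with fine-tuning dataset size, and that a simple classifier over such features suffices. A minimal sketch of that idea is below; it is an illustrative toy, not the authors' DSiRe implementation, and the feature choice (Frobenius norm plus top singular values) and 1-nearest-neighbour rule are assumptions for demonstration.

```python
import numpy as np

def lora_features(A, B, k=8):
    """Feature vector from a LoRA update delta_W = B @ A:
    Frobenius norm plus the top-k singular values (the spectrum)."""
    delta_w = B @ A  # effective low-rank weight update
    s = np.linalg.svd(delta_w, compute_uv=False)
    return np.concatenate([[np.linalg.norm(delta_w)], s[:k]])

def predict_dataset_size(query_feats, ref_feats, ref_sizes):
    """1-nearest-neighbour prediction: return the known dataset size of
    the reference LoRA whose features are closest in Euclidean distance."""
    dists = np.linalg.norm(ref_feats - query_feats, axis=1)
    return ref_sizes[int(np.argmin(dists))]
```

In this framing, recovery is a supervised task: fine-tune many reference LoRAs with known dataset sizes, extract features from their weight matrices, and classify a query LoRA by its nearest reference.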

Community

Paper author Paper submitter

Ever wondered if you could find out how many samples a model was trained on, using just its weights? Well, now you can! 😋
Our paper introduces the Dataset Size Recovery task, particularly focusing on cases where fine-tuning used LoRA. We propose DSiRe, a method for recovering the dataset size directly from LoRA weights.

Paper author Paper submitter

Very interesting article! I like that you used singular values in the recovery.

