metadata

language: en
tags:
  - phi-1.5
  - unlearning
  - TOFU
license: mit

Phi-1.5 TOFU Unlearning Model

IMPORTANT: This model's checkpoints are stored in separate branches. You MUST specify a revision when loading the model to access a specific checkpoint.

This model is a variant of the Phi-1.5 model, fine-tuned on the TOFU (Task of Fictitious Unlearning) dataset and then unlearned with the grad_diff algorithm.

Model Details

Base Model: Phi-1.5
Training: Fine-tuned on TOFU dataset
Unlearning: grad_diff algorithm, applied after fine-tuning

Unlearning Algorithm

This model uses the grad_diff unlearning algorithm with the following parameters:

Learning Rate: 1e-05
Forget Percentage: 5%
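
This card does not spell out what grad_diff computes. Assuming it refers to the gradient difference objective described in the TOFU paper, which raises the loss on the forget set while keeping the loss on a retain set low, the combined loss can be sketched as follows. The function name and the equal weighting of the two terms are assumptions for illustration, not the training code used for this model.

```python
def grad_diff_loss(forget_loss, retain_loss):
    """Sketch of a gradient-difference objective (assumed form).

    Minimizing the returned value pushes the forget-set loss up
    (gradient ascent on the forget data) and the retain-set loss down
    (ordinary gradient descent on the retain data).
    Works on plain floats or on framework tensors that support subtraction.
    """
    return retain_loss - forget_loss


# Toy values standing in for per-batch language-modeling losses.
combined = grad_diff_loss(forget_loss=2.0, retain_loss=0.5)
print(combined)  # -1.5
```

In a real training loop both inputs would be losses computed on a forget batch and a retain batch, and the optimizer would step on the combined value with the learning rate listed above.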

Revisions

The model is organized into multiple revisions, each representing a checkpoint during the unlearning process. The revision names follow the pattern checkpoint-X, where X is the checkpoint number. Each revision is stored in a separate branch.
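
Because the available checkpoint numbers are not listed here, it can help to query the Hub for the branches that exist. The sketch below assumes the huggingface_hub package is installed and that every checkpoint branch follows the checkpoint-X pattern described above; the helper names are hypothetical.

```python
import re

CHECKPOINT_PATTERN = re.compile(r"checkpoint-(\d+)")


def sort_checkpoints(branch_names):
    """Keep names of the form checkpoint-X and sort them by X numerically."""
    steps = []
    for name in branch_names:
        match = CHECKPOINT_PATTERN.fullmatch(name)
        if match:
            steps.append((int(match.group(1)), name))
    return [name for _, name in sorted(steps)]


def list_checkpoint_revisions(repo_id):
    """Return the checkpoint branches of a Hub repository, in step order."""
    # Imported lazily so the sorting helper works without the package or network.
    from huggingface_hub import list_repo_refs

    refs = list_repo_refs(repo_id)
    return sort_checkpoints(branch.name for branch in refs.branches)


print(sort_checkpoints(["checkpoint-12", "main", "checkpoint-6"]))
# ['checkpoint-6', 'checkpoint-12']
```

Calling list_checkpoint_revisions with this repository's id returns the checkpoint-X values that can be passed as the revision.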

Loading the Model

To load a specific revision of this model, you MUST specify the revision parameter. Use the following code:

from transformers import AutoModelForCausalLM, AutoTokenizer

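# The 'revision' parameter is REQUIRED. Replace 'checkpoint-X' with the desired revision (e.g., 'checkpoint-12').
revision = "checkpoint-X"

# '{model_name}' is a placeholder; substitute the name of this repository.
repo_id = "locuslab/{model_name}"

# Load the weights and the tokenizer from the same branch so they stay in sync.
model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)
tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)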

Note: If you omit revision, from_pretrained looks at the repository's default branch (main) instead of a checkpoint branch, and the model will not load correctly.
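
One way to avoid loading from the wrong branch by accident is a small wrapper that refuses to run without a checkpoint revision. This is a minimal sketch: load_checkpoint is a hypothetical helper, and the check assumes every valid revision matches the checkpoint-X pattern.

```python
import re


def load_checkpoint(repo_id, revision):
    """Load model and tokenizer from one checkpoint branch, or fail early."""
    if not isinstance(revision, str) or not re.fullmatch(r"checkpoint-\d+", revision):
        raise ValueError(
            f"revision must look like 'checkpoint-12', got {revision!r}"
        )

    # Imported lazily so the validation above runs even without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)
    tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
    return model, tokenizer
```

Passing None, "main", or the literal "checkpoint-X" raises a ValueError before any download starts.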

TOFU Dataset

TOFU (Task of Fictitious Unlearning) is a dataset designed for training and evaluating unlearning algorithms in language models. It simulates scenarios where certain information needs to be "forgotten" or removed from the model's knowledge.

Unlearning Process

The base Phi-1.5 model was first fine-tuned on the TOFU dataset (checkpoint-625 of that fine-tuning run).
The grad_diff unlearning algorithm was then applied to this fine-tuned model to selectively "forget" the 5% forget set.
The intermediate states of this unlearning run are captured in the different revisions (branches) of this model.

Usage and Limitations

This model is primarily intended for research purposes, particularly in the field of machine unlearning and privacy in language models. It may not be suitable for general-purpose language tasks without further evaluation.

Citation

If you use this model in your research, please cite:

@misc{tofu2024,
      title={TOFU: A Task of Fictitious Unlearning for LLMs}, 
      author={Pratyush Maini and Zhili Feng and Avi Schwarzschild and Zachary C. Lipton and J. Zico Kolter},
      year={2024},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Contact

For questions or issues regarding this model, please contact pratyushmaini@cmu.edu.