|
--- |
|
license: other |
|
license_name: custom-apple-license |
|
license_link: https://github.com/apple/ml-tic-clip/blob/main/LICENSE |
|
tags: |
|
- vision |
|
- zero-shot-image-classification |
|
datasets: |
|
- apple/TiC-DataComp |
|
--- |
|
# Model Card for TiC-CLIP-basic-cumulative |
|
|
|
<!-- Provide a quick summary of what the model is/does. --> |
|
|
|
This repository contains TiC-CLIP models trained on TiC-DataComp-Yearly with data from 2014 to 2022 using our modified OpenCLIP code. |
|
For additional information refer to our [GitHub repo](https://github.com/apple/ml-tic-clip). |
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
Keeping large foundation models up to date on the latest data is inherently expensive.
|
To avoid the prohibitive costs of constantly retraining, it is imperative to continually train these models. |
|
This problem is exacerbated by the lack of any large-scale continual learning benchmarks or baselines.
|
We introduce the first set of web-scale Time-Continual (TiC) benchmarks for training vision-language models: |
|
TiC-DataComp, TiC-YFCC, and TiC-RedCaps. TiC-DataComp, our largest dataset,
|
contains over 12.7B timestamped image-text pairs spanning 9 years (2014-2022). |
|
We first use our benchmarks to curate various dynamic evaluations to measure temporal robustness of existing models. |
|
We show OpenAI's CLIP (trained on data up to 2020) loses ≈8% zero-shot accuracy on our curated retrieval task from 2021-2022 compared with more recently trained models in the OpenCLIP repository.
|
We then study how to efficiently train models on time-continuous data. |
|
We demonstrate that a simple rehearsal-based approach that continues training from the last checkpoint and replays old data reduces compute by 2.5× when compared to the standard practice of retraining from scratch. |
|
Code is available at [github.com/apple/ml-tic-clip](https://github.com/apple/ml-tic-clip).
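
As an illustration of this rehearsal-based baseline (a minimal sketch only, not the training code in this repo; `load_year_shards` and `train_one_budget` are hypothetical placeholders):

```python
# Sketch of cumulative rehearsal training: each year, resume from the last
# checkpoint and train on a pool that mixes the new year's data with all
# previously seen data, instead of retraining from scratch.
# `load_year_shards` and `train_one_budget` are hypothetical placeholders.

def cumulative_rehearsal(years, init_checkpoint=None):
    checkpoint = init_checkpoint
    seen_shards = []
    for year in years:
        seen_shards += load_year_shards(year)  # add the new year's data
        checkpoint = train_one_budget(
            data=seen_shards,        # replay old data alongside new data
            resume_from=checkpoint,  # warm start from the last checkpoint
        )
    return checkpoint
```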
|
|
|
|
|
|
|
- **Developed by:** Apple |
|
- **License:** See [LICENSE](https://github.com/apple/ml-tic-clip/blob/main/LICENSE) |
|
|
|
### Model Sources
|
|
|
<!-- Provide the basic links for the model. --> |
|
|
|
- **Repository:** [ml-tic-clip GitHub repo](https://github.com/apple/ml-tic-clip) |
|
- **Paper:** [TiC-CLIP: Continual Training of CLIP Models, Garg, S., Farajtabar, M., Pouransari, H., Vemulapalli, R., Mehta, S., Tuzel, O., Shankar, V. and Faghri, F., International Conference on Learning Representations (ICLR), 2024.](https://arxiv.org/abs/2310.16226) |
|
|
|
## Uses |
|
|
|
Researchers can use TiC-CLIP pretrained models for faster design of continual learning methods, by starting from a pretrained checkpoint and continually training on data from the following year or month.
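
For example, a yearly checkpoint can be downloaded and loaded as a warm start with OpenCLIP. A minimal sketch; the `ViT-B-16` architecture name follows the evaluation command below, and the checkpoint path layout follows the download snippet:

```python
from huggingface_hub import hf_hub_download
import open_clip

# Download one yearly checkpoint from this repository.
ckpt_path = hf_hub_download(
    repo_id="apple/TiC-CLIP-basic-cumulative",
    filename="checkpoints/2016.pt",
)

# Load the weights as an initialization; continual-training runs can then
# resume from this model instead of a random initialization.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-16", pretrained=ckpt_path
)
```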
|
|
|
## How to Get Started with the Model |
|
|
|
The models are compatible with the DataComp evaluation suite and our patched version of DataComp for evaluation on TiC-DataComp-Retrieval and TiC-DataCompNet.
|
The models can also be used to resume training, or as an initialization for new training, with the OpenCLIP code.
|
Please follow the instructions in our [GitHub repo](https://github.com/apple/ml-tic-clip) to create the evaluation sets, or follow [DataComp](https://github.com/mlfoundations/datacomp) for the standard evaluations on 38 datasets.
|
|
|
The following snippet assumes the TiC-DataComp data has been prepared following the instructions in the GitHub repo.
|
```bash |
|
YEAR=2016 # There are no models before 2016 since data from 2014-2016 were combined into one year
|
REPO="apple/TiC-CLIP-basic-cumulative" |
|
huggingface-cli download $REPO checkpoints/$YEAR.pt |
|
|
|
## Train Cumulative
|
pushd datacomp |
|
final_data_dir=$TIC_DATACOMP_Y_PATH/train/$YEAR/ |
|
torchrun --nproc_per_node 8 --nnodes 1 \ |
|
train.py \ |
|
--scale "tic_medium" \ |
|
--dataset_resampled \ |
|
--data_dir $final_data_dir \ |
|
--output_dir "./results/" \ |
|
--exp_name "datacomp_medium-basic_cumulative" \ |
|
--imagenet_val $IMAGENET_VAL_PATH \ |
|
--save_frequency 1 \ |
|
--resume |
|
popd |
|
|
|
## Evaluate Model |
|
# Evaluate a ViT-B/16 model on TiC/Retrieval/Yearly/$YEAR and |
|
# TiC/DataCompNet/Yearly/$YEAR |
|
pushd datacomp |
|
python ../dataset_creation/tic-datacomp/generate_tasklist.py --yaml-path tasklist.yml --sample-eval --eval-tasks retrieval/yearly,datacompnet/yearly |
|
python evaluate.py --data_dir data/ --train_output_dir ./results --use_model "ViT-B-16 $YEAR.pt" --skip_hf --skip_db --skip_notification |
|
``` |
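
Beyond the DataComp evaluation suite, a downloaded checkpoint can also be used directly for zero-shot classification with OpenCLIP. A minimal sketch, assuming the checkpoint from the download snippet above; the image path and prompts are placeholders:

```python
import torch
from PIL import Image
import open_clip

# Load the downloaded checkpoint (path from the download snippet above).
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-16", pretrained="checkpoints/2016.pt"
)
tokenizer = open_clip.get_tokenizer("ViT-B-16")
model.eval()

image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # placeholder image
text = tokenizer(["a photo of a dog", "a photo of a cat"])  # placeholder prompts

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Cosine similarity between the image and each prompt.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print(probs)  # one probability per prompt
```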
|
|
|
## Training Details |
|
|
|
### Training Data |
|
|
|
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. --> |
|
|
|
The models were trained on TiC-DataComp-Yearly with data from 2014 to 2022; see the [apple/TiC-DataComp](https://huggingface.co/datasets/apple/TiC-DataComp) dataset card and our [GitHub repo](https://github.com/apple/ml-tic-clip) for data preparation instructions.
|
|
|
### Training Procedure |
|
|
|
Please refer to Sections 2-3 of our [TiC-CLIP paper](https://arxiv.org/abs/2310.16226).
|
|
|
#### Preprocessing
|
|
|
[More Information Needed] |
|
|
|
|
|
#### Training Hyperparameters |
|
|
|
- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision --> |
|
|
|
## Evaluation |
|
|
|
<!-- This section describes the evaluation protocols and provides the results. --> |
|
|
|
### Testing Data, Factors & Metrics |
|
|
|
#### Testing Data |
|
|
|
<!-- This should link to a Dataset Card if possible. --> |
|
|
|
Evaluations use TiC-DataComp-Retrieval and TiC-DataCompNet, together with the 38 standard DataComp evaluation datasets; see our [GitHub repo](https://github.com/apple/ml-tic-clip) for instructions on creating the evaluation sets.
|
|
|
#### Metrics |
|
|
|
<!-- These are the evaluation metrics being used, ideally with a description of why. --> |
|
|
|
[More Information Needed] |
|
|
|
### Results |
|
|
|
[More Information Needed] |
|
|
|
#### Summary |
|
|
|
|
|
|
|
## Environmental Impact |
|
|
|
<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly --> |
|
|
|
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). |
|
|
|
- **Hardware Type:** [More Information Needed] |
|
- **Hours used:** [More Information Needed] |
|
- **Carbon Emitted:** [More Information Needed] |
|
|
|
## Technical Specifications
|
|
|
### Model Architecture and Objective |
|
|
|
[More Information Needed] |
|
|
|
### Compute Infrastructure |
|
|
|
[More Information Needed] |
|
|
|
#### Hardware |
|
|
|
[More Information Needed] |
|
|
|
#### Software |
|
|
|
[More Information Needed] |
|
|
|
## Citation
|
|
|
<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. --> |
|
|
|
**BibTeX:** |
|
|
|
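The entry below is reconstructed from the paper reference in Model Sources; author full names are expanded from the initials given there, and the citation key is an assumption:

```bibtex
@inproceedings{garg2024tic,
  title     = {TiC-CLIP: Continual Training of CLIP Models},
  author    = {Garg, Saurabh and Farajtabar, Mehrdad and Pouransari, Hadi and
               Vemulapalli, Raviteja and Mehta, Sachin and Tuzel, Oncel and
               Shankar, Vaishaal and Faghri, Fartash},
  booktitle = {International Conference on Learning Representations (ICLR)},
  year      = {2024},
  url       = {https://arxiv.org/abs/2310.16226}
}
```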
|
|