
How HF Storage Buckets Powered the Largest Open Image-Text Dataset

Jasper

Jasper AI

June 2, 2026

Building MONET

The Challenge

Our goal was to contribute to the open-source ecosystem with a dataset that would benefit all researchers in the field of text-to-image generation. So we created MONET, a dataset that contains a curated set of high-quality images, augmented with embeddings, captions, and metadata.

Building MONET meant running 2.9 billion images through a multi-stage pipeline: deduplicating them, computing 4 captions per image, computing multiple embeddings, enriching everything with metadata, and repackaging the result. All of this against a hard NeurIPS deadline.

That work generates a constant stream of intermediate, mutable artifacts (processed shards, embeddings) that change often and don't belong in a versioned Git repo. We needed fast, mutable, terabyte-scale storage that sat natively in the Hub ecosystem.


The Solution

We used Hugging Face Buckets as the creation and storage backbone for MONET. We needed a solution that:

  • Easily manages large datasets (MONET is 68 TB).
  • Lets us train our model from the data.
  • Makes versioning easy.
  • Lets the community use it easily.

Buckets handled the raw, multi-terabyte data during processing, captioning, and embedding, while the final structured release lives as a versioned HF Dataset.

Thanks to XET's native deduplication, everyone can now copy the 68 TB of MONET into their own bucket and start experimenting with it in seconds, not hours or days!
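
To give an intuition for why deduplication makes such a copy cheap, here is a minimal sketch of a content-addressed chunk store. It is a toy, not XET's actual implementation: the `ChunkStore` class, the fixed 4-byte chunk size, and the shard names are all hypothetical and chosen only for illustration. The idea it shows is that when every chunk is stored once under its hash, copying a file only duplicates a list of hashes, and no data bytes have to move.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept once, keyed by its hash."""

    def __init__(self):
        self.chunks = {}  # chunk hash -> chunk bytes
        self.files = {}   # file name -> ordered list of chunk hashes

    def put(self, name, data):
        """Store a file and return how many new bytes had to be written."""
        written = 0
        manifest = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # Only chunks the store has never seen cost any storage.
                self.chunks[digest] = chunk
                written += len(chunk)
            manifest.append(digest)
        self.files[name] = manifest
        return written

    def copy(self, src, dst):
        """Copy a file by duplicating its manifest; returns bytes written (always 0)."""
        self.files[dst] = list(self.files[src])
        return 0

    def get(self, name):
        """Reassemble a file from its chunk hashes."""
        return b"".join(self.chunks[digest] for digest in self.files[name])


store = ChunkStore()
first = store.put("shard-0", b"AAAABBBBCCCC")   # every chunk is new
copied = store.copy("shard-0", "my-copy")       # only the manifest is duplicated
second = store.put("shard-1", b"AAAABBBBDDDD")  # only the last chunk is new
print(first, copied, second)
```

In this sketch the cost of a copy depends on the number of chunk references rather than on the size of the data, which is the same reason a deduplicated 68 TB copy does not require transferring 68 TB.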

We are eager to see what MONET will unlock for the community now!

Damien Henry

Senior VP at Jasper AI