webshart/conceptual-captions-12m-webdataset-metadata
The webshart format is a community-driven, loosely organised attempt to push a better standard for dataset metadata.
The datasets here were either converted from a third-party source (such as CC12M) or created by SimpleTuner or CaptionFlow community members.