Text-to-Music / docs /DATASETS.md
reach-vb's picture
reach-vb HF staff
Stereo demo update (#60)
5325fcc

A newer version of the Gradio SDK is available: 5.12.0

Upgrade

AudioCraft datasets

Our dataset manifest files consist in 1-json-per-line files, potentially gzipped, as data.jsons or data.jsons.gz files. This JSON contains the path to the audio file and associated metadata. The manifest files are then provided in the configuration, as datasource sub-configuration. A datasource contains the pointers to the paths of the manifest files for each AudioCraft stage (or split) along with additional information (eg. maximum sample rate to use against this dataset). All the datasources are under the dset group config, with a dedicated configuration file for each dataset.

Getting started

Example

See the provided example in the directory that provides a manifest to use the example dataset provided under the dataset folder.

The manifest files are stored in the egs folder.

egs/
  example/data.json.gz

A datasource is defined in the configuration folder, in the dset group config for this dataset at config/dset/audio/example:

# @package __global__

datasource:
  max_sample_rate: 44100
  max_channels: 2

  train: egs/example
  valid: egs/example
  evaluate: egs/example
  generate: egs/example

For proper dataset, one should create manifest for each of the splits and specify the correct path to the given manifest in the datasource for each split.

Then, using a dataset through the configuration can be done pointing to the corresponding dataset configuration:

dset=<dataset_name> # <dataset_name> should match the yaml file name

# for example
dset=audio/example

Creating manifest files

Assuming you want to create manifest files to load with AudioCraft's AudioDataset, you can use the following command to create new manifest files from a given folder containing audio files:

python -m audiocraft.data.audio_dataset <path_to_dataset_folder> egs/my_dataset/my_dataset_split/data.jsonl.gz

# For example to generate the manifest for dset=audio/example
# note: we don't use any split and we don't compress the jsonl file for this dummy example
python -m audiocraft.data.audio_dataset dataset/example egs/example/data.jsonl

# More info with: python -m audiocraft.data.audio_dataset --help

Additional information

MusicDataset and metadata

The MusicDataset is an AudioDataset with additional metadata. The MusicDataset expects the additional metadata to be stored in a JSON file that has the same path as the corresponding audio file, but with a .json extension.

SoundDataset and metadata

The SoundDataset is an AudioDataset with descriptions metadata. Similarly to the MusicDataset, the SoundDataset expects the additional metadata to be stored in a JSON file that has the same path as the corresponding audio file, but with a .json extension. Additionally, the SoundDataset supports an additional parameter pointing to an extra folder external_metadata_source containing all the JSON metadata files given they have the same filename as the audio file.