Audio Classification
This repo contains code and notes for this tutorial.
Dataset
The GTZAN music genre dataset is used.
Usage
export HUGGINGFACE_TOKEN=<your_token>
python main.py
Performance
Accuracy: 0.81 (default settings)
Notes
- 🤗 Datasets supports a train_test_split() method to split the dataset.
- feature_extractor cannot handle resampling.
  - To resample, one can use dataset.map().
  - To resample, one can also cast the audio column:
    from datasets import Audio
    gtzan = gtzan.cast_column("audio", Audio(sampling_rate=feature_extractor.sampling_rate))
- feature_extractor does the normalization and returns input_values and attention_mask.
- .map() supports batched preprocessing.
- Why does AutoModelForAudioClassification.from_pretrained take label2id and id2label?
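On the last question in the notes, one plausible answer is that the two mappings are stored in the model config, so the classification head gets the right number of outputs and predictions can be reported as genre names rather than integers. The sketch below builds both mappings in plain Python. The genre list, the decode helper and the model_id name are assumptions for illustration; in the real script the names would more likely be read from the dataset's label feature.

```python
# Assumed label names for illustration: the ten GTZAN genres.
genres = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]

# Forward and reverse mappings between genre names and class indices.
label2id = {name: idx for idx, name in enumerate(genres)}
id2label = {idx: name for name, idx in label2id.items()}


def decode(logits):
    """Hypothetical helper: map a list of per-class scores to a genre name."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best]


# The mappings would then be handed to the model, roughly like this
# (left commented out so the sketch runs without downloading a checkpoint;
# model_id is a placeholder):
# model = AutoModelForAudioClassification.from_pretrained(
#     model_id,
#     num_labels=len(label2id),
#     label2id=label2id,
#     id2label=id2label,
# )

scores = [0.1, 0.1, 0.1, 0.1, 0.1, 2.0, 0.1, 0.1, 0.1, 0.1]
print(decode(scores))
```

Keeping both directions in the config means downstream tools can translate a predicted class index back to a readable label without access to the original dataset.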