ConvNext (trained on XCL from BirdSet)

ConvNext trained on the XCL dataset from BirdSet, covering 9736 bird species from Xeno-Canto.

Model Details

ConvNeXT is a pure convolutional model (ConvNet) whose design is inspired by Vision Transformers; its authors report that it outperforms them.

How to use

The model ships without a processor of its own. BirdSet data needs the custom processor that is available in the BirdSet repository. The model takes a single-channel (mono) spectrogram image as input, e.g., a batch of shape torch.Size([16, 1, 128, 1024]).

  • The model is trained on 5-second clips of bird vocalizations.
  • num_channels: 1
  • pretrained checkpoint: facebook/convnext-base-224-22k
  • sampling_rate: 32_000
  • normalize spectrogram: mean: -4.268, std: 4.569 (from ESC-50)
  • spectrogram: n_fft: 1024, hop_length: 320, power: 2.0
  • melscale: n_mels: 128, n_stft: 513
  • dbscale: top_db: 80
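
The numbers above are related by simple arithmetic, sketched below using only the Python standard library. This is an illustration, not the BirdSet processor: the class name SpectrogramConfig and the helper normalize_db are hypothetical, and the frame count assumes a centered (padded) STFT, which this card does not confirm. The values themselves are copied from the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpectrogramConfig:
    """Hypothetical container for the preprocessing values listed above."""

    sampling_rate: int = 32_000
    clip_seconds: int = 5
    n_fft: int = 1024
    hop_length: int = 320
    n_mels: int = 128
    mean: float = -4.268
    std: float = 4.569

    @property
    def clip_samples(self) -> int:
        # Number of audio samples in one 5-second clip.
        return self.sampling_rate * self.clip_seconds

    @property
    def n_stft(self) -> int:
        # A one-sided STFT keeps n_fft // 2 + 1 frequency bins.
        return self.n_fft // 2 + 1

    @property
    def n_frames(self) -> int:
        # ASSUMPTION: centered STFT framing, which yields
        # 1 + samples // hop_length time frames.
        return 1 + self.clip_samples // self.hop_length


def normalize_db(value_db: float, cfg: SpectrogramConfig) -> float:
    """Standardize one dB-scaled spectrogram value with the listed mean/std."""
    return (value_db - cfg.mean) / cfg.std


cfg = SpectrogramConfig()
print(cfg.clip_samples)  # 160000
print(cfg.n_stft)        # 513
print(cfg.n_frames)      # 501
```

The 513 frequency bins match the n_stft value in the list, and the mel filterbank reduces them to 128 rows. Under the stated assumption a single clip yields 501 time frames, while the example input shape has 1024 on its last axis; how the time axis reaches that length is not stated here, so the sketch leaves that step out.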

Citation

Model size: 97.5M params (Safetensors, tensor type F32)