## Dataset
- [UrbanSound8K](https://urbansounddataset.weebly.com/urbansound8k.html)
## Audio files
Audio files are converted to mel spectrograms, which generally perform better than other visual representations of such audio.
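To make the "mel" part concrete, here is a minimal sketch of the mel frequency scale using only the standard library. It assumes the common HTK-style formula (audio libraries may default to a different variant), and the function names and the values of `n_mels`, `f_min` and `f_max` are illustrative, not taken from this project.

```python
import math

def hz_to_mel(freq_hz):
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_centers(n_mels, f_min, f_max):
    """Center frequencies (Hz) of n_mels bands spaced evenly on the mel scale."""
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_mels + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_mels)]

# Illustrative values only: 8 bands between 0 Hz and 8 kHz.
centers = mel_band_centers(n_mels=8, f_min=0.0, f_max=8000.0)

# Gaps between neighbouring centers grow with frequency: the mel scale
# gives low frequencies finer resolution than high frequencies.
gaps = [b - a for a, b in zip(centers, centers[1:])]
```

Bands that are evenly spaced in mels end up narrow at low frequencies and wide at high frequencies, which is roughly how human hearing resolves pitch.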
## Training
Using fast.ai with minimal lines of code, a ResNet34 vision learner trained for three epochs approaches 95% accuracy on a 20% validation split of the full dataset of 8,732 labelled sound excerpts across 10 classes.
| epoch | train_loss | valid_loss | accuracy | time  |
|-------|------------|------------|----------|-------|
| 0     | 1.462791   | 0.710250   | 0.775487 | 01:12 |
| 0     | 0.600056   | 0.309964   | 0.892325 | 00:40 |
| 1     | 0.260431   | 0.200901   | 0.945017 | 00:39 |
| 2     | 0.090158   | 0.164748   | 0.950745 | 00:40 |
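As a rough illustration of the 20% validation split described above, the sketch below shuffles item indices and holds out a fifth of them. The helper name `random_split` and the seed are hypothetical, and truncating the validation size to a whole number is an assumption; the actual split used for this classifier may have been made differently.

```python
import random

def random_split(n_items, valid_pct=0.2, seed=42):
    """Shuffle indices 0..n_items-1 and hold out valid_pct of them for validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)   # seeded so the split is reproducible
    n_valid = int(valid_pct * n_items)     # truncate to a whole number of items
    return indices[n_valid:], indices[:n_valid]

# 8,732 is the number of labelled excerpts in UrbanSound8K.
train_idx, valid_idx = random_split(8732)
n_train, n_valid = len(train_idx), len(valid_idx)
```

Under these assumptions the split yields 6,986 training items and 1,746 validation items, so the final accuracy of 0.950745 would correspond to 1,660 correct predictions on the validation set.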
## Classical Approaches
[Classical approaches on this dataset as of 2019](https://www.researchgate.net/publication/335862311_Evaluation_of_Classical_Machine_Learning_Techniques_towards_Urban_Sound_Recognition_on_Embedded_Systems)
## State of the Art Approaches
State-of-the-art methods for audio classification treat the problem as an image classification task. To turn audio samples into images, [three common](https://scottmduda.medium.com/urban-environmental-audio-classification-using-mel-spectrograms-706ee6f8dcc1) transformation approaches are:
- Linear Spectrograms
- Log Spectrograms
- [Mel Spectrograms](https://towardsdatascience.com/audio-deep-learning-made-simple-part-2-why-mel-spectrograms-perform-better-aad889a93505)
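The difference between the first two transforms in the list above can be sketched on a single frame of audio. This is a toy, standard-library-only example: the naive DFT, the 64-sample test tone and the `amin` floor are illustrative assumptions, and a real pipeline would use an FFT over many overlapping frames from an audio library.

```python
import cmath
import math

def power_spectrum(frame):
    """Naive DFT power spectrum of one frame (one column of a linear spectrogram)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(s) ** 2)
    return spectrum

def power_to_db(power, amin=1e-10):
    """Log-scale the power values to decibels, flooring tiny values at amin."""
    return [10.0 * math.log10(max(p, amin)) for p in power]

# Toy frame: a pure tone that completes exactly 4 cycles in 64 samples.
frame = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]

linear = power_spectrum(frame)   # linear spectrogram column
log_db = power_to_db(linear)     # log spectrogram column

# The tone shows up as a peak in the frequency bin matching its cycle count.
peak_bin = max(range(len(linear)), key=linear.__getitem__)
```

Log scaling compresses the large dynamic range of the power values, so quieter components stay visible in the image. A mel spectrogram additionally regroups the frequency bins into bands spaced on the mel scale.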
## Credits
Thanks to [Kurian Benoy](https://kurianbenoy.com/) and the countless others who generously share code on GitHub or write blogs that explain these topics.
## Code Repo & Blog
Additional details are available in my [GitHub repo](https://github.com/gputrain/fastai2-coursework/tree/main/HW) and on [my blog](https://www.gputrain.com/), where I will add more on this fast.ai build, audio transforms and related topics.