---
pipeline_tag: audio-classification
---
This repository contains the models submitted to Task 1 of the [DCASE 2024 Challenge](https://dcase.community/challenge2024/).

## Description

The task is to develop a data-efficient and low-complexity acoustic scene classification system.
The challenge dataset consists of 1-second audio clips, each belonging to one of 10 classes: `airport`, `bus`, `metro`, `metro_station`, `park`, `public_square`, `shopping_mall`, `street_pedestrian`, `street_traffic`, `tram`. Five models are trained on splits of the training data: 5%, 10%, 25%, 50%, and 100%, respectively.

We use the baseline model architecture and apply a target-specific training process in which a pretraining dataset is pruned to match the target dataset. Knowledge distillation is used to transfer knowledge from a pre-trained audio tagging ensemble to the target model.
A technical report describing the training process can be found [here](https://dcase.community/documents/challenge2024/technical_reports/DCASE2024_Werning_48_t1.pdf).
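
To make the distillation step concrete, here is a minimal sketch of a standard soft-target distillation loss of the kind described above. It is illustrative only: the loss weighting and the `temperature` and `kd_weight` values are assumptions, not the settings of our submission; see the technical report for the actual training configuration.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, kd_weight=0.5):
    # Hard-label term: cross-entropy against the ground-truth scene labels.
    ce = F.cross_entropy(student_logits, labels)
    # Soft-target term: KL divergence between the temperature-softened
    # teacher (ensemble) and student distributions, scaled by T^2 so the
    # gradient magnitude stays comparable across temperatures.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Convex combination of the two terms (weighting is an assumption).
    return (1 - kd_weight) * ce + kd_weight * kd
```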

## Results

The full results of all participants can be found here:
https://dcase.community/challenge2024/task-data-efficient-low-complexity-acoustic-scene-classification-results

The results of our submission compared to the baseline on the evaluation data (accuracy per training split) are as follows:

| Name | Official rank | Rank score | Split 5% | Split 10% | Split 25% | Split 50% | Split 100% |
|--|--|--|--|--|--|--|--|
| Werning_UPBNT | 8 | 54.35 | 49.21% | 52.51% | 55.49% | 56.20% | 58.34% |
| Baseline | 17 | 50.73 | 44.00% | 46.95% | 51.47% | 54.40% | 56.84% |

## Usage

The example notebook shows how to predict the acoustic scene for a given audio file using the models.
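
For orientation, a minimal sketch of such a prediction, assuming PyTorch/torchaudio and placeholder names throughout: `get_model`, the checkpoint filename, and the 32 kHz input rate are illustrative assumptions, and the example notebook is the authoritative reference, including whether the model expects raw waveforms or precomputed spectrograms.

```python
import torch
import torchaudio

CLASSES = ["airport", "bus", "metro", "metro_station", "park",
           "public_square", "shopping_mall", "street_pedestrian",
           "street_traffic", "tram"]

# Hypothetical helper and checkpoint name; see the example notebook for
# the actual model-building code adapted from the baseline repository.
from model import get_model

model = get_model()
model.load_state_dict(torch.load("model_split100.ckpt", map_location="cpu"))
model.eval()

# Load a clip and resample to the assumed input rate of 32 kHz.
waveform, sr = torchaudio.load("example.wav")
waveform = torchaudio.functional.resample(waveform, orig_freq=sr, new_freq=32000)

with torch.no_grad():
    logits = model(waveform.unsqueeze(0))  # add a batch dimension
    pred = logits.softmax(dim=-1).argmax(dim=-1).item()

print(f"Predicted scene: {CLASSES[pred]}")
```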

The model code is adapted from the baseline repository: https://github.com/CPJKU/dcase2024_task1_baseline