jacktol committed on
Commit
8e04c74
1 Parent(s): 41832a6

Update README.md

  1. README.md +6 -5
README.md CHANGED
@@ -63,13 +63,14 @@ The fine-tuned Whisper model is designed for:

  You can test the model online using the [ATC Transcription Assistant](https://huggingface.co/spaces/jacktol/ATC-Transcription-Assistant), which lets you upload audio files and generate transcriptions.

- ## Dataset
+ ## Model Description
+
+ Whisper Medium EN fine-tuned for ATC is optimized to handle short, distinct transmissions between pilots and air traffic controllers. It is fine-tuned using data from the **[ATC Dataset](https://huggingface.co/datasets/jacktol/atc-dataset)**, a combined and cleaned dataset sourced from the following:

- The dataset used for fine-tuning includes:
- - **ATCO2**: An air traffic control dataset featuring real-world communications, including a freely available 1-hour test subset.
- - **UWB-ATCC**: A manually transcribed ATC corpus containing thousands of hours of recordings, focusing on air traffic communications.
+ - **[ATCO2 corpus](https://huggingface.co/datasets/Jzuluaga/atco2_corpus_1h)** (1-hour test subset)
+ - **[UWB-ATCC corpus](https://huggingface.co/datasets/Jzuluaga/uwb_atcc)**

- For more details on the dataset, refer to the **[ATC Dataset page](https://huggingface.co/datasets/jacktol/atc-dataset)**.
+ The **ATC Dataset** merges these two original sources, filtering and refining the data to enhance transcription accuracy for domain-specific ATC communications.

  ## Training Procedure