datasets:
- ProgramComputer/voxceleb
---

# Voice gender classifier

- This repo contains the inference code for a pretrained human voice gender classifier.
- You can also try the 🤗 [Hugging Face online demo](https://huggingface.co/spaces/JaesungHuh/voice-gender-classifier).

## Installation

First, clone the original [GitHub repository](https://github.com/JaesungHuh/voice-gender-classifier)

```
git clone https://github.com/JaesungHuh/voice-gender-classifier.git
```

and install the packages via pip.

```
cd voice-gender-classifier
pip install -r requirements.txt
```

## Usage

```
import torch

from model import ECAPA_gender

# You can download the model directly from the Hugging Face model hub
model = ECAPA_gender.from_pretrained("JaesungHuh/ecapa-gender")
model.eval()

# If you are using a GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Load the audio file and use the predict function to get the output directly
example_file = "data/00001.wav"
with torch.no_grad():
    output = model.predict(example_file, device=device)
    print("Gender : ", output)
```
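
The call above classifies a single file. As a minimal sketch for labelling a whole folder with the same `predict` API (the `data/*.wav` glob pattern is just a placeholder for your own files):

```
import glob

import torch

from model import ECAPA_gender

model = ECAPA_gender.from_pretrained("JaesungHuh/ecapa-gender")
model.eval()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Classify every WAV file in the folder (the path pattern is a placeholder)
with torch.no_grad():
    for path in sorted(glob.glob("data/*.wav")):
        print(path, ":", model.predict(path, device=device))
```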

## Pretrained weights

For those who need the pretrained weights, please download them [here](https://drive.google.com/file/d/1ojtaa6VyUhEM49F7uEyvsLSVN3T8bbPI/view?usp=sharing).
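
If you want to use the downloaded file instead of the Hub checkpoint, a minimal loading sketch might look like the following. The filename `gender_classifier.pt` is a placeholder, and the assumption that the file is a plain PyTorch state dict matching `ECAPA_gender` has not been verified here:

```
import torch

from model import ECAPA_gender

# Placeholder filename; adjust to wherever you saved the download
checkpoint_path = "gender_classifier.pt"

# Build the architecture from the Hub, then swap in the local weights
# (assumes the downloaded file is a plain state dict for ECAPA_gender)
model = ECAPA_gender.from_pretrained("JaesungHuh/ecapa-gender")
state_dict = torch.load(checkpoint_path, map_location="cpu")
model.load_state_dict(state_dict)
model.eval()
```

Loading from the Hub first is only used here to construct the architecture without guessing the constructor arguments.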

## Training details

A state-of-the-art speaker verification model already produces a good representation of the speaker's gender.

I used the pretrained ECAPA-TDNN from [TaoRuijie's](https://github.com/TaoRuijie/ECAPA-TDNN) repository, added one linear layer to make a two-class classifier, and finetuned the model with the VoxCeleb2 dev set.
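
The exact training code is not part of this repo; purely as an illustration of the idea described above, a two-class head on top of a speaker-embedding backbone could be sketched like this (the 192-dimensional embedding size and the `backbone` interface are assumptions, not the author's implementation):

```
import torch
import torch.nn as nn

class GenderClassifierSketch(nn.Module):
    """Illustrative sketch: speaker-embedding backbone + one linear layer."""

    def __init__(self, backbone: nn.Module, emb_dim: int = 192):
        super().__init__()
        self.backbone = backbone          # e.g. a pretrained ECAPA-TDNN encoder
        self.fc = nn.Linear(emb_dim, 2)   # two classes: male / female

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        emb = self.backbone(waveform)     # (batch, emb_dim) speaker embedding
        return self.fc(emb)               # (batch, 2) class logits
```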

The model achieved **98.7%** accuracy on the VoxCeleb1 identification test split.

## Reference

- [Original GitHub repository](https://github.com/JaesungHuh/voice-gender-classifier)
- I modified the model architecture from [TaoRuijie's](https://github.com/TaoRuijie/ECAPA-TDNN) repository.
- For more details about ECAPA-TDNN, check the [paper](https://arxiv.org/abs/2005.07143).