---
tags:
- pyannote
- audio
- voice-activity-detection
datasets:
- dihard
license: mit
inference: false
---
## Example pyannote-audio Voice Activity Detection model
### `pyannote.audio.models.segmentation.PyanNet`
♻️ Imported from https://github.com/pyannote/pyannote-audio-hub
This model was trained by @hbredin.
### Demo: How to use in pyannote-audio
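```python
from pyannote.audio.core.inference import Inference

# Load the pretrained model from the Hugging Face Hub and run it on the GPU
model = Inference('julien-c/voice-activity-detection', device='cuda')

# Apply the model to an audio file
model({"audio": "TheBigBangTheory.wav"})
```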
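The demo above returns the raw model output. As a minimal sketch, assuming that output can be read as one speech score per frame (this card does not describe the output format), the scores could be thresholded into speech segments as shown here. The names `scores_to_segments`, `frame_step`, and `threshold` are hypothetical and are not part of pyannote-audio; the example uses plain NumPy only.

```python
import numpy as np


def scores_to_segments(scores, frame_step, threshold=0.5):
    """Turn per-frame speech scores into (start, end) segments in seconds.

    Hypothetical helper: `frame_step` (seconds between frames) and
    `threshold` are assumptions, not values taken from the model.
    """
    active = np.asarray(scores) > threshold
    # Pad with False on both sides so every speech run has a start and an end edge
    padded = np.concatenate(([False], active, [False]))
    # Indices where the speech/non-speech state flips
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = changes[0::2], changes[1::2]
    return [(float(s * frame_step), float(e * frame_step))
            for s, e in zip(starts, ends)]


# Toy scores, not real model output
segments = scores_to_segments([0.1, 0.9, 0.8, 0.2, 0.7], frame_step=0.5)
print(segments)
```

A fixed threshold is the simplest choice; smoothing or separate onset and offset thresholds may work better on noisy audio.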
### Citing pyannote-audio
```bibtex
@inproceedings{Bredin2020,
Title = {{pyannote.audio: neural building blocks for speaker diarization}},
Author = {{Bredin}, Herv{\'e} and {Yin}, Ruiqing and {Coria}, Juan Manuel and {Gelly}, Gregory and {Korshunov}, Pavel and {Lavechin}, Marvin and {Fustes}, Diego and {Titeux}, Hadrien and {Bouaziz}, Wassim and {Gill}, Marie-Philippe},
Booktitle = {ICASSP 2020, IEEE International Conference on Acoustics, Speech, and Signal Processing},
Address = {Barcelona, Spain},
Month = {May},
Year = {2020},
}
```
```bibtex
@inproceedings{Lavechin2020,
author = {Marvin Lavechin and Marie-Philippe Gill and Ruben Bousbib and Herv\'{e} Bredin and Leibny Paola Garcia-Perera},
title = {{End-to-end Domain-Adversarial Voice Activity Detection}},
year = {2020},
url = {https://arxiv.org/abs/1910.10655},
}
```