Speaker embeddings take a long time to generate
#2, opened by ugotsoul
Hello,
I'm generating speaker embeddings using Mozilla Common Voice's dev set, which has about 16k samples for the two locales I'm looking at (EN & CA). It takes about 5-6 hours on CPU to generate the embeddings, compared to about an hour for ECAPA-TDNN. Is this normal for ResNet?
CPU: Intel(R) Xeon(R) Platinum 8253 CPU @ 2.20GHz with 64 cores
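
For reference, here is a rough sketch of the per-sample cost those timings imply. It assumes the "about 16k samples" figure is the total across both locales, which may not be exact; the helper name `seconds_per_sample` is just for illustration.

```python
def seconds_per_sample(total_hours: float, num_samples: int) -> float:
    """Average wall-clock seconds spent per sample."""
    return total_hours * 3600.0 / num_samples


# Assumption: "about 16k samples" is the combined EN + CA total.
NUM_SAMPLES = 16_000

resnet_low = seconds_per_sample(5, NUM_SAMPLES)   # ResNet, 5-hour run
resnet_high = seconds_per_sample(6, NUM_SAMPLES)  # ResNet, 6-hour run
ecapa = seconds_per_sample(1, NUM_SAMPLES)        # ECAPA-TDNN, 1-hour run

print(f"ResNet:     {resnet_low:.3f}-{resnet_high:.3f} s/sample")
print(f"ECAPA-TDNN: {ecapa:.3f} s/sample")
print(f"Slowdown:   {resnet_low / ecapa:.1f}x-{resnet_high / ecapa:.1f}x")
```

Under that assumption, ResNet comes out at roughly 1.1 to 1.4 seconds per sample versus about 0.2 seconds for ECAPA-TDNN, so around 5-6x slower.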