Speaker embeddings take a long time to generate

#2
by ugotsoul - opened

Hello,

I'm generating speaker embeddings using Mozilla Common Voice's dev set, which has about 16k samples for the two locales I'm looking at (EN & CA). It takes about 5-6 hours on CPU to generate the embeddings, compared to about an hour for ECAPA-TDNN. Is this normal for ResNet?

CPU: Intel(R) Xeon(R) Platinum 8253 CPU @ 2.20GHz with 64 cores
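For a rough sense of scale, the totals above can be turned into per-sample times. This is only a back-of-the-envelope sketch: it assumes "about 16k" means 16,000 samples in total across both locales and that the quoted hours are wall-clock totals. The helper name `seconds_per_sample` is made up for illustration.

```python
def seconds_per_sample(total_hours: float, num_samples: int) -> float:
    """Convert a total wall-clock time in hours to average seconds per sample."""
    return total_hours * 3600.0 / num_samples


# Assumption: "about 16k samples" is treated as exactly 16,000 in total.
NUM_SAMPLES = 16_000

# ResNet run: reported as roughly 5-6 hours on CPU.
resnet_low = seconds_per_sample(5, NUM_SAMPLES)
resnet_high = seconds_per_sample(6, NUM_SAMPLES)

# ECAPA-TDNN run: reported as roughly 1 hour on the same data.
ecapa = seconds_per_sample(1, NUM_SAMPLES)

print(f"ResNet:     {resnet_low:.3f}-{resnet_high:.3f} s/sample")
print(f"ECAPA-TDNN: {ecapa:.3f} s/sample")
print(f"Slowdown:   {resnet_low / ecapa:.1f}x-{resnet_high / ecapa:.1f}x")
```

Under those assumptions, the ResNet model averages a bit over one second per sample versus roughly a quarter of a second for ECAPA-TDNN, so the gap is about 5-6x.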
