What are the best practices for handling long audio of 2-3 hours?

#3 by Mbellish

First, thanks for the great work. You are awesome!

I was experimenting with the model on very long files of decent quality (far from studio recordings, though).

The model is running locally on Docker Desktop with GPU acceleration and the default settings the model README suggests.
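For reference, this is roughly the call I'm making. It's a minimal sketch assuming the weights are loaded through faster-whisper; the checkpoint name, compute type, and file name are just what I happened to pick, not anything from the README:

```python
from faster_whisper import WhisperModel

# checkpoint name and compute_type are placeholders for whatever the README suggests
model = WhisperModel("large-v3", device="cuda", compute_type="float16")

segments, info = model.transcribe("long_recording.wav", task="translate")
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```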

The translation is OK-ish, but I wasn't doing anything special. Any ideas on how to make the most of it?
Thanks

Some things I've already tried:
I experimented with the faster-whisper GitHub project and tried some tweaks with version 1.1.0, without significant improvement.
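Concretely, this is the kind of thing I was tweaking for multi-hour files. The specific values (beam size, VAD silence threshold, temperature schedule) are my own guesses rather than anything recommended by the docs:

```python
from faster_whisper import WhisperModel

model = WhisperModel("large-v3", device="cuda", compute_type="float16")

segments, info = model.transcribe(
    "long_recording.wav",
    task="translate",
    vad_filter=True,                                  # skip long silences so the 30 s windows stay dense
    vad_parameters={"min_silence_duration_ms": 500},
    beam_size=5,
    condition_on_previous_text=False,                 # limit error propagation across a 2-3 hour file
    temperature=[0.0, 0.2, 0.4],                      # only fall back to higher temperatures on failures
)
```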

Setting hotwords chokes the model on an internal limit of 448 tokens.
initial_prompt also ran without issues, but the results still need improvement.
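One workaround I'm considering is trimming the hotwords/prompt before passing it, so it stays well under the 448-token decoder context. Rough sketch only; the tokenizer checkpoint and the 200-token budget are assumptions on my part:

```python
from faster_whisper import WhisperModel
from transformers import WhisperTokenizer

model = WhisperModel("large-v3", device="cuda", compute_type="float16")
tokenizer = WhisperTokenizer.from_pretrained("openai/whisper-large-v3")

def trim_prompt(text: str, max_tokens: int = 200) -> str:
    """Drop words from the end until the prompt fits within max_tokens."""
    words = text.split()
    while words and len(tokenizer.encode(" ".join(words), add_special_tokens=False)) > max_tokens:
        words.pop()
    return " ".join(words)

hotwords = trim_prompt("long list of product names, speaker names, jargon ...")
segments, info = model.transcribe("long_recording.wav", task="translate", hotwords=hotwords)
```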
