KingNish/Realtime-whisper-large-v3-turbo · Having trouble deploying locally

The realtime transcription works great in Spaces, but I am having trouble getting it to run locally on my Ubuntu 22.04 machine.

Here are my issues:

In the microphone block, it tells me that the arguments time_limit=45 and stream_every=2 don't exist. The only workaround I found is replacing them with every=2.
The transcription accuracy is extremely low; it is nothing like the accuracy I get in the Spaces app.
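
One possible cause of the first issue is a version mismatch between my local Gradio install and the one the Space uses. Below is a minimal sketch for checking which keyword arguments a callable accepts. The helper name missing_kwargs is hypothetical, and I am assuming that gr.Audio.stream can be introspected with inspect.signature, which may not hold for every Gradio release.

```python
import inspect


def missing_kwargs(func, names):
    """Return the entries of `names` that `func` does not accept as parameters."""
    params = inspect.signature(func).parameters
    # A **kwargs catch-all accepts any keyword, so nothing is reported missing.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return [name for name in names if name not in params]


if __name__ == "__main__":
    try:
        import gradio as gr
    except ImportError:
        print("gradio is not installed in this environment")
    else:
        print("gradio version:", gr.__version__)
        # Assumption: the stream listener is reachable as a class attribute.
        listener = getattr(gr.Audio, "stream", None)
        if callable(listener):
            print("not accepted:", missing_kwargs(listener, ["time_limit", "stream_every"]))
```

If both names are reported as not accepted, comparing the printed version with the one pinned by the Space would be the next thing to check.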
Here are the errors/warnings that I get while running it locally:

/opt/conda/envs/transcriptor/lib/python3.9/site-packages/transformers/models/whisper/generation_whisper.py:496: FutureWarning: The input name inputs is deprecated. Please make sure to use input_features instead.
Due to a bug fix in https://github.com/huggingface/transformers/pull/28687 transcription using a multilingual Whisper will default to language detection followed by transcription instead of translation to English.This might be a breaking change for your use case. If you want to instead always translate your audio to English, make sure to pass language='en'.
Passing a tuple of past_key_values is deprecated and will be removed in Transformers v4.43.0. You should pass an instance of EncoderDecoderCache instead, e.g. past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values).
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset.

Here is a list of the packages installed in my Conda environment, in case that is the issue: