mac support

#7
by kunci115 - opened

Forked from mini-omni; it works end to end on Mac M1, M2, and M3 chips, but Whisper seems to have a problem while transcribing the voice. If anyone can take a look at what's wrong:
https://github.com/kunci115/mini-omni-mac-support

Could you describe "seems a problem" more specifically? What are you observing, and what are you expecting to happen?

The transcription doesn't match the spoken input; it's way off. By "seems a problem" I mean there is something wrong with the way I'm using Whisper in that code.

https://github.com/kunci115/mini-omni-mac-support/blob/main/inference.py#L359
The model is trained with Whisper small, but you load tiny. That might be the problem?

I changed it back to small, but still get the same result.

https://github.com/openai/whisper/blob/279133e3107392276dc509148da1f41bfb532c7e/whisper/model.py#L256
Please refer to the Whisper model: we use embed_audio to encode the audio features (only the encoder part of Whisper is used), and I think that's not the same as your output.logits, which is the output of the decoder.
