How to get the confidence of the transcribed result?
#2 by matthew36 - opened
How can I get the confidence score of the result?
Hey matthew, you can get the probabilities of each token like this:
Loading model
import librosa
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor
y, sr = librosa.load('audio.mp3', sr=16000)
MODEL_NAME = "alvanlii/whisper-small-cantonese"
processor = WhisperProcessor.from_pretrained(MODEL_NAME)
model = WhisperForConditionalGeneration.from_pretrained(MODEL_NAME).cuda()
model.config.forced_decoder_ids = None
model.config.suppress_tokens = []
model.config.use_cache = False
Generate output, note the output_scores flag
processed_in = processor(y, sampling_rate=sr, return_tensors="pt")
gout = model.generate(
    input_features=processed_in.input_features.cuda(),
    output_scores=True,
    return_dict_in_generate=True,
)
Compute softmax from the scores
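# gout.scores holds one (batch, vocab) tensor per generated token;
# softmax over the vocab axis (dim=-1), then keep the top probability as a float
proba_scores = [torch.nn.functional.softmax(score, dim=-1).max().item() for score in gout.scores]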
# the ids are now in .sequences
transcription = processor.batch_decode(gout.sequences, skip_special_tokens=True)[0]
print(transcription)
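If you want a single number for the whole transcription rather than one per token, one common option is the geometric mean of the per-token probabilities (the exponential of the average log-probability). This is only a rough sketch of that convention, not something built into the model: `sequence_confidence` is a hypothetical helper name, and the probabilities below are made-up values standing in for `proba_scores`.

```python
import math


def sequence_confidence(token_probs):
    """Hypothetical helper: geometric mean of per-token probabilities."""
    # float() accepts plain Python floats as well as 0-dim tensors
    probs = [float(p) for p in token_probs]
    if not probs:
        raise ValueError("token_probs is empty")
    # A zero probability would make log() fail; treat it as zero confidence
    if min(probs) <= 0.0:
        return 0.0
    mean_log_prob = sum(math.log(p) for p in probs) / len(probs)
    return math.exp(mean_log_prob)


# Made-up values for illustration; in practice pass proba_scores here
example_probs = [0.9, 0.8, 0.95]
confidence = sequence_confidence(example_probs)
weakest = min(example_probs)
print(f"confidence={confidence:.3f}, weakest token={weakest:.2f}")
```

The geometric mean is used instead of a plain product so that longer transcriptions are not penalised just for having more tokens. Looking at the weakest token as well can help spot a single uncertain word. Keep in mind that the scores list likely also covers special tokens such as the end-of-text token, so you may want to filter those out first.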
Thank you so much, alvanlii! I will try it later. Much appreciated, boss.
Could you try converting it to ggml so it can run in whisper.cpp?
You can run convert-h5-to-ggml.py yourself.
Sorry, I didn't see this earlier. Here it is: https://huggingface.co/alvanlii/whisper-small-cantonese/blob/main/ggml-model.bin
alvanlii changed discussion status to closed