forced_decoder_ids not applied properly during generation
#10 opened by minseong-ringle
input_features = processor(input, return_tensors="pt").input_features
forced_decoder_ids = processor.get_decoder_prompt_ids(language="en", task="transcribe", no_timestamps=False)
predicted_ids = model.generate(input_features, forced_decoder_ids=forced_decoder_ids)
transcription = processor.batch_decode(predicted_ids)
# This results in
# tensor([[50258, 50259, 50359, 50363, ...
# i.e. "<|startoftranscript|><|en|><|transcribe|><|notimestamps|>..."
# for the transcription
model.config.forced_decoder_ids = processor.get_decoder_prompt_ids(language="en", task="transcribe", no_timestamps=False)
# Setting this on the config also causes the same result.
These are the snippets I've tried so far. Even with `no_timestamps=False`, I cannot get rid of the `<|notimestamps|>` token in the decoder output. Any ideas?
Thanks in advance for your help.
Hey! This might be related to the fact that the `<|notimestamps|>` token is not in the list of suppress tokens, which means the model is free to predict it (and does).
We should probably add it to the `suppress_tokens` list.
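In the meantime, a possible workaround is to suppress `<|notimestamps|>` explicitly at generation time. The sketch below only shows the list manipulation; the helper name `with_token_suppressed` is my own, not a transformers API. The token id 50363 is taken from the tensor printed in the issue (multilingual Whisper vocabulary) — you can also look it up with `processor.tokenizer.convert_tokens_to_ids("<|notimestamps|>")`.

```python
def with_token_suppressed(suppress_tokens, token_id):
    """Return a copy of suppress_tokens with token_id appended at most once.

    Hypothetical helper for extending Whisper's suppress list;
    suppress_tokens may be None, as model.config.suppress_tokens can be.
    """
    suppress = list(suppress_tokens or [])
    if token_id not in suppress:
        suppress.append(token_id)
    return suppress

# <|notimestamps|> is id 50363, matching the tensor printed above.
NOTIMESTAMPS_ID = 50363
```

Then pass the extended list when generating, e.g. `model.generate(input_features, forced_decoder_ids=forced_decoder_ids, suppress_tokens=with_token_suppressed(model.config.suppress_tokens, NOTIMESTAMPS_ID))`. `generate` forwards `suppress_tokens` to a `SuppressTokensLogitsProcessor`, so the model can no longer predict `<|notimestamps|>` and should emit timestamp tokens instead.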