evaluate-metric/meteor · No max over multiple references?

Jun 28, 2022

Hi,

Isn't there a problem with the "multiple references" case? I think METEOR should compute a score against each reference of each hypothesis, keep the maximum score per hypothesis, and then return the mean over all predictions.

In NLTK, the meteor_score() function iterates over the multiple references and takes the max, but it does not compute a mean over hypotheses (this is left to the user).
In Hugging Face's wrapper, single_meteor_score() is called directly for each hypothesis and a mean is computed, but there is no loop over multiple references and no max.
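
To make the expected aggregation concrete, here is a minimal sketch of "max over references, then mean over predictions". The function names (`toy_overlap_score`, `corpus_score_max_over_refs`) are hypothetical, and the scorer is a simple token-overlap stand-in, not METEOR; in practice one would presumably pass a real per-pair METEOR scorer as `score_fn`.

```python
from statistics import mean
from typing import Callable, Sequence


def toy_overlap_score(reference: str, hypothesis: str) -> float:
    """Stand-in scorer (NOT METEOR): Jaccard overlap of whitespace tokens."""
    ref_tokens = set(reference.split())
    hyp_tokens = set(hypothesis.split())
    union = ref_tokens | hyp_tokens
    if not union:
        return 0.0
    return len(ref_tokens & hyp_tokens) / len(union)


def corpus_score_max_over_refs(
    predictions: Sequence[str],
    references: Sequence[Sequence[str]],
    score_fn: Callable[[str, str], float] = toy_overlap_score,
) -> float:
    """Score each prediction against all of its references, keep the best
    score per prediction, then average over predictions."""
    if not predictions or len(predictions) != len(references):
        raise ValueError("need one non-empty list of references per prediction")
    per_prediction = []
    for prediction, refs in zip(predictions, references):
        if not refs:
            raise ValueError("each prediction needs at least one reference")
        # Max over the references of this prediction.
        per_prediction.append(max(score_fn(ref, prediction) for ref in refs))
    # Mean over all predictions.
    return mean(per_prediction)


predictions = ["the cat sat", "a dog"]
references = [["the cat sat", "cat"], ["the dog", "a dog barked"]]
corpus_score = corpus_score_max_over_refs(predictions, references)
```

With a single reference per prediction, the max is a no-op and this reduces to a plain mean of per-sample scores, which is the behaviour I see in the wrapper.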

But maybe I'm wrong :-).

Thanks,
Gwénolé.

lvwerra

Evaluate Metric org Jun 29, 2022

Currently, multiple references are not supported, but @sasha is working on adding them: https://github.com/huggingface/evaluate/pull/164

The Hugging Face wrapper only works with a single reference per sample; it then takes the mean over all samples.

glecorve

Jun 29, 2022

Thanks for the confirmation. I patched it with another wrapper :-).

I think it would be helpful to mention this limitation in the documentation. It currently says: "references: a list of references for each prediction. Each reference should be a string with tokens separated by spaces.", which suggests that several references per prediction are accepted.

Thanks again.

glecorve

Jul 15, 2022

This has been patched now. Thank you.

glecorve changed discussion status to closed Jul 15, 2022