Fix weights by putting the right value in `lm_head.weight`
There was probably a bug in the initial conversion script that created those models: the checkpoints contain different values for `lm_head.weight` and `model.decoder.embed_tokens.weight`, even though those weights are supposed to be tied.
This was not a problem until now, because the weights were tied after the load, so the (wrong) value of `lm_head.weight` was replaced by the value of `model.decoder.embed_tokens.weight`. This no longer works if we tie the weights before the load, however, as the value kept might be the one from `lm_head.weight`, depending on how the weights are tied.
As far as I can see, these models stop generating properly on Transformers main. This should fix the bug without any side effects.
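For anyone who wants to repair a local copy of such a checkpoint in the same spirit, here is a minimal sketch. The checkpoint file name is an assumption (adjust it to your local files); the two key names are the ones discussed above:

```python
import torch

# Hypothetical path to a local checkpoint file; adjust as needed.
path = "pytorch_model.bin"
state_dict = torch.load(path, map_location="cpu")

embed = state_dict["model.decoder.embed_tokens.weight"]
head = state_dict["lm_head.weight"]

if not torch.equal(embed, head):
    # In a tied model these tensors should be identical; keep the
    # embedding value, which is the one generation relied on before.
    state_dict["lm_head.weight"] = embed.clone()
    torch.save(state_dict, path)
```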
Thanks! I can't merge, but this fixes the issues for these models.
Thank you very much. I'm a bit confused though.
I want to convert a Marian MT model (from Tatoeba-Challenge) to PyTorch, so as to use it with HF locally.
In order to apply this fix, should I make changes to `MarianMTModel`, or to the conversion script as well?
If you use the latest release of transformers, the conversion should work out of the box! Does it not?
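If it helps, one quick sanity check after conversion is to load the model and confirm that the LM head and the input embeddings actually share storage. A minimal sketch, assuming the converted model lives in a local directory of your choosing:

```python
from transformers import MarianMTModel

# Hypothetical local path to the converted checkpoint.
model = MarianMTModel.from_pretrained("./my-converted-marian-model")

# With correctly tied weights, the LM head and the input embeddings
# point at the exact same underlying tensor.
tied = model.lm_head.weight.data_ptr() == model.get_input_embeddings().weight.data_ptr()
print(f"weights tied: {tied}")
```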