Very high loss compared to keras

#46
by tanimazsin130 - opened

When I use the Keras H5 weights, the loss is stable and satisfactory. However, when I switch to the HuggingFace model weights with the HuggingFace trainer, the loss starts at 50 and does not drop below 5. I believe the model weights need to be converted properly, as the current weights seem ineffective.

tanimazsin130 changed discussion title from Very high loss compared to keras 3 to Very high loss compared to keras

I ran into the same problem. I had always assumed the problem was on my end...

Google org

@osanseviero Any idea whether this is related to the recent fixes for model inconsistencies?

Google org

The RoPE fix should already have solved some of them; we are looking into the loss.

I'm also experiencing the same.

Can you try the latest transformers? We recently fixed some training-stability issues, and the fixes have been pushed to a patch release.

Google org
edited Jul 3

Hi @tanimazsin130, could you try the latest transformers release (v4.42.3) and let us know if it fixes your problem?
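In case it helps, here is a minimal sketch for checking whether the installed transformers is at least v4.42.3 before re-running training. The helper names `parse_release` and `is_at_least` are made up for illustration and are not part of transformers. The comparison only looks at the leading numeric components of the version string, so treat it as a rough check.

```python
from importlib.metadata import PackageNotFoundError, version

MIN_VERSION = "4.42.3"  # patch release mentioned in this thread


def parse_release(v: str) -> tuple:
    """Turn '4.42.3' or '4.43.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in v.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as 'dev0' or '3rc1'
        parts.append(int(piece))
    return tuple(parts)


def is_at_least(installed: str, minimum: str = MIN_VERSION) -> bool:
    """Hypothetical helper: True if `installed` is not older than `minimum`."""
    return parse_release(installed) >= parse_release(minimum)


if __name__ == "__main__":
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        print("transformers is not installed")
    else:
        status = "OK" if is_at_least(installed) else "older than " + MIN_VERSION
        print(f"transformers {installed}: {status}")
```

If the check reports an older version, `pip install -U transformers` should pull in the latest release.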
