NaN in model parameters

#48
by cuong-dyania - opened

I am not sure if this is a bug in the transformers==4.46.1 library. When I load the model with this version, it raises the warning "Some weights of MllamaForCausalLM were not initialized from the model checkpoint at meta-llama/Llama-3.2-11B-Vision and are newly initialized".
After checking, some of the model parameters contain NaN. However, the model loads without any warning or issue with transformers==4.45.2.

```python
import torch
from transformers import (
    AutoConfig,
    AutoModelForCausalLM)

model_name_or_path = "meta-llama/Llama-3.2-11B-Vision"

def check_for_nan_parameters(model):
    for name, param in model.named_parameters():
        if torch.isnan(param).any():  # Check if any value in the parameter tensor is NaN
            print(f"NaN found in parameter: {name}")
            return True
    print("No NaNs found in model parameters.")
    return False


# model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
#         torch_dtype=torch.bfloat16, device_map='auto')
# Probably not caused by the conversion to bfloat16, since bfloat16 is already
# the default dtype of the parameters in the Hugging Face repo.
model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
        torch_dtype=torch.float16, device_map='cpu')
print(check_for_nan_parameters(model))
```

Using transformers==4.45.2 avoids this issue.
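
Until the behavior in 4.46.1 is understood, one workaround is to pin the known-good release and fail fast if a different version is installed. A minimal sketch (the version-check guard is my own addition, not part of the original report):

```python
import transformers

# Release reported to load the checkpoint without NaN parameters.
KNOWN_GOOD_VERSION = "4.45.2"

if transformers.__version__ != KNOWN_GOOD_VERSION:
    raise RuntimeError(
        f"transformers=={transformers.__version__} is installed; this loading path "
        f"was only verified with transformers=={KNOWN_GOOD_VERSION} "
        f"(e.g. `pip install transformers=={KNOWN_GOOD_VERSION}`)."
    )
```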
