NaN in model parameters
#48
by cuong-dyania
I am not sure if this is a bug in transformers==4.46.1. When I load the model with this version, it raises the warning "Some weights of MllamaForCausalLM were not initialized from the model checkpoint at meta-llama/Llama-3.2-11B-Vision and are newly initialized".
After checking, some of the model parameters contain NaN. However, the model loads without any warning or issues with transformers==4.45.2.
```python
import torch
from transformers import (
    AutoConfig,
    AutoModelForCausalLM,
)

model_name_or_path = "meta-llama/Llama-3.2-11B-Vision"

def check_for_nan_parameters(model):
    for name, param in model.named_parameters():
        if torch.isnan(param).any():  # Check if any value in the parameter tensor is NaN
            print(f"NaN found in parameter: {name}")
            return True
    print("No NaNs found in model parameters.")
    return False

# model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
#                                              torch_dtype=torch.bfloat16, device_map="auto")
# Is it the conversion to bfloat16? Seems not, since the checkpoint on the HF Hub
# stores the parameters in bfloat16 by default.
model = AutoModelForCausalLM.from_pretrained(model_name_or_path,
                                             torch_dtype=torch.float16, device_map="cpu")
print(check_for_nan_parameters(model))
```
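For reference, here is a small variant of the check above (my own sketch, not part of the original report) that collects every parameter containing NaN instead of stopping at the first hit. This can help confirm whether the affected tensors match the weights named in the "newly initialized" warning.

```python
import torch

def list_nan_parameters(model):
    """Return the names of all parameters that contain at least one NaN."""
    nan_params = []
    for name, param in model.named_parameters():
        if torch.isnan(param).any():
            nan_params.append(name)
    return nan_params

# Example usage (assumes `model` was loaded as in the snippet above):
# for name in list_nan_parameters(model):
#     print(name)
```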
Using transformers==4.45.2 avoids this issue.
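As a temporary workaround, pinning the older release and re-running the same check should confirm the regression. A minimal sketch, using only the version numbers and checkpoint from the report above:

```python
# pip install "transformers==4.45.2"
import torch
import transformers
from transformers import AutoModelForCausalLM

print(transformers.__version__)  # expect 4.45.2

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-11B-Vision",
    torch_dtype=torch.float16,
    device_map="cpu",
)
# check_for_nan_parameters is the function from the snippet above;
# with 4.45.2 it should print "No NaNs found in model parameters."
print(check_for_nan_parameters(model))
```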