Merged model loading issues?
So when I load the merged model in Colab, I am facing this issue:
/usr/local/lib/python3.10/dist-packages/transformers/quantizers/quantizer_bnb_4bit.py in create_quantized_param(self, model, param_value, param_name, target_device, state_dict, unexpected_keys)
    189         param_name + ".quant_state.bitsandbytes__nf4" not in state_dict
    190     ):
--> 191         raise ValueError(
    192             f"Supplied state dict for {param_name} does not contain bitsandbytes__* and possibly other quantized_stats components."
    193         )

ValueError: Supplied state dict for model.layers.15.self_attn.k_proj.weight does not contain bitsandbytes__* and possibly other quantized_stats components.
Can anyone help out with this?
My target modules for LoRA were:
[
    "q_proj",
    "k_proj",
    "v_proj",
    "o_proj",
    "gate_proj",
    "up_proj",
    "down_proj",
    "lm_head",
]
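For context, a LoraConfig targeting those modules would look roughly like this (the r, lora_alpha, and lora_dropout values here are just placeholders, not necessarily what I actually used):

from peft import LoraConfig

# Illustrative LoRA config targeting the modules listed above;
# r, lora_alpha, and lora_dropout are placeholder values.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "q_proj",
        "k_proj",
        "v_proj",
        "o_proj",
        "gate_proj",
        "up_proj",
        "down_proj",
        "lm_head",
    ],
)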
Can you share how you merged it?
# Assuming you have already trained your model and have the trainer object
adapter_model = trainer.model
merged_model = adapter_model.merge_and_unload()

# Retrieve the trained tokenizer
trained_tokenizer = trainer.tokenizer

# Define the directory where you want to save the model and tokenizer
save_directory = "/content/merge"

# Save the merged model
merged_model.save_pretrained(save_directory)

# Save the tokenizer
trained_tokenizer.save_pretrained(save_directory)
This is how I merged it.
Also, at times when I try to run inference on the PEFT adapter, I am facing issues like:
ValueError: Unrecognized model in /content/drive/MyDrive/final. Should have a model_type
key in its config.json, or contain one of the following strings in its name: albert, align, altclip, audio-spectrogram-transformer, autoformer, bark, bart, beit, bert, bert-generation, big_bird, bigbird_pegasus, biogpt, bit, blenderbot, blenderbot-small, blip, blip-2, bloom, bridgetower, bros, camembert, canine, chameleon, chinese_clip, chinese_clip_vision_model, clap, clip, clip_vision_model, clipseg, clvp, code_llama, codegen, cohere, conditional_detr, convbert, convnext, convnextv2, cpmant, ctrl, cvt, data2vec-audio, data2vec-text, data2vec-vision, dbrx, deberta, deberta-v2, decision_transformer, deformable_detr, deit, depth_anything, deta, detr, dinat, dinov2, distilbert, donut-swin, dpr, dpt, efficientformer, efficientnet, electra, encodec, encoder-decoder, ernie, ernie_m, esm, falcon, fastspeech2_conformer, flaubert, flava, fnet, focalnet, fsmt, funnel, fuyu, gemma, gemma2, git, glpn, gpt-sw3, gpt2, gpt_bigcode, gpt_neo, gpt_neox, gpt_neox_japanese, gptj, gptsan-japanese, graphormer, grounding-dino, groupvit, hiera, hubert, ibert, idefics, idefics2, imagegpt, informer, instructblip, instructblipvideo, jamba, jetmoe, jukebox, kosmos-2, layoutlm, layoutlmv2, layoutlmv3, led, levit, lilt, llama, llava, llava-next-video, llava_next, longformer, longt5, luke, lxmert, m2m_100, mamba, mamba2, marian, markuplm, mask2former, maskformer, maskformer-swin, mbart, mctct, mega, me…
This seems like a very new error, as even yesterday I was able to run inference. My transformers version is 4.42.0.
Can you share what is in your config.json file?
Can you confirm that the filesize of the weights makes sense? If you saved it in 4-bit precision, it should be about 6GB total. If it is 24GB, then it was saved in fp16 or bf16.
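If you want to check that quickly, a rough snippet like this would add up the shard sizes on disk (the path is just a placeholder for wherever you saved the merged model):

from pathlib import Path

# Placeholder path: point this at the directory the merged model was saved to.
save_directory = Path("/content/merge")

# Sum the sizes of all safetensors shards in that directory.
total_bytes = sum(f.stat().st_size for f in save_directory.glob("*.safetensors"))
print(f"Total weight size: {total_bytes / 1e9:.1f} GB")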
Hey, the model safetensors are around 11GB combined in total. Here is my config.json:
{
  "_name_or_path": "mistralai/Mistral-Nemo-Instruct-2407",
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 5120,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "llm_int8_enable_fp32_cpu_offload": false,
  "max_position_embeddings": 1024000,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 40,
  "num_key_value_heads": 8,
  "pretraining_tp": 1,
  "quantization_config": {
    "_load_in_4bit": true,
    "_load_in_8bit": false,
    "bnb_4bit_compute_dtype": "bfloat16",
    "bnb_4bit_quant_storage": "uint8",
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": false,
    "llm_int8_enable_fp32_cpu_offload": false,
    "llm_int8_has_fp16_weight": false,
    "llm_int8_skip_modules": null,
    "llm_int8_threshold": 6.0,
    "load_in_4bit": true,
    "load_in_8bit": false,
    "quant_method": "bitsandbytes"
  },
  "rms_norm_eps": 1e-05,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "float32",
  "transformers_version": "4.45.0.dev0",
  "use_cache": false,
  "vocab_size": 131072
}
Thanks for the reply.
- It isn't recommended to merge at 4-bit because rounding errors can degrade the results.
- If you have the adapters saved, I would try first loading the base model in 4-bit precision, then adding the trained adapters, and then merging and saving the model.
Example:
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM
import torch

# Load the adapter config to find the base model it was trained on
config = PeftConfig.from_pretrained("smangrul/tinyllama_lora_norobots")

# Load the base model in 4-bit precision
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, load_in_4bit=True, device_map="auto").eval()

# Attach the trained LoRA adapters, then merge them into the base weights and save
model = PeftModel.from_pretrained(model, "smangrul/tinyllama_lora_norobots")
merged_model = model.merge_and_unload()
merged_model.save_pretrained("merged")
Also make sure to upgrade bitsandbytes and transformers to the newest versions.
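As a quick sanity check afterwards, reloading the merged folder should look roughly like this (just a sketch, not something I've run end to end; the tokenizer is pulled from the base model since the example above doesn't save one):

from peft import PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

# "merged" is the directory saved in the example above; the tokenizer comes
# from the base model the adapter was trained on.
peft_config = PeftConfig.from_pretrained("smangrul/tinyllama_lora_norobots")
tokenizer = AutoTokenizer.from_pretrained(peft_config.base_model_name_or_path)
model = AutoModelForCausalLM.from_pretrained("merged", torch_dtype="auto", device_map="auto").eval()

# Quick generation test to confirm the merged weights load and run
inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0], skip_special_tokens=True))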
Hey, thanks for the reply. As you said, I loaded my base model in 4-bit and then merged it with the adapters.
Once I did that, the following error was thrown when loading the model:
ValueError: Supplied state dict for model.layers.15.self_attn.k_proj.weight does not contain bitsandbytes__* and possibly other quantized_stats components.
It makes me feel that this particular version doesn't allow me to add the LoRA adapters to the base model because there is a change in parameters. Let me know what you think. Thanks.
You could open an issue in transformers on GitHub.