Inference tensors cannot be saved for backward. To work around you can make a clone to get a normal tensor and use it in autograd.
lora_r = 16
lora_alpha = 64
lora_dropout = 0.1
lora_target_modules = [
"q_proj",
"up_proj",
"o_proj",
"k_proj",
"down_proj",
"gate_proj",
"v_proj",
]
peft_config = LoraConfig(
r=lora_r,
lora_alpha=lora_alpha,
lora_dropout=lora_dropout,
target_modules=lora_target_modules,
bias="none",
task_type="CAUSAL_LM",
)
training_arguments = TrainingArguments(
per_device_train_batch_size=4,
gradient_accumulation_steps=4,
optim="paged_adamw_32bit",
logging_steps=1,
learning_rate=1e-4,
fp16=True,
max_grad_norm=0.3,
num_train_epochs=2,
evaluation_strategy="steps",
eval_steps=0.2,
warmup_ratio=0.05,
save_strategy="epoch",
group_by_length=True,
output_dir=OUTPUT_DIR,
report_to="tensorboard",
save_safetensors=True,
lr_scheduler_type="cosine",
seed=42,
)
trainer = SFTTrainer(
model=model,
train_dataset=data,
peft_config=peft_config,
dataset_text_field="text",
max_seq_length=4096,
tokenizer=tokenizer,
args=training_arguments,
)
trainer.train()
/usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py:464: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. In version 2.4 we will raise an exception if use_reentrant is not passed. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants.
warnings.warn(
RuntimeError Traceback (most recent call last)
in <cell line: 1>()
----> 1 trainer.train()
21 frames
~/.cache/huggingface/modules/transformers_modules/tiiuae/falcon-7b/898df1396f35e447d5fe44e0a3ccaaaa69f30d36/modeling_falcon.py in forward(self, query, key, past_key_values_length)
106 batch, seq_len, head_dim = query.shape
107 cos, sin = self.cos_sin(seq_len, past_key_values_length, query.device, query.dtype)
--> 108 return (query * cos) + (rotate_half(query) * sin), (key * cos) + (rotate_half(key) * sin)
109
110
RuntimeError: Inference tensors cannot be saved for backward. To work around you can make a clone to get a normal tensor and use it in autograd.
How do I resolve this issue ? Any help in debugging this is appreciated, thanks!
did u find the solution ?
Yes, I resolved the bug.