Fix AutoModel not loading model correctly due to config_class inconsistency
This fixes an issue where, when using AutoModel to instantiate the model, the config class paired with the model comes from the transformers library instead of from the model's own module. This mismatch causes instantiation to fail with the error below. See the linked GitHub issue for more details.
Traceback (most recent call last):
model = AutoModel.from_pretrained("zhihan1996/DNABERT-2-117M", trust_remote_code=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File ".../lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 560, in from_pretrained
cls.register(config.__class__, model_class, exist_ok=True)
File ".../lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 586, in register
raise ValueError(
ValueError: The model class you are passing has a `config_class` attribute that is not consistent with the config class you passed (model has <class 'transformers.models.bert.configuration_bert.BertConfig'> and you passed <class 'transformers_modules.zhihan1996.DNABERT-2-117M.d064dece8a8b41d9fb8729fbe3435278786931f1.configuration_bert.BertConfig'>. Fix one of those so they match!
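The check that raises this error compares the config class passed to `AutoModel.register` against the `config_class` attribute pinned on the model class; because the remote code defines its own `BertConfig`, the two are different classes even though they share a name. A minimal self-contained sketch of that consistency check (the class names and `register` function here are simplified stand-ins, not the actual transformers implementation):

```python
class LocalBertConfig:
    """Stand-in for transformers.models.bert.configuration_bert.BertConfig."""
    pass

class RemoteBertConfig:
    """Stand-in for the BertConfig defined in the model repo's remote code."""
    pass

class BertModelStub:
    # The model class pins the config class it expects.
    config_class = LocalBertConfig

def register(config_class, model_class):
    # Mirrors the spirit of the consistency check in AutoModel.register:
    # the model's pinned config_class must be the very same class object.
    if getattr(model_class, "config_class", None) is not config_class:
        raise ValueError(
            "The model class you are passing has a `config_class` attribute "
            "that is not consistent with the config class you passed."
        )

register(LocalBertConfig, BertModelStub)       # OK: classes match
try:
    register(RemoteBertConfig, BertModelStub)  # mismatch, raises ValueError
except ValueError:
    print("mismatch detected")
```

This is why loading the config explicitly (as in the workaround below) resolves the error: both sides of the comparison then refer to the same class.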
I'm encountering the same problem, but it did not happen a month ago when I first used the model.
Same problem for me
I'm having the same issue.
I'm having the same issue.
This issue can be fixed by following https://huggingface.co/zhihan1996/DNABERT-2-117M/commit/6617c7e3829423fddd80ba03c7c7dc4f8aab4d19
I'm having the same issue; what is the solution?
@mmokoatle
This worked for me:
tokenizer = AutoTokenizer.from_pretrained("zhihan1996/DNABERT-2-117M", trust_remote_code=True)
config = BertConfig.from_pretrained("zhihan1996/DNABERT-2-117M")
model = AutoModel.from_pretrained("zhihan1996/DNABERT-2-117M", trust_remote_code=True, config=config)
@mmokoatle Maybe you should show more code :)
@GCabas, apologies, see the full code and error below
import torch
from transformers import AutoTokenizer, AutoModel, BertConfig
tokenizer = AutoTokenizer.from_pretrained("zhihan1996/DNABERT-2-117M", trust_remote_code=True)
config = BertConfig.from_pretrained("zhihan1996/DNABERT-2-117M")
model = AutoModel.from_pretrained("zhihan1996/DNABERT-2-117M", trust_remote_code=True, config=config)
dna = "ACGTAGCATCGGATCTATCTATCGACACTTGGTTATCGATCTACGAGCATCTCGTTAGC"
inputs = tokenizer(dna, return_tensors='pt')["input_ids"]
hidden_states = model(inputs)[0] # [1, sequence_length, 768] ###error from this line of code
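Once `hidden_states` is obtained, DNABERT-2 embeddings are typically pooled over the sequence dimension to get a fixed-size vector per sequence. A minimal sketch using a dummy tensor in place of the real model output (the shape `[1, 12, 768]` is an assumed example, matching the hidden size of 768 mentioned above):

```python
import torch

# Dummy stand-in for model(inputs)[0]: [batch, sequence_length, hidden_size]
hidden_states = torch.randn(1, 12, 768)

# Mean pooling over the sequence dimension -> one 768-dim embedding
embedding_mean = hidden_states[0].mean(dim=0)  # shape: [768]

# Max pooling is a common alternative
embedding_max = hidden_states[0].max(dim=0).values  # shape: [768]

print(embedding_mean.shape, embedding_max.shape)
```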
error: "AssertionError: "
@mmokoatle
I made an example notebook using this model; maybe it can help resolve your doubts :)
https://www.kaggle.com/code/gabrielcabas/dnabert-for-classification
@mmokoatle transformers 4.42.3