KeyError with Latest Config Change
The latest config change (https://huggingface.co/nghuyong/ernie-2.0-base-en/commit/2c22755178879588695a30d68a4d9e861237db7b), which switches the architecture from BertModel to ErnieModel, breaks loading for all of the models. The from_pretrained path in released versions of Transformers does not recognize the ERNIE config type, so loading either the tokenizer or the model with the transformers Auto classes raises the following error:
Traceback (most recent call last):
  File "prepro_std_fin.py", line 297, in <module>
    main(args)
  File "prepro_std_fin.py", line 266, in main
    tokenizer = AutoTokenizer.from_pretrained("nghuyong/ernie-1.0-base-zh")
  File "/opt/conda/lib/python3.6/site-packages/transformers/models/auto/tokenization_auto.py", line 402, in from_pretrained
    config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/transformers/models/auto/configuration_auto.py", line 432, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
KeyError: 'ernie'
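To see the field that triggers the KeyError, you can inspect the hub config directly instead of going through AutoConfig. A minimal sketch, assuming the huggingface_hub package is installed and the config still matches the state described above:

# Sketch: download config.json from the hub and inspect the fields that
# older transformers releases cannot handle (no "ernie" entry in CONFIG_MAPPING).
import json
from huggingface_hub import hf_hub_download

config_path = hf_hub_download("nghuyong/ernie-2.0-base-en", "config.json")
with open(config_path) as f:
    config = json.load(f)

print(config["model_type"])     # "ernie" -- the key missing from older CONFIG_MAPPINGs
print(config["architectures"])  # ["ErnieModel"] after the linked commit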
You could install transformers from source with:
pip install git+https://github.com/huggingface/transformers.git@main
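After installing from main, a quick sanity check that the build actually registers ERNIE is to try importing the ERNIE classes. A minimal sketch, under the assumption that only builds with ERNIE support export ErnieModel:

# Sketch: confirm the installed transformers build ships ERNIE support.
import transformers

print(transformers.__version__)

try:
    from transformers import ErnieModel  # absent in releases without ERNIE support
    print("ERNIE support available")
except ImportError:
    print("this transformers build does not include ERNIE support")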
That doesn't resolve the issue. I still get this error when trying to load ERNIE using the standard AutoTokenizer/AutoModel calls, e.g.:
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("nghuyong/ernie-2.0-base-en")
model = AutoModel.from_pretrained("nghuyong/ernie-2.0-base-en")
For an example, see this Colab: https://colab.research.google.com/drive/1axnSZON454snEBFl8sYZQBNWsOwo-rtj?usp=sharing
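For anyone stuck on an older transformers release, one possible interim workaround (a sketch only, not verified against these checkpoints) is to bypass the Auto classes entirely, since the explicit BERT classes read the config directly and only warn about the mismatched model_type instead of consulting CONFIG_MAPPING:

# Sketch: load the checkpoint with explicit BERT classes on a transformers
# build that lacks ERNIE support. Assumption (untested here): the checkpoint
# weights remain BERT-compatible, as they were before the config change.
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("nghuyong/ernie-2.0-base-en")
model = BertModel.from_pretrained("nghuyong/ernie-2.0-base-en")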