2024-11-16 23:21:31,464 - main - INFO - Loading tokenizer...
2024-11-16 23:21:32,228 - main - WARNING - Could not load custom vocabulary: property 'vocab' of 'GPT2TokenizerFast' object has no setter
2024-11-16 23:21:32,229 - main - INFO - Loading model...
2024-11-16 23:21:32,229 - main - ERROR - Model file not found at ./models\poeticagpt-quantized-new.pth
2024-11-16 23:21:32,231 - main - ERROR - Failed to initialize model manager
2024-11-16 23:30:46,037 - main - INFO - Loading tokenizer...
2024-11-16 23:30:46,798 - main - WARNING - Could not load custom vocabulary: property 'vocab' of 'GPT2TokenizerFast' object has no setter
2024-11-16 23:30:46,799 - main - INFO - Loading model...
2024-11-16 23:30:46,799 - main - ERROR - Error initializing model: Incorrect path_or_model_id: './models/poeticagpt.pth'. Please provide either the path to a local folder or the repo_id of a model on the Hub.
2024-11-16 23:30:46,800 - main - ERROR - Detailed traceback: Traceback (most recent call last):
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\transformers\utils\hub.py", line 402, in cached_file
    resolved_file = hf_hub_download(
                    ^^^^^^^^^^^^^^^^
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\huggingface_hub\utils\_validators.py", line 106, in _inner_fn
    validate_repo_id(arg_value)
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\huggingface_hub\utils\_validators.py", line 154, in validate_repo_id
    raise HFValidationError(
huggingface_hub.errors.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': './models/poeticagpt.pth'. Use `repo_type` argument if needed.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "E:\Self Work\My Projects\Poetica HuggingFace Server\poetica\main.py", line 88, in initialize
    self.model = AutoModelForCausalLM.from_pretrained(model_path, local_files_only=True)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\transformers\models\auto\auto_factory.py", line 485, in from_pretrained
    resolved_config_file = cached_file(
                           ^^^^^^^^^^^^
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\transformers\utils\hub.py", line 466, in cached_file
    raise EnvironmentError(
OSError: Incorrect path_or_model_id: './models/poeticagpt.pth'. Please provide either the path to a local folder or the repo_id of a model on the Hub.
2024-11-16 23:30:46,803 - main - ERROR - Failed to initialize model manager
2024-11-16 23:33:40,483 - main - INFO - Loading tokenizer...
2024-11-16 23:33:41,621 - main - WARNING - Could not load custom vocabulary: property 'vocab' of 'GPT2TokenizerFast' object has no setter
2024-11-16 23:33:41,622 - main - INFO - Loading model...
2024-11-16 23:33:43,332 - main - ERROR - Error initializing model: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory ./models/.
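Two separate problems are visible in these first runs. The recurring 'vocab' warning suggests the loading code assigns to tokenizer.vocab, which GPT2TokenizerFast exposes as a read-only property, and the 23:30 error shows a bare .pth checkpoint path being handed to AutoModelForCausalLM.from_pretrained, which only accepts a local folder or a Hub repo id. A minimal sketch of the likely fixes, assuming the goal is to add custom tokens and to package the model as a folder (the token strings and the ./models/poetica_hf directory are illustrative, not taken from the log):

    from transformers import AutoModelForCausalLM, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

    # `vocab` is a read-only property, so `tokenizer.vocab = {...}` triggers the
    # "has no setter" warning above; extend the vocabulary through the API instead.
    tokenizer.add_tokens(["<POEM>", "<LINE>"])  # hypothetical custom tokens

    # from_pretrained() expects a directory (or Hub repo id) containing config.json and a
    # weights file such as model.safetensors / pytorch_model.bin, not a bare .pth file.
    model = AutoModelForCausalLM.from_pretrained("./models/poetica_hf", local_files_only=True)

If only the raw poeticagpt.pth state_dict exists, such a folder can be produced once from the original training environment with model.save_pretrained("./models/poetica_hf") and tokenizer.save_pretrained(...), which writes config.json alongside the weights file that the 23:33 error below is scanning for.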
2024-11-16 23:33:43,333 - main - ERROR - Detailed traceback: Traceback (most recent call last):
  File "E:\Self Work\My Projects\Poetica HuggingFace Server\poetica\main.py", line 88, in initialize
    self.model = AutoModelForCausalLM.from_pretrained(model_path, local_files_only=True)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\transformers\models\auto\auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\transformers\modeling_utils.py", line 3447, in from_pretrained
    raise EnvironmentError(
OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory ./models/.
2024-11-16 23:33:43,335 - main - ERROR - Failed to initialize model manager
2024-11-16 23:34:18,283 - main - INFO - Loading tokenizer...
2024-11-16 23:34:18,966 - main - WARNING - Could not load custom vocabulary: property 'vocab' of 'GPT2TokenizerFast' object has no setter
2024-11-16 23:34:18,966 - main - INFO - Loading model...
2024-11-16 23:34:20,499 - main - ERROR - Error initializing model: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory ./models/.
2024-11-16 23:34:20,500 - main - ERROR - Detailed traceback: Traceback (most recent call last):
  File "E:\Self Work\My Projects\Poetica HuggingFace Server\poetica\main.py", line 88, in initialize
    self.model = AutoModelForCausalLM.from_pretrained(model_path, local_files_only=True)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\transformers\models\auto\auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\transformers\modeling_utils.py", line 3447, in from_pretrained
    raise EnvironmentError(
OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory ./models/.
2024-11-16 23:34:20,502 - main - ERROR - Failed to initialize model manager
2024-11-16 23:35:15,983 - main - INFO - Loading tokenizer...
2024-11-16 23:35:17,111 - main - WARNING - Could not load custom vocabulary: property 'vocab' of 'GPT2TokenizerFast' object has no setter
2024-11-16 23:35:17,111 - main - INFO - Loading model...
2024-11-16 23:35:18,795 - main - ERROR - Error initializing model: Unable to load weights from pytorch checkpoint file for './models/pytorch_model.bin' at './models/pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
2024-11-16 23:35:18,796 - main - ERROR - Detailed traceback: Traceback (most recent call last):
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\transformers\modeling_utils.py", line 575, in load_state_dict
    return torch.load(
           ^^^^^^^^^^^
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\torch\serialization.py", line 1024, in load
    raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
_pickle.UnpicklingError: Weights only load failed.
Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you get the file from a trusted source.
WeightsUnpickler error: Unsupported class torch.qint8

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\transformers\modeling_utils.py", line 584, in load_state_dict
    if f.read(7) == "version":
       ^^^^^^^^^
  File "D:\Program Files\Python\Lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 1651: character maps to <undefined>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:\Self Work\My Projects\Poetica HuggingFace Server\poetica\main.py", line 88, in initialize
    self.model = AutoModelForCausalLM.from_pretrained(model_path, local_files_only=True)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\transformers\models\auto\auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\transformers\modeling_utils.py", line 3703, in from_pretrained
    state_dict = load_state_dict(resolved_archive_file)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\transformers\modeling_utils.py", line 596, in load_state_dict
    raise OSError(
OSError: Unable to load weights from pytorch checkpoint file for './models/pytorch_model.bin' at './models/pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
2024-11-16 23:35:18,815 - main - ERROR - Failed to initialize model manager
2024-11-16 23:37:05,649 - main - INFO - Loading tokenizer...
2024-11-16 23:37:06,372 - main - INFO - Loading model...
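The 23:35 run gets past file resolution (the checkpoint was evidently renamed to pytorch_model.bin) but fails inside torch.load: the weights-only unpickler used by transformers refuses the torch.qint8 objects that quantization left in the checkpoint, and the follow-up UnicodeDecodeError is only transformers probing the same binary file as text to check for a TF checkpoint marker. Loading the raw file directly, rather than through from_pretrained, sidesteps that whole code path; a sketch, appropriate only because the checkpoint is locally produced and trusted:

    import torch

    # weights_only=False is needed for checkpoints that contain quantized (torch.qint8)
    # tensors; it executes arbitrary pickle code, so reserve it for files you created yourself.
    state_dict = torch.load("./models/poeticagpt.pth", map_location="cpu", weights_only=False)
    print(list(state_dict)[:5])  # quick sanity check of the stored parameter names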
2024-11-16 23:40:15,280 - main - ERROR - Error initializing model: Error(s) in loading state_dict for GPT2LMHeadModel: Missing key(s) in state_dict: "transformer.h.6.ln_1.weight", "transformer.h.6.ln_1.bias", "transformer.h.6.attn.c_attn.weight", "transformer.h.6.attn.c_attn.bias", "transformer.h.6.attn.c_proj.weight", "transformer.h.6.attn.c_proj.bias", "transformer.h.6.ln_2.weight", "transformer.h.6.ln_2.bias", "transformer.h.6.mlp.c_fc.weight", "transformer.h.6.mlp.c_fc.bias", "transformer.h.6.mlp.c_proj.weight", "transformer.h.6.mlp.c_proj.bias", "transformer.h.7.ln_1.weight", "transformer.h.7.ln_1.bias", "transformer.h.7.attn.c_attn.weight", "transformer.h.7.attn.c_attn.bias", "transformer.h.7.attn.c_proj.weight", "transformer.h.7.attn.c_proj.bias", "transformer.h.7.ln_2.weight", "transformer.h.7.ln_2.bias", "transformer.h.7.mlp.c_fc.weight", "transformer.h.7.mlp.c_fc.bias", "transformer.h.7.mlp.c_proj.weight", "transformer.h.7.mlp.c_proj.bias", "transformer.h.8.ln_1.weight", "transformer.h.8.ln_1.bias", "transformer.h.8.attn.c_attn.weight", "transformer.h.8.attn.c_attn.bias", "transformer.h.8.attn.c_proj.weight", "transformer.h.8.attn.c_proj.bias", "transformer.h.8.ln_2.weight", "transformer.h.8.ln_2.bias", "transformer.h.8.mlp.c_fc.weight", "transformer.h.8.mlp.c_fc.bias", "transformer.h.8.mlp.c_proj.weight", "transformer.h.8.mlp.c_proj.bias", "transformer.h.9.ln_1.weight", "transformer.h.9.ln_1.bias", "transformer.h.9.attn.c_attn.weight", "transformer.h.9.attn.c_attn.bias", "transformer.h.9.attn.c_proj.weight", "transformer.h.9.attn.c_proj.bias", "transformer.h.9.ln_2.weight", "transformer.h.9.ln_2.bias", "transformer.h.9.mlp.c_fc.weight", "transformer.h.9.mlp.c_fc.bias", "transformer.h.9.mlp.c_proj.weight", "transformer.h.9.mlp.c_proj.bias", "transformer.h.10.ln_1.weight", "transformer.h.10.ln_1.bias", "transformer.h.10.attn.c_attn.weight", "transformer.h.10.attn.c_attn.bias", "transformer.h.10.attn.c_proj.weight", "transformer.h.10.attn.c_proj.bias", "transformer.h.10.ln_2.weight", "transformer.h.10.ln_2.bias", "transformer.h.10.mlp.c_fc.weight", "transformer.h.10.mlp.c_fc.bias", "transformer.h.10.mlp.c_proj.weight", "transformer.h.10.mlp.c_proj.bias", "transformer.h.11.ln_1.weight", "transformer.h.11.ln_1.bias", "transformer.h.11.attn.c_attn.weight", "transformer.h.11.attn.c_attn.bias", "transformer.h.11.attn.c_proj.weight", "transformer.h.11.attn.c_proj.bias", "transformer.h.11.ln_2.weight", "transformer.h.11.ln_2.bias", "transformer.h.11.mlp.c_fc.weight", "transformer.h.11.mlp.c_fc.bias", "transformer.h.11.mlp.c_proj.weight", "transformer.h.11.mlp.c_proj.bias", "lm_head.weight". Unexpected key(s) in state_dict: "lm_head.scale", "lm_head.zero_point", "lm_head._packed_params.dtype", "lm_head._packed_params._packed_params". size mismatch for transformer.wte.weight: copying a param with shape torch.Size([50257, 384]) from checkpoint, the shape in current model is torch.Size([50257, 768]). size mismatch for transformer.wpe.weight: copying a param with shape torch.Size([128, 384]) from checkpoint, the shape in current model is torch.Size([1024, 768]). size mismatch for transformer.h.0.ln_1.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.0.ln_1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). 
size mismatch for transformer.h.0.attn.c_attn.weight: copying a param with shape torch.Size([384, 1152]) from checkpoint, the shape in current model is torch.Size([768, 2304]). size mismatch for transformer.h.0.attn.c_attn.bias: copying a param with shape torch.Size([1152]) from checkpoint, the shape in current model is torch.Size([2304]). size mismatch for transformer.h.0.attn.c_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([768, 768]). size mismatch for transformer.h.0.attn.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.0.ln_2.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.0.ln_2.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.0.mlp.c_fc.weight: copying a param with shape torch.Size([384, 1536]) from checkpoint, the shape in current model is torch.Size([768, 3072]). size mismatch for transformer.h.0.mlp.c_fc.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]). size mismatch for transformer.h.0.mlp.c_proj.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([3072, 768]). size mismatch for transformer.h.0.mlp.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.1.ln_1.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.1.ln_1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.1.attn.c_attn.weight: copying a param with shape torch.Size([384, 1152]) from checkpoint, the shape in current model is torch.Size([768, 2304]). size mismatch for transformer.h.1.attn.c_attn.bias: copying a param with shape torch.Size([1152]) from checkpoint, the shape in current model is torch.Size([2304]). size mismatch for transformer.h.1.attn.c_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([768, 768]). size mismatch for transformer.h.1.attn.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.1.ln_2.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.1.ln_2.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.1.mlp.c_fc.weight: copying a param with shape torch.Size([384, 1536]) from checkpoint, the shape in current model is torch.Size([768, 3072]). size mismatch for transformer.h.1.mlp.c_fc.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]). size mismatch for transformer.h.1.mlp.c_proj.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([3072, 768]). 
size mismatch for transformer.h.1.mlp.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.2.ln_1.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.2.ln_1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.2.attn.c_attn.weight: copying a param with shape torch.Size([384, 1152]) from checkpoint, the shape in current model is torch.Size([768, 2304]). size mismatch for transformer.h.2.attn.c_attn.bias: copying a param with shape torch.Size([1152]) from checkpoint, the shape in current model is torch.Size([2304]). size mismatch for transformer.h.2.attn.c_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([768, 768]). size mismatch for transformer.h.2.attn.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.2.ln_2.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.2.ln_2.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.2.mlp.c_fc.weight: copying a param with shape torch.Size([384, 1536]) from checkpoint, the shape in current model is torch.Size([768, 3072]). size mismatch for transformer.h.2.mlp.c_fc.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]). size mismatch for transformer.h.2.mlp.c_proj.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([3072, 768]). size mismatch for transformer.h.2.mlp.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.3.ln_1.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.3.ln_1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.3.attn.c_attn.weight: copying a param with shape torch.Size([384, 1152]) from checkpoint, the shape in current model is torch.Size([768, 2304]). size mismatch for transformer.h.3.attn.c_attn.bias: copying a param with shape torch.Size([1152]) from checkpoint, the shape in current model is torch.Size([2304]). size mismatch for transformer.h.3.attn.c_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([768, 768]). size mismatch for transformer.h.3.attn.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.3.ln_2.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.3.ln_2.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). 
size mismatch for transformer.h.3.mlp.c_fc.weight: copying a param with shape torch.Size([384, 1536]) from checkpoint, the shape in current model is torch.Size([768, 3072]). size mismatch for transformer.h.3.mlp.c_fc.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]). size mismatch for transformer.h.3.mlp.c_proj.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([3072, 768]). size mismatch for transformer.h.3.mlp.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.4.ln_1.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.4.ln_1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.4.attn.c_attn.weight: copying a param with shape torch.Size([384, 1152]) from checkpoint, the shape in current model is torch.Size([768, 2304]). size mismatch for transformer.h.4.attn.c_attn.bias: copying a param with shape torch.Size([1152]) from checkpoint, the shape in current model is torch.Size([2304]). size mismatch for transformer.h.4.attn.c_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([768, 768]). size mismatch for transformer.h.4.attn.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.4.ln_2.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.4.ln_2.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.4.mlp.c_fc.weight: copying a param with shape torch.Size([384, 1536]) from checkpoint, the shape in current model is torch.Size([768, 3072]). size mismatch for transformer.h.4.mlp.c_fc.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]). size mismatch for transformer.h.4.mlp.c_proj.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([3072, 768]). size mismatch for transformer.h.4.mlp.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.5.ln_1.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.5.ln_1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.5.attn.c_attn.weight: copying a param with shape torch.Size([384, 1152]) from checkpoint, the shape in current model is torch.Size([768, 2304]). size mismatch for transformer.h.5.attn.c_attn.bias: copying a param with shape torch.Size([1152]) from checkpoint, the shape in current model is torch.Size([2304]). size mismatch for transformer.h.5.attn.c_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([768, 768]). 
size mismatch for transformer.h.5.attn.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.5.ln_2.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.5.ln_2.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.5.mlp.c_fc.weight: copying a param with shape torch.Size([384, 1536]) from checkpoint, the shape in current model is torch.Size([768, 3072]). size mismatch for transformer.h.5.mlp.c_fc.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]). size mismatch for transformer.h.5.mlp.c_proj.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([3072, 768]). size mismatch for transformer.h.5.mlp.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.ln_f.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.ln_f.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). 2024-11-16 23:40:15,283 - main - ERROR - Detailed traceback: Traceback (most recent call last): File "E:\Self Work\My Projects\Poetica HuggingFace Server\poetica\main.py", line 74, in initialize self.model.load_state_dict(state_dict) File "e:\Self Work\My Projects\Poetica HuggingFace Server\.venv\Lib\site-packages\torch\nn\modules\module.py", line 2189, in load_state_dict raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format( RuntimeError: Error(s) in loading state_dict for GPT2LMHeadModel: Missing key(s) in state_dict: "transformer.h.6.ln_1.weight", "transformer.h.6.ln_1.bias", "transformer.h.6.attn.c_attn.weight", "transformer.h.6.attn.c_attn.bias", "transformer.h.6.attn.c_proj.weight", "transformer.h.6.attn.c_proj.bias", "transformer.h.6.ln_2.weight", "transformer.h.6.ln_2.bias", "transformer.h.6.mlp.c_fc.weight", "transformer.h.6.mlp.c_fc.bias", "transformer.h.6.mlp.c_proj.weight", "transformer.h.6.mlp.c_proj.bias", "transformer.h.7.ln_1.weight", "transformer.h.7.ln_1.bias", "transformer.h.7.attn.c_attn.weight", "transformer.h.7.attn.c_attn.bias", "transformer.h.7.attn.c_proj.weight", "transformer.h.7.attn.c_proj.bias", "transformer.h.7.ln_2.weight", "transformer.h.7.ln_2.bias", "transformer.h.7.mlp.c_fc.weight", "transformer.h.7.mlp.c_fc.bias", "transformer.h.7.mlp.c_proj.weight", "transformer.h.7.mlp.c_proj.bias", "transformer.h.8.ln_1.weight", "transformer.h.8.ln_1.bias", "transformer.h.8.attn.c_attn.weight", "transformer.h.8.attn.c_attn.bias", "transformer.h.8.attn.c_proj.weight", "transformer.h.8.attn.c_proj.bias", "transformer.h.8.ln_2.weight", "transformer.h.8.ln_2.bias", "transformer.h.8.mlp.c_fc.weight", "transformer.h.8.mlp.c_fc.bias", "transformer.h.8.mlp.c_proj.weight", "transformer.h.8.mlp.c_proj.bias", "transformer.h.9.ln_1.weight", "transformer.h.9.ln_1.bias", "transformer.h.9.attn.c_attn.weight", "transformer.h.9.attn.c_attn.bias", "transformer.h.9.attn.c_proj.weight", "transformer.h.9.attn.c_proj.bias", "transformer.h.9.ln_2.weight", "transformer.h.9.ln_2.bias", "transformer.h.9.mlp.c_fc.weight", 
"transformer.h.9.mlp.c_fc.bias", "transformer.h.9.mlp.c_proj.weight", "transformer.h.9.mlp.c_proj.bias", "transformer.h.10.ln_1.weight", "transformer.h.10.ln_1.bias", "transformer.h.10.attn.c_attn.weight", "transformer.h.10.attn.c_attn.bias", "transformer.h.10.attn.c_proj.weight", "transformer.h.10.attn.c_proj.bias", "transformer.h.10.ln_2.weight", "transformer.h.10.ln_2.bias", "transformer.h.10.mlp.c_fc.weight", "transformer.h.10.mlp.c_fc.bias", "transformer.h.10.mlp.c_proj.weight", "transformer.h.10.mlp.c_proj.bias", "transformer.h.11.ln_1.weight", "transformer.h.11.ln_1.bias", "transformer.h.11.attn.c_attn.weight", "transformer.h.11.attn.c_attn.bias", "transformer.h.11.attn.c_proj.weight", "transformer.h.11.attn.c_proj.bias", "transformer.h.11.ln_2.weight", "transformer.h.11.ln_2.bias", "transformer.h.11.mlp.c_fc.weight", "transformer.h.11.mlp.c_fc.bias", "transformer.h.11.mlp.c_proj.weight", "transformer.h.11.mlp.c_proj.bias", "lm_head.weight". Unexpected key(s) in state_dict: "lm_head.scale", "lm_head.zero_point", "lm_head._packed_params.dtype", "lm_head._packed_params._packed_params". size mismatch for transformer.wte.weight: copying a param with shape torch.Size([50257, 384]) from checkpoint, the shape in current model is torch.Size([50257, 768]). size mismatch for transformer.wpe.weight: copying a param with shape torch.Size([128, 384]) from checkpoint, the shape in current model is torch.Size([1024, 768]). size mismatch for transformer.h.0.ln_1.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.0.ln_1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.0.attn.c_attn.weight: copying a param with shape torch.Size([384, 1152]) from checkpoint, the shape in current model is torch.Size([768, 2304]). size mismatch for transformer.h.0.attn.c_attn.bias: copying a param with shape torch.Size([1152]) from checkpoint, the shape in current model is torch.Size([2304]). size mismatch for transformer.h.0.attn.c_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([768, 768]). size mismatch for transformer.h.0.attn.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.0.ln_2.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.0.ln_2.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.0.mlp.c_fc.weight: copying a param with shape torch.Size([384, 1536]) from checkpoint, the shape in current model is torch.Size([768, 3072]). size mismatch for transformer.h.0.mlp.c_fc.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]). size mismatch for transformer.h.0.mlp.c_proj.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([3072, 768]). size mismatch for transformer.h.0.mlp.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). 
size mismatch for transformer.h.1.ln_1.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.1.ln_1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.1.attn.c_attn.weight: copying a param with shape torch.Size([384, 1152]) from checkpoint, the shape in current model is torch.Size([768, 2304]). size mismatch for transformer.h.1.attn.c_attn.bias: copying a param with shape torch.Size([1152]) from checkpoint, the shape in current model is torch.Size([2304]). size mismatch for transformer.h.1.attn.c_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([768, 768]). size mismatch for transformer.h.1.attn.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.1.ln_2.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.1.ln_2.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.1.mlp.c_fc.weight: copying a param with shape torch.Size([384, 1536]) from checkpoint, the shape in current model is torch.Size([768, 3072]). size mismatch for transformer.h.1.mlp.c_fc.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]). size mismatch for transformer.h.1.mlp.c_proj.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([3072, 768]). size mismatch for transformer.h.1.mlp.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.2.ln_1.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.2.ln_1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.2.attn.c_attn.weight: copying a param with shape torch.Size([384, 1152]) from checkpoint, the shape in current model is torch.Size([768, 2304]). size mismatch for transformer.h.2.attn.c_attn.bias: copying a param with shape torch.Size([1152]) from checkpoint, the shape in current model is torch.Size([2304]). size mismatch for transformer.h.2.attn.c_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([768, 768]). size mismatch for transformer.h.2.attn.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.2.ln_2.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.2.ln_2.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.2.mlp.c_fc.weight: copying a param with shape torch.Size([384, 1536]) from checkpoint, the shape in current model is torch.Size([768, 3072]). 
size mismatch for transformer.h.2.mlp.c_fc.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]). size mismatch for transformer.h.2.mlp.c_proj.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([3072, 768]). size mismatch for transformer.h.2.mlp.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.3.ln_1.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.3.ln_1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.3.attn.c_attn.weight: copying a param with shape torch.Size([384, 1152]) from checkpoint, the shape in current model is torch.Size([768, 2304]). size mismatch for transformer.h.3.attn.c_attn.bias: copying a param with shape torch.Size([1152]) from checkpoint, the shape in current model is torch.Size([2304]). size mismatch for transformer.h.3.attn.c_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([768, 768]). size mismatch for transformer.h.3.attn.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.3.ln_2.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.3.ln_2.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.3.mlp.c_fc.weight: copying a param with shape torch.Size([384, 1536]) from checkpoint, the shape in current model is torch.Size([768, 3072]). size mismatch for transformer.h.3.mlp.c_fc.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]). size mismatch for transformer.h.3.mlp.c_proj.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([3072, 768]). size mismatch for transformer.h.3.mlp.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.4.ln_1.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.4.ln_1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.4.attn.c_attn.weight: copying a param with shape torch.Size([384, 1152]) from checkpoint, the shape in current model is torch.Size([768, 2304]). size mismatch for transformer.h.4.attn.c_attn.bias: copying a param with shape torch.Size([1152]) from checkpoint, the shape in current model is torch.Size([2304]). size mismatch for transformer.h.4.attn.c_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([768, 768]). size mismatch for transformer.h.4.attn.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). 
size mismatch for transformer.h.4.ln_2.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.4.ln_2.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.4.mlp.c_fc.weight: copying a param with shape torch.Size([384, 1536]) from checkpoint, the shape in current model is torch.Size([768, 3072]). size mismatch for transformer.h.4.mlp.c_fc.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]). size mismatch for transformer.h.4.mlp.c_proj.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([3072, 768]). size mismatch for transformer.h.4.mlp.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.5.ln_1.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.5.ln_1.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.5.attn.c_attn.weight: copying a param with shape torch.Size([384, 1152]) from checkpoint, the shape in current model is torch.Size([768, 2304]). size mismatch for transformer.h.5.attn.c_attn.bias: copying a param with shape torch.Size([1152]) from checkpoint, the shape in current model is torch.Size([2304]). size mismatch for transformer.h.5.attn.c_proj.weight: copying a param with shape torch.Size([384, 384]) from checkpoint, the shape in current model is torch.Size([768, 768]). size mismatch for transformer.h.5.attn.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.5.ln_2.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.5.ln_2.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.h.5.mlp.c_fc.weight: copying a param with shape torch.Size([384, 1536]) from checkpoint, the shape in current model is torch.Size([768, 3072]). size mismatch for transformer.h.5.mlp.c_fc.bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([3072]). size mismatch for transformer.h.5.mlp.c_proj.weight: copying a param with shape torch.Size([1536, 384]) from checkpoint, the shape in current model is torch.Size([3072, 768]). size mismatch for transformer.h.5.mlp.c_proj.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.ln_f.weight: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). size mismatch for transformer.ln_f.bias: copying a param with shape torch.Size([384]) from checkpoint, the shape in current model is torch.Size([768]). 2024-11-16 23:40:15,287 - main - ERROR - Failed to initialize model manager 2024-11-16 23:45:40,456 - main - INFO - Loading tokenizer... 2024-11-16 23:45:41,738 - main - INFO - Loading model... 
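The 23:40 dump above is an architecture mismatch rather than a file problem: main.py line 74 loads the checkpoint into a freshly built GPT2LMHeadModel that still has the default GPT-2 geometry (12 blocks, 768-dim embeddings, 1024 positions), while the checkpoint only contains blocks h.0 to h.5, 384-dim tensors, and a 128-position embedding. The run that follows evidently builds a matching config and loads non-strictly; a sketch of that step under those assumptions (n_head=6 is a guess, it only has to divide 384):

    import torch
    from transformers import GPT2Config, GPT2LMHeadModel

    # Geometry read off the shape errors above: 6 blocks, 384-dim embeddings, 128-token context.
    config = GPT2Config(n_layer=6, n_embd=384, n_head=6, n_positions=128)
    model = GPT2LMHeadModel(config)

    state_dict = torch.load("./models/poeticagpt.pth", map_location="cpu", weights_only=False)
    # strict=False because the quantized checkpoint stores lm_head in packed form (see below).
    missing, unexpected = model.load_state_dict(state_dict, strict=False)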
2024-11-16 23:45:42,454 - main - WARNING - Missing keys: ['lm_head.weight']
2024-11-16 23:45:42,455 - main - WARNING - Unexpected keys: ['lm_head.scale', 'lm_head.zero_point', 'lm_head._packed_params.dtype', 'lm_head._packed_params._packed_params']
2024-11-16 23:45:42,459 - main - INFO - Model and tokenizer loaded successfully
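The two warnings in the successful run are the expected leftovers of dynamic quantization: the checkpoint's lm_head was replaced by a quantized module whose parameters live under lm_head._packed_params (the "unexpected" keys), so no plain lm_head.weight exists (the "missing" key). Because GPT-2 ties lm_head to the token embedding, the model still ends up with usable output weights once the embeddings are loaded. Continuing the sketch above, with the re-quantization step an optional assumption rather than something shown in the log:

    import torch

    # `model` is the 384-dim GPT2LMHeadModel from the previous sketch, state_dict already loaded.
    model.tie_weights()  # lm_head.weight shares transformer.wte.weight, covering the "missing" key
    model.eval()

    # Optional: re-apply dynamic quantization to recover the original checkpoint's size/speed.
    # Only nn.Linear modules (i.e. lm_head in GPT-2) are affected, matching the packed keys above.
    model = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)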