runtime error

Exit code: 1. Reason:

Downloading shards: 100%|██████████| 4/4 [01:18<00:00, 19.57s/it]
Traceback (most recent call last):
  File "/home/user/app/app.py", line 401, in <module>
    model, processor = model_init(args_cli.model_path)
  File "/home/user/app/rynnec/__init__.py", line 27, in model_init
    tokenizer, model, image_processor, context_len = load_pretrained_model(model_path, None, model_name, **kwargs)
  File "/home/user/app/rynnec/model/__init__.py", line 177, in load_pretrained_model
    model = RynnecQwen2ForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, config=config, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4091, in from_pretrained
    config = cls._autoset_attn_implementation(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1617, in _autoset_attn_implementation
    cls._check_and_enable_flash_attn_2(
  File "/usr/local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1756, in _check_and_enable_flash_attn_2
    raise ValueError(
ValueError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: Flash Attention 2 is not available on CPU. Please make sure torch can access a CUDA device.
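The traceback shows that the model weights downloaded fine, but the `rynnec` loading code requests the FlashAttention2 backend while the Space runs on CPU-only hardware, so `from_pretrained` refuses to load the model because no CUDA device is available. A minimal sketch of a workaround, assuming the loading code can forward an `attn_implementation` kwarg to `from_pretrained` (the actual kwargs inside `load_pretrained_model` are not shown in this log, and `model_path` below is a placeholder):

```python
# Sketch only: pick an attention backend that matches the available hardware,
# so the model can still load on a CPU-only Space.
import torch
from transformers import AutoModelForCausalLM

model_path = "path/to/model"  # placeholder, not the Space's real checkpoint path

# "flash_attention_2" requires a CUDA device; fall back to PyTorch's
# scaled-dot-product attention ("sdpa") when only CPU is available.
attn_impl = "flash_attention_2" if torch.cuda.is_available() else "sdpa"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    low_cpu_mem_usage=True,
    attn_implementation=attn_impl,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
)
```

Alternatively, assigning GPU hardware to the Space (so that `torch.cuda.is_available()` returns `True`) avoids the error without any code change.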
