runtime error

nizer_config.json: 100%|██████████| 727/727 [00:00<00:00, 4.68MB/s]
Downloading tokenizer.model: 100%|██████████| 500k/500k [00:00<00:00, 55.3MB/s]
Downloading (…)cial_tokens_map.json: 100%|██████████| 411/411 [00:00<00:00, 2.50MB/s]
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
/home/user/.local/lib/python3.10/site-packages/transformers/modeling_utils.py:2480: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers.
  warnings.warn(
Downloading (…)lve/main/config.json: 100%|██████████| 574/574 [00:00<00:00, 3.79MB/s]
Traceback (most recent call last):
  File "/home/user/app/app.py", line 64, in <module>
    model = model_cls.from_config(model_config).to('cuda:0')
  File "/home/user/app/minigpt4/models/mini_gpt4.py", line 239, in from_config
    model = cls(
  File "/home/user/app/minigpt4/models/mini_gpt4.py", line 98, in __init__
    self.llama_model = LlamaForCausalLM.from_pretrained(
  File "/home/user/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2616, in from_pretrained
    raise ImportError(
ImportError: Using `load_in_8bit=True` requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes `pip install -i https://test.pypi.org/simple/ bitsandbytes` or pip install bitsandbytes`
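The ImportError is raised because `load_in_8bit=True` in `LlamaForCausalLM.from_pretrained` needs the `accelerate` and `bitsandbytes` packages, and neither is installed in the Space's environment. Assuming the Space installs dependencies from a `requirements.txt` (the build setup isn't shown in the log, so this is an assumption), adding the two packages there should clear the error; the version pins below are illustrative, not taken from the log:

```
# Hypothetical requirements.txt additions — versions are examples, not from the log
accelerate
bitsandbytes
```

After updating `requirements.txt`, the Space has to be rebuilt (or restarted with "Factory reboot") so the new packages are installed before `app.py` loads the model.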
