InternVL

InternVL / logs /model_worker_4ae09d.log

czczup

Upload folder using huggingface_hub

3f1b7f0 verified 5 months ago

	2024-07-11 22:35:40 \| INFO \| model_worker \| args: Namespace(host='0.0.0.0', port=40007, worker_address='http://10.140.66.196:40007', controller_address='http://10.140.60.209:10075', model_path='share_internvl/InternVL2-78B/', model_name=None, device='auto', limit_model_concurrency=5, stream_interval=1, load_8bit=False)
	2024-07-11 22:35:40 \| INFO \| model_worker \| Loading the model InternVL2-78B on worker 4ae09d ...
	2024-07-11 22:35:40 \| WARNING \| transformers.tokenization_utils_base \| Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
	2024-07-11 22:35:40 \| WARNING \| transformers.tokenization_utils_base \| Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
	2024-07-11 22:35:44 \| ERROR \| stderr \| Loading checkpoint shards: 0%\| \| 0/33 [00:00<?, ?it/s]
	2024-07-11 22:35:46 \| ERROR \| stderr \| Loading checkpoint shards: 3%\|▎ \| 1/33 [00:02<01:04, 2.02s/it]
	2024-07-11 22:35:48 \| ERROR \| stderr \| Loading checkpoint shards: 6%\|▌ \| 2/33 [00:03<01:01, 1.97s/it]
	2024-07-11 22:35:50 \| ERROR \| stderr \| Loading checkpoint shards: 9%\|▉ \| 3/33 [00:05<00:58, 1.96s/it]
	2024-07-11 22:35:52 \| ERROR \| stderr \| Loading checkpoint shards: 12%\|█▏ \| 4/33 [00:07<00:56, 1.94s/it]
	2024-07-11 22:35:54 \| ERROR \| stderr \| Loading checkpoint shards: 15%\|█▌ \| 5/33 [00:09<00:53, 1.92s/it]
	2024-07-11 22:35:56 \| ERROR \| stderr \| Loading checkpoint shards: 18%\|█▊ \| 6/33 [00:11<00:52, 1.96s/it]
	2024-07-11 22:35:58 \| ERROR \| stderr \| Loading checkpoint shards: 21%\|██ \| 7/33 [00:13<00:50, 1.96s/it]
	2024-07-11 22:36:00 \| ERROR \| stderr \| Loading checkpoint shards: 24%\|██▍ \| 8/33 [00:15<00:48, 1.92s/it]
	2024-07-11 22:36:02 \| ERROR \| stderr \| Loading checkpoint shards: 27%\|██▋ \| 9/33 [00:17<00:46, 1.93s/it]
	2024-07-11 22:36:04 \| ERROR \| stderr \| Loading checkpoint shards: 30%\|███ \| 10/33 [00:19<00:44, 1.94s/it]
	2024-07-11 22:36:05 \| ERROR \| stderr \| Loading checkpoint shards: 33%\|███▎ \| 11/33 [00:21<00:42, 1.93s/it]
	2024-07-11 22:36:07 \| ERROR \| stderr \| Loading checkpoint shards: 36%\|███▋ \| 12/33 [00:23<00:40, 1.91s/it]
	2024-07-11 22:36:09 \| ERROR \| stderr \| Loading checkpoint shards: 39%\|███▉ \| 13/33 [00:25<00:37, 1.90s/it]
	2024-07-11 22:36:11 \| ERROR \| stderr \| Loading checkpoint shards: 42%\|████▏ \| 14/33 [00:26<00:36, 1.90s/it]
	2024-07-11 22:36:13 \| ERROR \| stderr \| Loading checkpoint shards: 45%\|████▌ \| 15/33 [00:28<00:33, 1.88s/it]
	2024-07-11 22:36:15 \| ERROR \| stderr \| Loading checkpoint shards: 48%\|████▊ \| 16/33 [00:30<00:31, 1.87s/it]
	2024-07-11 22:36:17 \| ERROR \| stderr \| Loading checkpoint shards: 52%\|█████▏ \| 17/33 [00:32<00:29, 1.86s/it]
	2024-07-11 22:36:19 \| ERROR \| stderr \| Loading checkpoint shards: 55%\|█████▍ \| 18/33 [00:34<00:28, 1.89s/it]
	2024-07-11 22:36:20 \| ERROR \| stderr \| Loading checkpoint shards: 58%\|█████▊ \| 19/33 [00:36<00:26, 1.87s/it]
	2024-07-11 22:36:22 \| ERROR \| stderr \| Loading checkpoint shards: 61%\|██████ \| 20/33 [00:38<00:24, 1.87s/it]
	2024-07-11 22:36:24 \| ERROR \| stderr \| Loading checkpoint shards: 64%\|██████▎ \| 21/33 [00:40<00:22, 1.89s/it]
	2024-07-11 22:36:26 \| ERROR \| stderr \| Loading checkpoint shards: 67%\|██████▋ \| 22/33 [00:42<00:21, 1.91s/it]
	2024-07-11 22:36:28 \| ERROR \| stderr \| Loading checkpoint shards: 70%\|██████▉ \| 23/33 [00:43<00:19, 1.92s/it]
	2024-07-11 22:36:30 \| ERROR \| stderr \| Loading checkpoint shards: 73%\|███████▎ \| 24/33 [00:46<00:17, 1.97s/it]
	2024-07-11 22:36:32 \| ERROR \| stderr \| Loading checkpoint shards: 76%\|███████▌ \| 25/33 [00:48<00:16, 2.03s/it]
	2024-07-11 22:36:34 \| ERROR \| stderr \| Loading checkpoint shards: 79%\|███████▉ \| 26/33 [00:50<00:14, 2.03s/it]
	2024-07-11 22:36:36 \| ERROR \| stderr \| Loading checkpoint shards: 82%\|████████▏ \| 27/33 [00:52<00:12, 2.02s/it]
	2024-07-11 22:36:38 \| ERROR \| stderr \| Loading checkpoint shards: 85%\|████████▍ \| 28/33 [00:54<00:10, 2.02s/it]
	2024-07-11 22:36:40 \| ERROR \| stderr \| Loading checkpoint shards: 88%\|████████▊ \| 29/33 [00:56<00:08, 2.04s/it]
	2024-07-11 22:36:43 \| ERROR \| stderr \| Loading checkpoint shards: 91%\|█████████ \| 30/33 [00:58<00:06, 2.12s/it]
	2024-07-11 22:36:45 \| ERROR \| stderr \| Loading checkpoint shards: 94%\|█████████▍\| 31/33 [01:00<00:04, 2.13s/it]
	2024-07-11 22:36:47 \| ERROR \| stderr \| Loading checkpoint shards: 97%\|█████████▋\| 32/33 [01:02<00:02, 2.04s/it]
	2024-07-11 22:36:48 \| ERROR \| stderr \| Loading checkpoint shards: 100%\|██████████\| 33/33 [01:03<00:00, 1.81s/it]
	2024-07-11 22:36:48 \| ERROR \| stderr \| Loading checkpoint shards: 100%\|██████████\| 33/33 [01:03<00:00, 1.94s/it]
	2024-07-11 22:36:48 \| ERROR \| stderr \|
	2024-07-11 22:36:49 \| INFO \| model_worker \| Register to controller
	2024-07-11 22:36:49 \| ERROR \| stderr \| INFO: Started server process [50489]
	2024-07-11 22:36:49 \| ERROR \| stderr \| INFO: Waiting for application startup.
	2024-07-11 22:36:49 \| ERROR \| stderr \| INFO: Application startup complete.
	2024-07-11 22:36:49 \| ERROR \| stderr \| INFO: Uvicorn running on http://0.0.0.0:40007 (Press CTRL+C to quit)
	2024-07-11 22:36:57 \| INFO \| stdout \| INFO: 10.140.60.209:45960 - "POST /worker_get_status HTTP/1.1" 200 OK
	2024-07-11 22:37:00 \| INFO \| stdout \| INFO: 10.140.60.209:45984 - "POST /worker_get_status HTTP/1.1" 200 OK
	2024-07-11 22:37:01 \| INFO \| stdout \| INFO: 10.140.60.209:46004 - "POST /worker_get_status HTTP/1.1" 200 OK
	2024-07-11 22:37:01 \| INFO \| model_worker \| Send heart beat. Models: ['InternVL2-78B']. Semaphore: Semaphore(value=4, locked=False). global_counter: 1
	2024-07-11 22:37:01 \| INFO \| stdout \| INFO: 10.140.60.209:46012 - "POST /worker_generate_stream HTTP/1.1" 200 OK
	2024-07-11 22:37:01 \| INFO \| model_worker \| max_input_tile_list: [12]
	2024-07-11 22:37:01 \| INFO \| model_worker \| Split images to torch.Size([13, 3, 448, 448])
	2024-07-11 22:37:01 \| INFO \| model_worker \| []
	2024-07-11 22:37:01 \| INFO \| model_worker \| Generation config: {'num_beams': 1, 'max_new_tokens': 2048, 'do_sample': True, 'temperature': 0.8, 'repetition_penalty': 1.1, 'max_length': 8192, 'top_p': 0.7, 'streamer': <transformers.generation.streamers.TextIteratorStreamer object at 0x7f57a8091240>}
	2024-07-11 22:37:03 \| WARNING \| transformers.generation.utils \| Setting `pad_token_id` to `eos_token_id`:151645 for open-end generation.
	2024-07-11 22:37:03 \| WARNING \| transformers.generation.utils \| Both `max_new_tokens` (=2048) and `max_length`(=8192) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)
	2024-07-11 22:37:04 \| INFO \| model_worker \| Send heart beat. Models: ['InternVL2-78B']. Semaphore: Semaphore(value=4, locked=False). global_counter: 1
	2024-07-11 22:37:08 \| ERROR \| stderr \| Exception in thread Thread-3 (chat):
	2024-07-11 22:37:08 \| ERROR \| stderr \| Traceback (most recent call last):
	2024-07-11 22:37:08 \| ERROR \| stderr \| File "/mnt/petrelfs/wangweiyun/miniconda3/envs/internvl-apex/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
	2024-07-11 22:37:08 \| ERROR \| stderr \| self.run()
	2024-07-11 22:37:08 \| ERROR \| stderr \| File "/mnt/petrelfs/wangweiyun/miniconda3/envs/internvl-apex/lib/python3.10/threading.py", line 946, in run
	2024-07-11 22:37:08 \| ERROR \| stderr \| self._target(*self._args, **self._kwargs)
	2024-07-11 22:37:08 \| ERROR \| stderr \| File "/mnt/petrelfs/wangweiyun/.cache/huggingface/modules/transformers_modules/InternVL2-78B/modeling_internvl_chat.py", line 283, in chat
	2024-07-11 22:37:08 \| ERROR \| stderr \| generation_output = self.generate(
	2024-07-11 22:37:08 \| ERROR \| stderr \| File "/mnt/petrelfs/wangweiyun/miniconda3/envs/internvl-apex/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
	2024-07-11 22:37:08 \| ERROR \| stderr \| return func(*args, **kwargs)
	2024-07-11 22:37:08 \| ERROR \| stderr \| File "/mnt/petrelfs/wangweiyun/.cache/huggingface/modules/transformers_modules/InternVL2-78B/modeling_internvl_chat.py", line 333, in generate
	2024-07-11 22:37:08 \| ERROR \| stderr \| outputs = self.language_model.generate(
	2024-07-11 22:37:08 \| ERROR \| stderr \| File "/mnt/petrelfs/wangweiyun/miniconda3/envs/internvl-apex/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
	2024-07-11 22:37:08 \| ERROR \| stderr \| return func(*args, **kwargs)
	2024-07-11 22:37:08 \| ERROR \| stderr \| File "/mnt/petrelfs/wangweiyun/miniconda3/envs/internvl-apex/lib/python3.10/site-packages/transformers/generation/utils.py", line 1525, in generate
	2024-07-11 22:37:08 \| ERROR \| stderr \| return self.sample(
	2024-07-11 22:37:08 \| ERROR \| stderr \| File "/mnt/petrelfs/wangweiyun/miniconda3/envs/internvl-apex/lib/python3.10/site-packages/transformers/generation/utils.py", line 2641, in sample
	2024-07-11 22:37:08 \| ERROR \| stderr \| next_token_scores = logits_processor(input_ids, next_token_logits)
	2024-07-11 22:37:08 \| ERROR \| stderr \| File "/mnt/petrelfs/wangweiyun/miniconda3/envs/internvl-apex/lib/python3.10/site-packages/transformers/generation/logits_process.py", line 97, in __call__
	2024-07-11 22:37:08 \| ERROR \| stderr \| scores = processor(input_ids, scores)
	2024-07-11 22:37:08 \| ERROR \| stderr \| File "/mnt/petrelfs/wangweiyun/miniconda3/envs/internvl-apex/lib/python3.10/site-packages/transformers/generation/logits_process.py", line 333, in __call__
	2024-07-11 22:37:08 \| ERROR \| stderr \| score = torch.gather(scores, 1, input_ids)
	2024-07-11 22:37:08 \| ERROR \| stderr \| RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:4 and cuda:0! (when checking argument for argument index in method wrapper_CUDA_gather)
	2024-07-11 22:37:12 \| INFO \| model_worker \| Send heart beat. Models: ['InternVL2-78B']. Semaphore: Semaphore(value=5, locked=False). global_counter: 1
	2024-07-11 22:37:19 \| INFO \| model_worker \| Send heart beat. Models: ['InternVL2-78B']. Semaphore: Semaphore(value=5, locked=False). global_counter: 1
	2024-07-11 22:37:34 \| INFO \| model_worker \| Send heart beat. Models: ['InternVL2-78B']. Semaphore: Semaphore(value=5, locked=False). global_counter: 1
	2024-07-11 22:37:49 \| INFO \| model_worker \| Send heart beat. Models: ['InternVL2-78B']. Semaphore: Semaphore(value=5, locked=False). global_counter: 1
	2024-07-11 22:38:04 \| INFO \| model_worker \| Send heart beat. Models: ['InternVL2-78B']. Semaphore: Semaphore(value=5, locked=False). global_counter: 1
	2024-07-11 22:38:19 \| INFO \| model_worker \| Send heart beat. Models: ['InternVL2-78B']. Semaphore: Semaphore(value=5, locked=False). global_counter: 1
	2024-07-11 22:38:34 \| INFO \| model_worker \| Send heart beat. Models: ['InternVL2-78B']. Semaphore: Semaphore(value=5, locked=False). global_counter: 1
	2024-07-11 22:38:49 \| INFO \| model_worker \| Send heart beat. Models: ['InternVL2-78B']. Semaphore: Semaphore(value=5, locked=False). global_counter: 1
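
Note on the RuntimeError above: the traceback ends in `torch.gather(scores, 1, input_ids)` inside a logits processor, with `scores` and `input_ids` on different GPUs (cuda:4 and cuda:0). With `device='auto'` the model is sharded across GPUs, so the final logits may be produced on a different device from the one holding the input IDs. One common way to avoid this is to pass an explicit device map that keeps the embedding, the final norm, the output head, and the last decoder layer on the same GPU. The sketch below is only an illustration under assumptions: the module names, the layer count of 80, and the GPU count of 8 are hypothetical and do not come from this log; check them against the actual checkpoint before use.

```python
def build_device_map(num_layers: int, num_gpus: int) -> dict[str, int]:
    """Build a module-name -> GPU-index map for a sharded model.

    Decoder layers are spread over the GPUs in contiguous blocks. Modules
    that consume input_ids or produce the final logits are pinned to GPU 0
    so that logits processors see both tensors on one device.
    """
    if num_layers < 1 or num_gpus < 1:
        raise ValueError("num_layers and num_gpus must be positive")

    device_map: dict[str, int] = {}

    # Contiguous, roughly even split of decoder layers across GPUs.
    # NOTE: "language_model.model.layers.N" is an assumed module name.
    for layer in range(num_layers):
        device_map[f"language_model.model.layers.{layer}"] = (
            layer * num_gpus // num_layers
        )

    # Pin everything that touches input_ids or the final logits to GPU 0.
    # NOTE: these module names are assumptions for illustration only.
    pinned = (
        "vision_model",
        "mlp1",
        "language_model.model.embed_tokens",
        "language_model.model.norm",
        "language_model.lm_head",
    )
    for name in pinned:
        device_map[name] = 0

    # The last decoder layer feeds the norm and the head, so keep it on GPU 0.
    device_map[f"language_model.model.layers.{num_layers - 1}"] = 0
    return device_map


# Hypothetical sizes: 80 decoder layers spread over 8 GPUs.
device_map = build_device_map(num_layers=80, num_gpus=8)
```

A map like this would typically be passed as the `device_map=` argument of `from_pretrained` in place of `'auto'`. GPU 0 then carries extra modules, so its share of decoder layers may need to be reduced to fit in memory.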
