Spaces: Running on T4
Not able to use it via Run With Docker
Hi there,
I have cloned the repo, and did all required installations, but on port 7860, I am not able to use the app.
It keeps on showing: "Downloading the models.." while in HF space, in place of it we have following in dropdown:
"llava-v1.5-13b-4bit"
Kindly help me with this, I plan to finetune it to compare the results to GPT4-V.
These are my logs:
(env-llava-3.10.4) ubuntu@ip-172-31-9-24:~/Repos/LLaVA$ python app.py
[2023-10-18 14:46:10,636] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
2023-10-18 14:46:11 | INFO | gradio_web_server | args: Namespace(host='0.0.0.0', port=None, controller_url='http://localhost:10000', concurrency_count=8, model_list_mode='reload', share=False, moderate=False, embed=False)
2023-10-18 14:46:11 | INFO | gradio_web_server | Starting the controller
2023-10-18 14:46:11 | INFO | gradio_web_server | Starting the model worker for the model liuhaotian/llava-v1.5-13b
[2023-10-18 14:46:14,739] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2023-10-18 14:46:14,748] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)
2023-10-18 14:46:15 | INFO | model_worker | args: Namespace(host='0.0.0.0', port=21002, worker_address='http://localhost:21002', controller_address='http://localhost:10000', model_path='liuhaotian/llava-v1.5-13b', model_base=None, model_name='llava-v1.5-13b-4bit', multi_modal=False, limit_model_concurrency=5, stream_interval=1, no_register=False, load_8bit=False, load_4bit=True)
2023-10-18 14:46:15 | INFO | model_worker | Loading the model llava-v1.5-13b-4bit on worker dd86b8 ...
2023-10-18 14:46:15 | INFO | controller | args: Namespace(host='0.0.0.0', port=10000, dispatch_method='shortest_queue')
2023-10-18 14:46:15 | INFO | controller | Init controller
2023-10-18 14:46:15 | ERROR | stderr | INFO: Started server process [21719]
2023-10-18 14:46:15 | ERROR | stderr | INFO: Waiting for application startup.
2023-10-18 14:46:15 | ERROR | stderr | INFO: Application startup complete.
2023-10-18 14:46:15 | ERROR | stderr | INFO: Uvicorn running on http://0.0.0.0:10000 (Press CTRL+C to quit)
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s]
2023-10-18 14:46:21 | INFO | stdout | INFO: 127.0.0.1:49682 - "POST /refresh_all_workers HTTP/1.1" 200 OK
2023-10-18 14:46:21 | INFO | stdout | INFO: 127.0.0.1:49698 - "POST /list_models HTTP/1.1" 200 OK
2023-10-18 14:46:21 | INFO | gradio_web_server | Models: []
2023-10-18 14:46:21 | INFO | stdout | Running on local URL: http://0.0.0.0:7860
2023-10-18 14:46:21 | INFO | stdout |
2023-10-18 14:46:21 | INFO | stdout | To create a public link, set `share=True` in `launch()`.
2023-10-18 14:46:59 | INFO | gradio_web_server | load_demo. ip: 203.99.184.129
2023-10-18 14:46:59 | INFO | stdout | INFO: 127.0.0.1:44730 - "POST /refresh_all_workers HTTP/1.1" 200 OK
2023-10-18 14:46:59 | INFO | stdout | INFO: 127.0.0.1:44732 - "POST /list_models HTTP/1.1" 200 OK
2023-10-18 14:46:59 | INFO | gradio_web_server | Models: []
Thanks for the help
hey @m-ali-awan!
One of the processes is downloading the model in the background; that's why it keeps showing "Downloading the models..".
Your logs should display the download progress, and it takes some time for the download to start. Can you wait ~3 minutes and check whether the download progress shows up in the logs?
Thanks @badayvedat
Can't I pass this path directly? I think it has downloaded the models here:
/home/ubuntu/.cache/huggingface/hub/models--liuhaotian--llava-v1.5-13b/snapshots/d64eb781be6876a5facc160ab1899281f59ef684/pytorch_model-00003-of-00003.bin
Also, I waited for some time, and I can't see any directory like "liuhaotian/llava-v1.5-13b" getting created.
And I don't know what this is, but it gets stuck on this:
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s]
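(For context: a directory literally named "liuhaotian/llava-v1.5-13b" will never appear, because huggingface_hub flattens the repo id into the cache directory name, as in the path above. A small sketch of that mapping:)

```python
def hf_cache_dir_name(repo_id: str, repo_type: str = "model") -> str:
    """Map a Hub repo id to its directory name under ~/.cache/huggingface/hub.

    huggingface_hub stores each repo as "<type>s--<org>--<name>", so the
    repo id never appears verbatim on disk.
    """
    return f"{repo_type}s--" + repo_id.replace("/", "--")

print(hf_cache_dir_name("liuhaotian/llava-v1.5-13b"))
# models--liuhaotian--llava-v1.5-13b
```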
that's interesting 🤔
can you try overriding the model_path variable to point to /home/ubuntu/.cache/huggingface/hub/models--liuhaotian--llava-v1.5-13b
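A minimal sketch of that override, assuming model_path is the value app.py hands to the model worker (the exact place it is set may differ in your checkout):

```python
import os

# Hypothetical override: point model_path at the already-downloaded cache
# directory instead of the "liuhaotian/llava-v1.5-13b" repo id, so the worker
# loads from local disk rather than waiting on a fresh download.
model_path = os.path.expanduser(
    "~/.cache/huggingface/hub/models--liuhaotian--llava-v1.5-13b"
)
print(model_path)
```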
yes, I did that as well :)
Thanks, it worked now. But previously I tried by setting
export bit=4
and this time I didn't do that; perhaps that's the difference.
great!
since bit=4 makes use of bitsandbytes, there might be an issue with the docker environment <=> CUDA <=> bitsandbytes chain
Great, thanks
is it expected to have performance degradation with bit=4, as compared to bit=8?
yes, performance degradation is expected, but I'm not aware of any LLaVA research that compares the performance metrics of the various quantizations (4/8/16/32-bit) in detail.
maybe @liuhaotian can provide more info on that?
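For rough intuition on why the quantization level matters on a T4 (16 GB), weight memory scales with bits per parameter; a back-of-the-envelope sketch for a 13B model, counting weights only (activations and KV cache are extra):

```python
def weight_memory_gb(n_params: float, bits: int) -> float:
    """Approximate memory for model weights alone: params * (bits / 8) bytes."""
    return n_params * bits / 8 / 1e9

for bits in (4, 8, 16):
    print(f"{bits:2d}-bit: ~{weight_memory_gb(13e9, bits):.1f} GB")
# 4-bit: ~6.5 GB, 8-bit: ~13.0 GB, 16-bit: ~26.0 GB
```

By this estimate 16-bit weights alone exceed a T4's 16 GB, which is why the 4-bit load is used here; the trade-off is the accuracy loss discussed above.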
Sure, I will also try to document some comparisons, and post here.
For finetuning, I saw a lot of people facing issues. Is there any demo/doc I can follow?
Moreover, I would be grateful if you could give me a rough estimate of the size my training dataset should be.
I want to finetune LLaVA for vehicle damage estimation, so it can describe different damages with localization, e.g.: PROMPT: give me a damage estimation of this image of a vehicle. RESPONSE: visible parts: rear bumper, rear left quarter panel; one significant dent and 2 small scratches on the rear bumper, etc.
How many samples do you think I should use? And based on your experience, are there any guidelines/tricks I should follow?
Thanks a lot for all your help