Working sample for Mac

#11
by spawn99 - opened

Here is a working example for Apple Silicon devices, if it's of use to anyone.
https://gist.github.com/cavit99/811919b3e7753c925ab603b1929dbd99
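
For anyone who just wants a feel for this kind of setup before opening the gist, here is a minimal sketch (not the gist itself; the model ID, image path, and prompt below are placeholders) using the standard transformers Qwen2-VL classes on MPS:

import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_name = "Qwen/Qwen2-VL-7B-Instruct"  # placeholder; use whichever checkpoint you are testing
device = "mps" if torch.backends.mps.is_available() else "cpu"

model = Qwen2VLForConditionalGeneration.from_pretrained(model_name, torch_dtype="auto").to(device)
processor = AutoProcessor.from_pretrained(model_name)

image = Image.open("example.jpg")  # placeholder image path
messages = [{"role": "user", "content": [
    {"type": "image"},
    {"type": "text", "text": "Describe this image."},
]}]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(device)

output_ids = model.generate(**inputs, max_new_tokens=128)
generated = output_ids[:, inputs.input_ids.shape[1]:]  # strip the prompt tokens
print(processor.batch_decode(generated, skip_special_tokens=True)[0])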

How much RAM is required?

This is at full precision.

(screenshot: Screenshot 2024-09-04 at 21.39.56.png)

Nice. Well, just for fun - I still managed to run this with 16GB.

With a modification to the model loading (removing .to(device)) and the addition of offload_buffers=True, you can run this even on systems with 16 GB of RAM. Note that while this allows broader compatibility, you should expect significantly longer response times. It's basically unusable for everyday use.

model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
    offload_buffers=True,  # Added
)
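
As I understand it, offload_buffers=True tells accelerate to offload module buffers together with any offloaded weights instead of keeping them on the device; since device_map="auto" is already pushing layers to CPU on a 16 GB machine, that is why it fits but responds so slowly. With device_map="auto" you can inspect the resulting placement:

print(model.hf_device_map)  # shows which modules landed on the accelerator vs. cpu/disk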

Encountered the following error:

RuntimeError: CUDA error: too many resources requested for launch
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

I am using two Tesla V100-SXM2-32GB GPUs. Does anyone know how many GPUs it needs? Thank you.

Doesn't work on M3 Pro when using MPS. The error info:

(screenshot: 截屏2024-09-09 16.52.19.png)

(screenshot: 截屏2024-09-09 16.53.04.png)

I solved it by upgrading torch to 2.4.0: pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0
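
If anyone wants to sanity-check their environment after the upgrade, something like this should confirm the build and that MPS is actually usable:

import torch

print(torch.__version__)                  # should report 2.4.0 after the upgrade
print(torch.backends.mps.is_built())      # True if this torch build includes MPS support
print(torch.backends.mps.is_available())  # True when MPS can actually be used on this machine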

I am using two Tesla V100-SXM2-32GB GPUs. Does anyone know how many GPUs it needs? Thank you.

This sample is not for CUDA; it's for Apple Silicon.

Encountered the following error:
/AppleInternal/Library/BuildRoots/4ff29661-3588-11ef-9513-e2437461156c/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:788: failed assertion `[MPSNDArray initWithDevice:descriptor:] Error: total bytes of NDArray > 2**32'
[1] 67229 abort python qwen2vl.py
I am using an M2 Max with 96 GB of RAM.

Your image is too large and the tensor size becomes too large. Use a smaller one.

I've updated the gist to downscale an image if it's too large so this doesn't happen.
https://gist.github.com/cavit99/811919b3e7753c925ab603b1929dbd99
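
For anyone who hits the same assertion with their own images, the fix is simply to shrink the input before handing it to the processor. A minimal sketch with Pillow (the 1280-pixel cap is an arbitrary choice here, not necessarily what the gist uses):

from PIL import Image

MAX_SIDE = 1280  # arbitrary cap; pick whatever keeps the MPS tensor under the 2**32-byte limit

def downscale(image: Image.Image, max_side: int = MAX_SIDE) -> Image.Image:
    # Shrink proportionally so the longest edge is at most max_side pixels.
    scale = max_side / max(image.size)
    if scale >= 1:
        return image  # already small enough
    new_size = (round(image.width * scale), round(image.height * scale))
    return image.resize(new_size, Image.Resampling.LANCZOS)

image = downscale(Image.open("large_photo.jpg"))  # placeholder filename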

The new version is OK, thx!
