How to use this model with onnxruntime-genai

#1
by hmartinez82 - opened

Is it possible to use this model with onnxruntime-genai (model-qa.py)?

If so, how?

If the model architecture is supported, then there shouldn't be a problem with that.
If the last release is a bit old, you might need to build the latest upstream code from GitHub and try with that.
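A quick way to check which release you're on is to query the installed package metadata (a minimal sketch; the CUDA and DirectML variants ship as separate PyPI packages):

```python
# Check which onnxruntime-genai package and version is installed,
# to decide whether a nightly build or a source build is needed.
from importlib.metadata import version, PackageNotFoundError

for pkg in ("onnxruntime-genai", "onnxruntime-genai-cuda", "onnxruntime-genai-directml"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")
```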

I build ONNX Runtime specifically for ROCm for my own purposes. ONNX Runtime is the base package; the extensions seem to be additional wrappers on top of it.
Do use a nightly build or build from source.
This might be of help:
https://onnxruntime.ai/docs/genai/tutorials/phi2-python.html
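For reference, the core of that tutorial boils down to something like this (a condensed sketch, not the exact tutorial code; the model folder is a placeholder, and the API surface, e.g. `set_search_options` and `input_ids`, has shifted between onnxruntime-genai releases):

```python
import onnxruntime_genai as og

# Placeholder path: folder containing the exported ONNX model and its genai config
model = og.Model("models/phi2")
tokenizer = og.Tokenizer(model)

tokens = tokenizer.encode("def print_prime(n):")

params = og.GeneratorParams(model)
params.set_search_options(max_length=200)
params.input_ids = tokens  # older-style API; newer releases moved token input to the Generator

output_tokens = model.generate(params)
print(tokenizer.decode(output_tokens[0]))
```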

That's the tutorial I was following :) But to no avail.
I guess I'll try building it locally or using the nightly.

Thank you.

In my test with just ONNX Runtime, loading the ONNX model through the Hugging Face Optimum API works and generates results. Kindly check whether it's due to a lack of FP16 support on your specific hardware accelerator, or a memory bottleneck when loading a big model.
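Something like this is what I mean by loading through Optimum (a minimal sketch; the model directory is a placeholder):

```python
# Sanity check with plain ONNX Runtime via Hugging Face Optimum,
# independent of onnxruntime-genai.
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

model_dir = "path/to/onnx-model"  # placeholder: local folder with the exported ONNX model
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = ORTModelForCausalLM.from_pretrained(model_dir)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```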

It supports the DirectML and CUDA backends at the moment.
I'll probably add support for ROCm later.
The CPU code is very generic C/C++, since I don't see any execution-provider usage or neural-network framework in it.
You can try some other models and tell me if there's something wrong with it.

https://github.com/microsoft/onnxruntime-genai/blob/main/src/sequences_cuda.cpp
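You can also check what your base ONNX Runtime install actually exposes on your machine (this inspects plain onnxruntime, not onnxruntime-genai itself):

```python
# List the execution providers compiled into the installed ONNX Runtime,
# e.g. CUDAExecutionProvider, DmlExecutionProvider, ROCMExecutionProvider.
import onnxruntime as ort

print(ort.get_available_providers())
print(ort.get_device())  # "GPU" or "CPU"
```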

Yes, the lack of FP16 is getting in the way :(. I hope that the Snapdragon X Elite NPU will support it.
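For anyone else hitting this, a quick way to confirm whether a model's weights really are FP16 is to inspect its initializers with the onnx package (a sketch; the model path is a placeholder):

```python
# Check whether an ONNX model stores its weights as FP16, which would
# explain failures on hardware without FP16 support.
import onnx
from onnx import TensorProto

m = onnx.load("path/to/model.onnx", load_external_data=False)  # placeholder path
dtypes = {init.data_type for init in m.graph.initializer}
print("weights include FP16:", TensorProto.FLOAT16 in dtypes)
```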

hmartinez82 changed discussion status to closed

I hope our great engineers can solve your problem; we believe in them as in the Supreme Leader.
