Xenova posted an update May 9
Introducing Phi-3 WebGPU, a private and powerful AI chatbot that runs 100% locally in your browser, powered by 🤗 Transformers.js and onnxruntime-web!

🔒 On-device inference: no data sent to a server
⚡️ WebGPU-accelerated (> 20 t/s)
📥 Model downloaded once and cached

Try it out: Xenova/experimental-phi3-webgpu
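For anyone wanting to build something similar, text generation with Transformers.js follows its usual pipeline pattern. A minimal sketch (the model id here is an assumption for illustration; the demo's own source is the authoritative reference):

```javascript
import { pipeline } from '@xenova/transformers';

// Load a text-generation pipeline. The model weights are downloaded
// once and cached by the browser (assumed model id for illustration).
const generator = await pipeline(
  'text-generation',
  'Xenova/Phi-3-mini-4k-instruct',
);

// Generate a reply entirely on-device; no data leaves the browser.
const output = await generator('What is WebGPU?', { max_new_tokens: 128 });
console.log(output[0].generated_text);
```

Since inference runs client-side, latency depends on the user's GPU; the > 20 t/s figure above is with WebGPU acceleration.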

Amazing!! Shall we make a VB node for this?


The model might be a bit large, but it could be something to try!

This is so cool!

How do you obtain the wasm file? I didn't find it here: https://cdn.jsdelivr.net/npm/@xenova/transformers@2.17.1/dist/

cc: @Xenova
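Regarding the .wasm question: the WebAssembly binaries come from onnxruntime-web rather than the Transformers.js dist itself. Transformers.js exposes `env.backends.onnx.wasm.wasmPaths` for pointing the runtime at a copy of those files; a sketch (the CDN path below is an assumption, and in production you would pin a version or self-host):

```javascript
import { env } from '@xenova/transformers';

// Tell onnxruntime-web where to fetch its .wasm binaries from
// (assumed unpinned CDN path; self-host these files for production).
env.backends.onnx.wasm.wasmPaths =
  'https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/';
```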

This is really cool! Performance is really good. I am running this on Chrome and Chrome Unstable on Arch Linux, on a Dell XPS 17 with an RTX 3050 (4 GB VRAM). Unfortunately, inference starts super fast, but after a few sentences I get what looks like a Vulkan out-of-memory error:

vkAllocateMemory failed with VK_ERROR_OUT_OF_DEVICE_MEMORY

From that point on, the stream only returns garbage. I'll investigate further and see if I can get this running without crashing.