slow

by ehartford - opened Apr 8, 2024

MLX Community org Apr 8, 2024

I am trying to use this on an M3 Max.
It's so slow that it's unusable (0.51 tokens/second).
Is it possible to make it any faster?

I can't compare it with llama.cpp because it doesn't work there yet.

MLX Community org Apr 8, 2024

Yes, it's possible to make it faster. You can read more here:
