To the few who can run it: post your speeds, setup, and results quality.
I use the 4-bit version on a 128 GB MacBook M3 Max. All the samples in the model card come from that.
How well does the q_4 version compare to Miqu 120b?
How many tokens per second can you reach with your MacBook this way? @ehartford
This is a very different model. Not really comparable.
Hi Eric, how is your t/s? Dare I say sub 1 t/s?
I quantized the OG model myself, so it's q4_k_m. Here is what I am getting on a Mac Studio (M1, 64-core GPU, 128 GB RAM):
total duration: 2m30.232853459s
load duration: 1.101417ms
prompt eval count: 23 token(s)
prompt eval duration: 3.85487s
prompt eval rate: 5.97 tokens/s
eval count: 720 token(s)
eval duration: 2m26.375722s
eval rate: 4.92 tokens/s
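The reported rates are just the token counts divided by the durations (in seconds); a quick sanity check against the numbers above, assuming any POSIX awk:

```shell
# Verify the reported rates: rate = token count / duration in seconds.
awk 'BEGIN { printf "prompt: %.2f tokens/s\n", 23 / 3.85487 }'      # 5.97
awk 'BEGIN { printf "eval:   %.2f tokens/s\n", 720 / 146.375722 }'  # 4.92
```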
Hi, how can I merge the .aa and .ab files?
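Files ending in `.aa` and `.ab` are typically produced by the Unix `split` command, and concatenating the parts in order with `cat` rebuilds the original file. A minimal sketch using a dummy file (the actual model filenames here are an assumption):

```shell
# Recreate a split-then-merge round trip on a stand-in file.
printf 'dummy model data' > model.gguf           # stand-in for the real model file
split -b 8 model.gguf model.gguf.                # produces model.gguf.aa and model.gguf.ab
cat model.gguf.aa model.gguf.ab > merged.gguf    # concatenate parts in order
cmp model.gguf merged.gguf && echo "merge OK"
```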