Memory usage
Some information about memory usage.
Q4_0 barely fits into 64 GB RAM and 8 GB VRAM with a 2k context, leaving only a couple hundred megabytes free on my PC.
With this configuration, if you want a larger context window, I recommend using a smaller quant or renting a GPU server.
64 GB RAM + RTX 3060 Ti 8 GB VRAM, all layers offloaded to GPU in llama.cpp = 0.22 tokens/s
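For anyone who wants to try a similar setup, here is a minimal sketch using the llama-cpp-python bindings (the post above used llama.cpp directly; the model filename and the number of offloaded layers are placeholders, not values from this thread):

```python
from llama_cpp import Llama

# Roughly the setup described above: a Q4_0 70B GGUF, 2k context, and a
# partial GPU offload. Tune n_gpu_layers to whatever fits in 8 GB VRAM.
llm = Llama(
    model_path="miqu-1-70b.q4_0.gguf",  # hypothetical path, not from the thread
    n_ctx=2048,        # the 2k context window mentioned above
    n_gpu_layers=12,   # only as many layers as the 8 GB card can hold
)

out = llm("Write one sentence about llamas.", max_tokens=32)
print(out["choices"][0]["text"])
```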
Overall, a very good model; the capabilities in languages other than English are especially surprising (at least for the original miqu). It also feels like the model has some "character", similar to Goliath. In writing, the model avoids the robotic language of GPT-4 and is more human-like.
Yeah, unusable on a 4090 + 128 GB. A 2-bit quant will cripple this.
Looking forward to a 13B or 20B version.
Thanks for the feedback!
An IQ2_XS version allows a full offload on a 3090/4090 and won't cripple it too much (perplexity +0.7-0.9), and an IQ3_XXS will allow an 80-90% offload, depending on your context size, while having the output quality of a Q3_K_S.
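As a rough illustration of the full-offload case (again with llama-cpp-python; the filename is a placeholder, and how much context still fits in 24 GB VRAM depends on your settings, as noted above):

```python
from llama_cpp import Llama

# The IQ2_XS "full offload" case: n_gpu_layers=-1 asks llama.cpp to put
# every layer on the GPU, which a 3090/4090 can hold for this quant.
llm = Llama(
    model_path="miqu-1-70b.IQ2_XS.gguf",  # hypothetical path, not from the thread
    n_ctx=4096,        # larger contexts eat into the remaining VRAM
    n_gpu_layers=-1,   # -1 offloads all layers
)
```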
That's only possible if Mistral has such a model and the will to release it.
Or they have another client with yet another overly enthusiastic employee ;D