allura-org/MoE-Girl_400MA_1BT

#390
by Fizzarolli - opened

hai ! back with another club banger of a miniature moe model for phones n stuff
https://huggingface.co/allura-org/MoE-Girl_400MA_1BT

The last time somebody requested something for their phone, it was a 1700B model.

Uhm.. :) It's queued and should be done in no time :)

mradermacher changed discussion status to closed

Unfortunately, the model is broken (contains nans):

```
nan detected in blk.0.attn_output.weight
```

The static quants have been generated, but will be of limited value.

hmm... that's odd; vllm and assorted inference software work fine, i'll have to look into it
thx for trying though! <3

it's possible that the damage stays contained to only a part of the model, but any computation based on a nan weight will result in more nans, trickling through the model. a nan is basically an error value. most inference frameworks do not check for it, including llama.cpp when inferencing, so it is possible that the static quants work, to some extent, unless model verification is enabled, at which point it will refuse to load.
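The propagation described above is easy to see in a tiny sketch (plain NumPy, standing in for any inference framework — the shapes and names here are illustrative, not llama.cpp internals):

```python
import numpy as np

# a toy "weight" with a single nan, standing in for blk.0.attn_output.weight
w = np.ones((4, 4), dtype=np.float32)
w[0, 0] = np.nan

x = np.ones(4, dtype=np.float32)

# any matmul whose reduction touches the nan entry yields nan outputs,
# which then feed into every later layer of the model
y = w @ x
print(np.isnan(y).any())  # True: the error value spreads
```

Note that only the rows whose dot product touches the nan are poisoned at this step; the rest stay finite, which is why partial functioning of the quants is plausible until a later layer mixes the values together.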

i know, i'm a programmer too xD

(screenshot attached)

i'm not big on lcpp code, but assuming that's referring to something in the safetensors file... that doesn't even exist? is that lcpp's name for o_proj?

Yes, that's llama.cpp's name for that tensor. I have no clue how it maps those, let me grep around...

```
blk.{bid}.attn_output    "model.layers.{bid}.self_attn.o_proj",    # llama-hf nemotron olmoe
```

Yeah, pretty much looks like it.
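Given that mapping, a quick way to cross-check on the HF side is to scan the checkpoint's tensors for non-finite values before conversion. A minimal sketch over an in-memory dict (a real check would iterate the tensors of a loaded safetensors state dict the same way; the tensor contents here are made up):

```python
import numpy as np

def find_bad_tensors(state_dict):
    """Return names of tensors containing nan or inf -- the kind of check
    that would flag o_proj (blk.{bid}.attn_output on the llama.cpp side)."""
    return [
        name
        for name, t in state_dict.items()
        if not np.isfinite(np.asarray(t, dtype=np.float64)).all()
    ]

# toy stand-in for a checkpoint; layer 0's o_proj carries a nan
sd = {
    "model.layers.0.self_attn.o_proj.weight": np.array([[np.nan, 1.0]]),
    "model.layers.0.self_attn.q_proj.weight": np.ones((2, 2)),
}
print(find_bad_tensors(sd))  # ['model.layers.0.self_attn.o_proj.weight']
```

`np.isfinite` catches both nan and inf in one pass, so the same scan also flags overflowed weights, not just nans.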
