allura-org/MoE-Girl_400MA_1BT

#390
by Fizzarolli - opened

hai ! back with another club banger of a miniature moe model for phones n stuff
https://huggingface.co/allura-org/MoE-Girl_400MA_1BT

The last time somebody requested something for their phone, it was a 1700B model.

Uhm.. :) It's queued and should be done in no time :)

mradermacher changed discussion status to closed

Unfortunately, the model is broken (contains nans):

```
nan detected in blk.0.attn_output.weight
```

The static quants have been generated, but will be of limited value.

hmm... that's odd; vllm and assorted inference software work fine, i'll have to look into it
thx for trying though! <3

it's possible that the damage stays contained to only a part of the model, but any computation based on a nan weight will result in more nans, trickling through the model. a nan is basically an error value. most inference frameworks do not check for it, including llama.cpp when inferencing, so it is possible that the static quants work, to some extent, unless model verification is enabled, at which point it will refuse to load.
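The propagation described above is easy to see in a tiny sketch (plain NumPy, standing in for any inference framework — the shapes and names here are illustrative, not llama.cpp internals):

```python
import numpy as np

# a toy "weight" with a single nan, standing in for blk.0.attn_output.weight
w = np.ones((4, 4), dtype=np.float32)
w[0, 0] = np.nan

x = np.ones(4, dtype=np.float32)

# any matmul whose reduction touches the nan entry yields nan outputs,
# which then feed into every later layer of the model
y = w @ x
print(np.isnan(y).any())  # True: the error value spreads
```

Note that only the rows whose dot product touches the nan are poisoned at this step; the rest stay finite, which is why partial functioning of the quants is plausible until a later layer mixes the values together.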

i know, i'm a programmer too xD

(screenshot attached)

i'm not big on lcpp code, but assuming that's referring to something in the safetensors file... that doesn't even exist? is that lcpp's name for o_proj?

Yes, that's llama.cpp's name for that tensor. I have no clue how it maps those, let me grep around...

```
blk.{bid}.attn_output    "model.layers.{bid}.self_attn.o_proj",    # llama-hf nemotron olmoe
```

Yeah, pretty much looks like it.
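Given that mapping, a quick way to cross-check on the HF side is to scan the checkpoint's tensors for non-finite values before conversion. A minimal sketch over an in-memory dict (a real check would iterate the tensors of a loaded safetensors state dict the same way; the tensor contents here are made up):

```python
import numpy as np

def find_bad_tensors(state_dict):
    """Return names of tensors containing nan or inf -- the kind of check
    that would flag o_proj (blk.{bid}.attn_output on the llama.cpp side)."""
    return [
        name
        for name, t in state_dict.items()
        if not np.isfinite(np.asarray(t, dtype=np.float64)).all()
    ]

# toy stand-in for a checkpoint; layer 0's o_proj carries a nan
sd = {
    "model.layers.0.self_attn.o_proj.weight": np.array([[np.nan, 1.0]]),
    "model.layers.0.self_attn.q_proj.weight": np.ones((2, 2)),
}
print(find_bad_tensors(sd))  # ['model.layers.0.self_attn.o_proj.weight']
```

`np.isfinite` catches both nan and inf in one pass, so the same scan also flags overflowed weights, not just nans.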
