
This repo contains serialized blobs for the up-projection layer of llama3-8B (oc = 14336, ic = 4096). The linear layer has been quantized with GPTQ (W4, symmetric, group size 32) and sparsified to 50%.

β”œβ”€β”€ sparse_w4
β”‚   β”œβ”€β”€ linear_bitmap_int32.bin
β”‚   β”œβ”€β”€ linear_compressed_qweight_int32.bin
β”‚   β”œβ”€β”€ linear_nnz_int16.bin
β”‚   β”œβ”€β”€ linear_scales_float16.bin
β”‚   └── linear_zeros_int32.bin
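As a rough illustration, each blob can be read into a flat NumPy array whose dtype matches the filename suffix. This is a minimal loading sketch only; it assumes the files are raw little-endian arrays, and the authoritative layout is whatever unpack_blobs.py uses.

```python
# Minimal loading sketch: read each blob as a flat array whose dtype
# matches its filename suffix (layout is assumed; see unpack_blobs.py).
import numpy as np

blob_dir = "sparse_w4"

bitmap  = np.fromfile(f"{blob_dir}/linear_bitmap_int32.bin", dtype=np.int32)
qweight = np.fromfile(f"{blob_dir}/linear_compressed_qweight_int32.bin", dtype=np.int32)
nnz     = np.fromfile(f"{blob_dir}/linear_nnz_int16.bin", dtype=np.int16)
scales  = np.fromfile(f"{blob_dir}/linear_scales_float16.bin", dtype=np.float16)
zeros   = np.fromfile(f"{blob_dir}/linear_zeros_int32.bin", dtype=np.int32)

print({name: arr.size for name, arr in [
    ("bitmap", bitmap), ("qweight", qweight), ("nnz", nnz),
    ("scales", scales), ("zeros", zeros)]})
```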

Usage

The following script shows how to process the blobs in Python. It demonstrates unpacking, zero-location recovery, and the weight dequantization process.

python unpack_blobs.py
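For intuition, the sketch below shows one plausible shape of those three steps: splitting each int32 word into eight 4-bit values, scattering the compressed values back to dense positions using the non-zero bitmap, and applying the usual group-wise dequantization w = scale * (q - zero) with group size 32. The nibble order, per-row layout, and zero-point encoding are assumptions here; unpack_blobs.py is the authoritative reference.

```python
# Hedged sketch of the recovery steps (illustrative only; the exact packing
# and layout are defined in unpack_blobs.py).
import numpy as np

GROUP_SIZE = 32  # GPTQ group size used for this layer

def unpack_int4(packed: np.ndarray) -> np.ndarray:
    """Split each int32 word into eight 4-bit unsigned values (low nibble first, assumed)."""
    words = packed.astype(np.uint32)
    shifts = np.arange(8, dtype=np.uint32) * 4
    return ((words[:, None] >> shifts) & 0xF).astype(np.uint8).reshape(-1)

def scatter_row(values: np.ndarray, bitmap_row: np.ndarray, row_len: int) -> np.ndarray:
    """Place compressed values back at the positions flagged in the per-row bitmap."""
    mask = np.unpackbits(bitmap_row.view(np.uint8), bitorder="little")[:row_len].astype(bool)
    dense = np.zeros(row_len, dtype=values.dtype)
    dense[mask] = values[: mask.sum()]
    return dense

def dequantize_row(q_row: np.ndarray, scales_row: np.ndarray, zeros_row: np.ndarray) -> np.ndarray:
    """Group-wise dequantization: w = scale * (q - zero), one scale/zero per group of 32."""
    q = q_row.reshape(-1, GROUP_SIZE).astype(np.float32)
    w = (q - zeros_row.reshape(-1, 1).astype(np.float32)) * scales_row.reshape(-1, 1).astype(np.float32)
    return w.reshape(-1)
```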

The internal/ directory can be ignored.
