This repo contains the serialized blobs of one up-projection layer of llama3-8B (oc=14336, ic=4096). The linear layer has been quantized with GPTQ to 4-bit symmetric weights (W4 Sym, group size 32) and pruned to 50% sparsity.
├── sparse_w4
│   ├── linear_bitmap_int32.bin
│   ├── linear_compressed_qweight_int32.bin
│   ├── linear_nnz_int16.bin
│   ├── linear_scales_float16.bin
│   ├── linear_zeros_int32.bin
Usage
The following script shows how to process the blobs in Python. It covers unpacking the blobs, recovering the locations of the zeroed weights, and dequantizing the weights.
python unpack_blobs.py
You can ignore the internal/ directory.
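For readers who want a feel for the three steps before opening unpack_blobs.py, here is a minimal sketch on synthetic data. It is not the repo's script, and the on-disk layout is an assumption throughout: 4-bit values packed eight per int32 word with the low nibble first, a bitmap with one bit per dense weight (LSB first, 1 = kept), one float16 scale per group of 32 dense columns, and a fixed zero point of 8 for the symmetric scheme. The helper names (`pack_int4`, `unpack_bitmap`, `decompress_row`, and so on) are hypothetical. The sketch does not read the nnz or zeros blobs; unpack_blobs.py remains the reference for the real format.

```python
import numpy as np

def pack_int4(vals):
    """Pack unsigned 4-bit values into int32 words (assumed: 8 per word, low nibble first)."""
    vals = np.asarray(vals, dtype=np.uint32)
    vals = np.concatenate([vals, np.zeros((-len(vals)) % 8, dtype=np.uint32)])
    shifts = np.arange(8, dtype=np.uint32) * 4
    return (vals.reshape(-1, 8) << shifts).sum(axis=1, dtype=np.uint32).view(np.int32)

def unpack_int4(words, count):
    """Inverse of pack_int4: return the first `count` 4-bit values."""
    words = np.asarray(words).view(np.uint32)
    shifts = np.arange(8, dtype=np.uint32) * 4
    return ((words[:, None] >> shifts) & np.uint32(0xF)).reshape(-1)[:count]

def pack_bitmap(mask):
    """Pack a boolean keep-mask into int32 words (assumed: 32 bits per word, LSB first)."""
    bits = np.asarray(mask, dtype=np.uint32)
    bits = np.concatenate([bits, np.zeros((-len(bits)) % 32, dtype=np.uint32)])
    shifts = np.arange(32, dtype=np.uint32)
    return (bits.reshape(-1, 32) << shifts).sum(axis=1, dtype=np.uint32).view(np.int32)

def unpack_bitmap(words, count):
    """Inverse of pack_bitmap: return a boolean mask of length `count`."""
    words = np.asarray(words).view(np.uint32)
    shifts = np.arange(32, dtype=np.uint32)
    return ((words[:, None] >> shifts) & np.uint32(1)).reshape(-1)[:count].astype(bool)

def decompress_row(bitmap_words, qweight_words, scales, n_cols,
                   group_size=32, zero_point=8):
    """Rebuild one dense float32 row from bitmap, packed non-zeros, and group scales."""
    mask = unpack_bitmap(bitmap_words, n_cols)            # zero location recovery
    nnz = int(mask.sum())
    q = unpack_int4(qweight_words, nnz).astype(np.int32)  # unpacking
    scale_per_col = np.repeat(scales.astype(np.float32), group_size)[:n_cols]
    dense = np.zeros(n_cols, dtype=np.float32)
    dense[mask] = (q - zero_point) * scale_per_col[mask]  # dequantization
    return dense

# Synthetic example: one row of 64 columns, two groups of 32, 50% of weights kept.
rng = np.random.default_rng(0)
n_cols, group_size = 64, 32
w = rng.standard_normal(n_cols).astype(np.float32)
mask = np.zeros(n_cols, dtype=bool)
mask[rng.permutation(n_cols)[: n_cols // 2]] = True

# Symmetric 4-bit quantization: map each group's largest magnitude to 7 steps.
scales = (np.abs(w).reshape(-1, group_size).max(axis=1) / 7).astype(np.float16)
scale_per_col = np.repeat(scales.astype(np.float32), group_size)
q = np.clip(np.round(w / scale_per_col) + 8, 0, 15).astype(np.uint32)

bitmap_words = pack_bitmap(mask)
qweight_words = pack_int4(q[mask])  # only the kept weights are stored
dense = decompress_row(bitmap_words, qweight_words, scales, n_cols)
```

With the real blobs, the arrays would presumably come from `np.fromfile` with the dtype named in each file's suffix. The exact shapes, bit order, and grouping should be checked against unpack_blobs.py before relying on any of the assumptions above.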