neuralmagic/Llama-2-7b-chat-quantized.w4a16

Text Generation

text-generation-inference

4-bit precision



2 contributors

History: 6 commits

Latest commit: bcdfa49 (verified) by abhinavnmagic, "Upload tokenizer.json with huggingface_hub", 9 months ago

File                      Size           Last commit                                          Updated
.gitattributes            1.52 kB        initial commit                                       9 months ago
config.json               1.03 kB        Upload config.json with huggingface_hub              9 months ago
model.safetensors         3.89 GB (LFS)  Upload model.safetensors with huggingface_hub        9 months ago
quantize_config.json      269 Bytes      Upload quantize_config.json with huggingface_hub     9 months ago
special_tokens_map.json   414 Bytes      Upload special_tokens_map.json with huggingface_hub  9 months ago
tokenizer.json            1.84 MB        Upload tokenizer.json with huggingface_hub           9 months ago