compressa-ai/Llama-3-8B-Instruct-OmniQuant

Tags: Text Generation, text-generation-inference, Inference Endpoints, 4-bit precision

2 contributors
History: 7 commits
Latest commit: 5413035 by Vasily Alexeev, "add two stop toks in gen config", 7 months ago

| File | Size | Last commit message | Updated |
|---|---|---|---|
| .gitattributes | 1.52 kB | initial commit | 7 months ago |
| README.md | 6.96 kB | add asymm quantized model, add two eos in code sample | 7 months ago |
| compressa-config.json | 732 Bytes | add asymm quantized model, add two eos in code sample | 7 months ago |
| config.json | 898 Bytes | add asymm quantized model, add two eos in code sample | 7 months ago |
| generation_config.json | 131 Bytes | add two stop toks in gen config | 7 months ago |
| model-00001-of-00002.safetensors | 4.68 GB (LFS) | add asymm quantized model, add two eos in code sample | 7 months ago |
| model-00002-of-00002.safetensors | 1.05 GB (LFS) | add model weights and stuff | 7 months ago |
| model.safetensors.index.json | 78.5 kB | add model weights and stuff | 7 months ago |
| quant_config.json | 64 Bytes | add asymm quantized model, add two eos in code sample | 7 months ago |
| special_tokens_map.json | 301 Bytes | add model weights and stuff | 7 months ago |
| tokenizer.json | 9.08 MB | add model weights and stuff | 7 months ago |
| tokenizer_config.json | 51.4 kB | kinda fix eos token to stop model from chatting with itself | 7 months ago |