A Mistral-7B pruned50 model quantized with AutoGPTQ for the Marlin kernel
For instructions on running this model, see my tutorial:
https://vilsonrodrigues.medium.com/sparse-quantize-and-serving-llms-with-neuralmagic-autogptq-and-vllm-03961b72ec3a
Chat template
Files info