
Summary

This is a 4-bit quantization of openlm-research/open_llama_13b, produced with GPTQ-for-LLaMa.

The quantization command was: python ./GPTQ-for-LLaMa/llama.py ./open_llama_13b c4 --wbits 4 --true-sequential --groupsize 128 --save open-llama-13b-4bit-128g.pt
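To give a feel for what `--wbits 4` and `--groupsize 128` mean, the following is a minimal sketch of group-wise 4-bit quantization using plain round-to-nearest. It is not the GPTQ algorithm: GPTQ additionally uses calibration data (here `c4`) to compensate for rounding error, which this sketch leaves out. The function names and the random test row are hypothetical and exist only for illustration.

```python
import random

WBITS = 4          # mirrors --wbits 4
GROUPSIZE = 128    # mirrors --groupsize 128


def quantize_group(weights, wbits=WBITS):
    """Asymmetric round-to-nearest quantization of one group of weights."""
    levels = (1 << wbits) - 1              # 15 integer steps for 4 bits
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels or 1.0      # fall back to 1.0 for a constant group
    codes = [min(levels, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Map integer codes back to approximate float weights."""
    return [lo + c * scale for c in codes]


def quantize_row(row, groupsize=GROUPSIZE, wbits=WBITS):
    """Quantize a row in consecutive groups, each with its own scale and offset."""
    return [
        quantize_group(row[start:start + groupsize], wbits)
        for start in range(0, len(row), groupsize)
    ]


# Hypothetical weight row: 512 small Gaussian values, i.e. 4 groups of 128.
random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(512)]

groups = quantize_row(row)
restored = [w for codes, scale, lo in groups
            for w in dequantize_group(codes, scale, lo)]

max_err = max(abs(a - b) for a, b in zip(row, restored))
max_scale = max(scale for _, scale, _ in groups)
print("groups:", len(groups))
print("max abs error:", max_err, "half of largest step:", max_scale / 2)
```

Each group of 128 weights keeps its own scale and offset, so an outlier only widens the step size within its own group. A smaller group size generally lowers the rounding error at the cost of storing more scales and offsets.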

The original model README follows.

OpenLLaMA: An Open Reproduction of LLaMA

In this repo, we present a permissively licensed open-source reproduction of Meta AI's LLaMA large language model. We are releasing 3B, 7B and 13B models trained on 1T tokens. We provide PyTorch and JAX weights of pre-trained OpenLLaMA models, as well as evaluation results and a comparison against the original LLaMA models. Please see the project homepage of OpenLLaMA for more details. (Continued at https://huggingface.co/openlm-research/open_llama_13b)
