namtran committed 6d0c035 (parent 992cd0c): Update README.md
---
inference: false
license: other
model_type: llama
---
# Meta's LLaMA 7B - AWQ GGUF

These files are in GGUF format.

- Model creator: [Meta](https://huggingface.co/none)
- Original model: [LLaMA 7B](https://ai.meta.com/blog/large-language-model-llama-meta-ai)

The model was converted using [llama.cpp](https://github.com/ggerganov/llama.cpp) together with the [AWQ](https://github.com/mit-han-lab/llm-awq) quantization method.

## How to use the models in `llama.cpp`

```sh
./main -m ggml-model-q4_0-awq.gguf -n 128 --prompt "Once upon a time"
```

Please see the instructions in [the PR](https://github.com/ggerganov/llama.cpp/pull/4593) for more detail.
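For context, an end-to-end session might look roughly like the sketch below. It assumes a `llama.cpp` checkout that already includes the linked PR; the `--awq-path` flag, the scale-file name under `awq_cache/`, and the model paths are assumptions based on the PR's description, not a verified interface — consult the PR for the authoritative steps.

```shell
# Build llama.cpp from source (assumes the checkout contains the AWQ PR).
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Convert the original LLaMA 7B weights to GGUF, applying AWQ scales.
# The --awq-path flag and the awq_cache/... scale file are placeholder
# assumptions taken from the PR; adjust paths to your local files.
python convert.py /path/to/llama-7b \
    --awq-path awq_cache/llama-7b-w4-g128.pt \
    --outfile ggml-model-f16-awq.gguf

# Quantize the f16 GGUF down to 4-bit, then run inference.
./quantize ggml-model-f16-awq.gguf ggml-model-q4_0-awq.gguf q4_0
./main -m ggml-model-q4_0-awq.gguf -n 128 --prompt "Once upon a time"
```

If you are using the pre-converted files from this repo, only the final `./main` invocation is needed.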