dranger003/c4ai-command-r-plus-iMat.GGUF


dranger003 committed on Apr 5

Commit 32cc406 • 1 Parent(s): 234e584

Update README.md

Files changed (1)

README.md +2 -0

@@ -7,6 +7,8 @@ base_model: CohereForAI/c4ai-command-r-plus
 **2024-04-05**: Support for this model is still being worked on - [`PR#6491`](https://github.com/ggerganov/llama.cpp/pull/6491).
 For now, you can test the model using this fork: [https://github.com/Noeda/llama.cpp/tree/commandr-plus](https://github.com/Noeda/llama.cpp/tree/commandr-plus)
+**<u>NOTE</u>**: There may be a need to re-quantize all the weights once support for this model is finalized.
 * GGUF importance matrix (imatrix) quants for https://huggingface.co/CohereForAI/c4ai-command-r-plus
 * The importance matrix was trained for ~100K tokens (200 batches of 512 tokens) using [wiki.train.raw](https://huggingface.co/datasets/wikitext).
 * [Which GGUF is right for me? (from Artefact2)](https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9)
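
The "~100K tokens" figure in the README follows from the stated batch count and batch size. A minimal sketch of that arithmetic, where the helper name `imatrix_token_count` is hypothetical and used only for illustration:

```python
# Figures taken from the README: 200 batches of 512 tokens each.
N_BATCHES = 200
BATCH_TOKENS = 512


def imatrix_token_count(n_batches: int, batch_tokens: int) -> int:
    """Total number of tokens processed while computing the importance matrix."""
    return n_batches * batch_tokens


total = imatrix_token_count(N_BATCHES, BATCH_TOKENS)
print(total)  # 102400, which the README rounds to ~100K
```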