TheBloke committed
Commit 846c988
1 Parent(s): 992655b

Update README.md

Files changed (1): README.md +24 -0
README.md CHANGED
@@ -20,6 +20,30 @@ This repo is the result of converting to GGML and quantising.
  * [MPT-7B: 4-bit, 5-bit and 8-bit GGML models for CPU (+CUDA) inference](https://huggingface.co/TheBloke/MPT-7B).
  * [MPT-7B-Instruct: 4-bit, 5-bit and 8-bit GGML models for CPU (+CUDA) inference](https://huggingface.co/TheBloke/MPT-7B-Instruct).
 
+ ## Compatibility
+
+ These files are **not** compatible with llama.cpp.
+
+ Currently they can be used with:
+ * The example `mpt` binary provided with [ggml](https://github.com/ggerganov/ggml)
+ * [rustformers' llm](https://github.com/rustformers/llm)
+
+ As other options become available I will endeavour to update them here (do let me know in the Community tab if I've missed something!)
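For rustformers' llm, an invocation might look roughly like the following. This is a sketch, not part of the commit: the `llm-cli` crate name, the `mpt infer` subcommand, and the `-m`/`-p` flags are assumptions about the llm CLI of that era, so verify them against `llm --help` for whichever version you build.

```shell
# Hypothetical usage of rustformers' llm CLI (subcommand and flag
# names are assumptions; check `llm --help` on your installed version).
cargo install llm-cli   # assumed crates.io name for the `llm` binary
llm mpt infer \
  -m /path/to/mpt7b-instructggmlv2.ggmlv2.q4_0.bin \
  -p "Write a story about llamas"
```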
+
+ ## How to build, and an example of using the ggml `mpt` binary (command line only):
+
+ ```
+ git clone https://github.com/ggerganov/ggml
+ cd ggml
+ mkdir build
+ cd build
+ cmake ..
+ cmake --build . --config Release
+ # -t sets the thread count, -n the number of tokens to generate
+ bin/mpt -m /path/to/mpt7b-instructggmlv2.ggmlv2.q4_0.bin -t 8 -n 512 -p "Write a story about llamas"
+ ```
+
+ Please see the ggml repo for other build options.
+
  ## Provided files
  | Name | Quant method | Bits | Size | RAM required | Use case |
  | ---- | ---- | ---- | ---- | ---- | ----- |