acrastt's Marx 3B GGML

These are GGML format model files for acrastt's Marx 3B.

GGML files are for CPU + GPU inference using llama.cpp and the libraries and UIs that support this format.

How to run in `llama.cpp`

I use the following command line; adjust for your tastes and needs:

./main -t 8 -ngl 26 -m Marx-3B.ggmlv3.q4_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "prompt goes here"

Change -t 8 to the number of physical CPU cores you have. For example, if your system has 8 cores/16 threads, use -t 8.
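If you are unsure how many physical cores your machine has, a small helper along these lines may help. This is only a sketch: `physical_cores` is a made-up name, not part of llama.cpp, and it assumes `lscpu` (Linux) or `sysctl` (macOS) is available. When neither works it falls back to the logical CPU count, which can overstate the value on hyper-threaded systems.

```shell
# Print the number of physical CPU cores.
# physical_cores is a hypothetical helper, not part of llama.cpp.
physical_cores() {
  n=""
  if command -v lscpu >/dev/null 2>&1; then
    # Linux: count unique (core, socket) pairs.
    n=$(lscpu -p=Core,Socket 2>/dev/null | grep -v '^#' | sort -u | wc -l | tr -d ' ') || n=""
  elif command -v sysctl >/dev/null 2>&1; then
    # macOS: the physical core count is exposed directly.
    n=$(sysctl -n hw.physicalcpu 2>/dev/null) || n=""
  fi
  # Fall back to the logical CPU count if detection failed.
  case "$n" in
    ''|0|*[!0-9]*) n=$(getconf _NPROCESSORS_ONLN 2>/dev/null || nproc) ;;
  esac
  echo "$n"
}

# Example use (not executed here):
#   ./main -t "$(physical_cores)" -m Marx-3B.ggmlv3.q4_0.bin -p "prompt goes here"
```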

Change -ngl 26 to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
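To make the two cases concrete, here is a rough sketch that prints the command with and without GPU offload. `llama_cmd` is a hypothetical helper written for this example; it only prints the command line and does not run `./main`. The model file name and the other flags are taken from the command above.

```shell
# Print the llama.cpp command for a given thread count and GPU layer count.
# llama_cmd is a hypothetical helper; it prints the command, it does not run it.
llama_cmd() {
  threads=$1
  gpu_layers=${2:-0}
  ngl=""
  # Only pass -ngl when at least one layer should be offloaded to the GPU.
  if [ "$gpu_layers" -gt 0 ]; then
    ngl=" -ngl $gpu_layers"
  fi
  echo "./main -t $threads$ngl -m Marx-3B.ggmlv3.q4_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p \"prompt goes here\""
}

llama_cmd 8 26   # with GPU acceleration: includes -ngl 26
llama_cmd 8      # without GPU acceleration: -ngl is left out
```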

If you want to have a chat-style conversation, replace the -p <PROMPT> argument with -i -ins. You can also use --interactive-first to start in interactive mode.
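As an illustration of that swap, the sketch below prints either the one-shot or the chat-style command. `chat_cmd` is a hypothetical helper for this example only; it reuses the flags from the command above and does not launch `./main` itself.

```shell
# Print a one-shot command when a prompt is given, or a chat-style command otherwise.
# chat_cmd is a hypothetical helper; it prints the command, it does not run it.
chat_cmd() {
  base="./main -t 8 -ngl 26 -m Marx-3B.ggmlv3.q4_0.bin --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1"
  if [ -n "${1:-}" ]; then
    # One-shot generation: keep the -p argument.
    printf '%s -p "%s"\n' "$base" "$1"
  else
    # Chat-style conversation: -p is replaced with -i -ins.
    printf '%s -i -ins\n' "$base"
  fi
}

chat_cmd "prompt goes here"   # one-shot
chat_cmd                      # interactive chat
```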

Compatibility

I have uploaded files for the following llama.cpp quant methods: q4_0, q4_1, q5_0, q5_1, and q8_0.

Please refer to llama.cpp and TheBloke's GGML models for further explanation.

How to run in `text-generation-webui`

Further instructions here: text-generation-webui/docs/llama.cpp-models.md.

Thanks

Thanks to TheBloke for inspiration and providing almost all of the readme here!

Thanks to acrastt for providing checkpoints of the model.

Thanks to Georgi Gerganov and all of the awesome people in the AI community.

asedmammad/Marx-3B-GGML
