---
license: bsd
language:
- fa
tags:
- llama
- llama.cpp
- 7B
- Alpaca
- Quantize
---
|
# Model Card for Persian LLaMA 7B (GGUF)

A GGUF-quantized build of Persian LLaMA 7B, an Alpaca-style instruction model for Persian, packaged for use with `llama.cpp`, `text-generation-webui`, and `LangChain`.
|
## How to run in `llama.cpp` |
|
|
|
|
|
``` |
|
./main -t 10 -ngl 32 -m persian_llama_7b.Q4_K_M.gguf --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "### Instruction: یک شعر حماسی در مورد کوه دماوند بگو ### Input: ### Response:" |
|
``` |
|
The example prompt asks the model, in Persian, to compose an epic poem about Mount Damavand.

Change `-t 10` to the number of physical CPU cores you have. For example, if your system has 8 cores/16 threads, use `-t 8`.
|
|
|
Change `-ngl 32` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration. |
|
|
|
To have a chat-style conversation, replace the `-p <PROMPT>` argument with `-i -ins`.
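For example, a chat-style invocation on a machine with 8 physical cores and full GPU offload might look like this (the thread and layer counts are illustrative; adjust them to your hardware):

```
./main -t 8 -ngl 32 -m persian_llama_7b.Q4_K_M.gguf --color -c 2048 --temp 0.7 --repeat_penalty 1.1 -i -ins
```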
|
|
|
## How to run in `text-generation-webui` |
|
|
|
Further instructions here: [text-generation-webui/docs/llama.cpp-models.md](https://github.com/oobabooga/text-generation-webui/blob/main/docs/llama.cpp-models.md). |
|
|
|
## How to run using `LangChain` |
|
|
|
##### Installation on CPU
|
``` |
|
pip install llama-cpp-python |
|
``` |
|
##### Installation on GPU
|
``` |
|
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python |
|
``` |
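To verify the bindings installed correctly, an optional quick check is to import the package:

```
python -c "import llama_cpp; print('llama-cpp-python OK')"
```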
|
|
|
```python |
|
from langchain.llms import LlamaCpp |
|
from langchain import PromptTemplate, LLMChain |
|
from langchain.callbacks.manager import CallbackManager |
|
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler |
|
|
|
n_gpu_layers = 40 # Change this value based on your model and your GPU VRAM pool. |
|
n_batch = 512 # Should be between 1 and n_ctx, consider the amount of VRAM in your GPU. |
|
n_ctx = 2048  # Context window size in tokens; matches the -c 2048 used in the CLI example.
|
|
|
# Stream generated tokens to stdout as they are produced
callback_manager = CallbackManager([StreamingStdOutCallbackHandler()])
|
|
|
# Make sure the model path is correct for your system! |
|
llm = LlamaCpp( |
|
model_path="./persian_llama_7b.Q4_K_M.gguf", |
|
n_gpu_layers=n_gpu_layers, n_batch=n_batch, |
|
callback_manager=callback_manager, |
|
verbose=True, |
|
n_ctx=n_ctx |
|
) |
|
|
|
llm("""### Instruction: |
|
یک شعر حماسی در مورد کوه دماوند بگو |
|
|
|
### Input: |
|
|
|
### Response:""") |
|
``` |
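The `PromptTemplate` and `LLMChain` imports above can be used to template the instruction instead of hard-coding the whole prompt. A minimal sketch, continuing from the snippet above and assuming the same Alpaca-style prompt format:

```python
# Reuse the Alpaca-style prompt layout; only the instruction is templated.
template = """### Instruction:
{instruction}

### Input:

### Response:"""

prompt = PromptTemplate(template=template, input_variables=["instruction"])
llm_chain = LLMChain(prompt=prompt, llm=llm)

# Same Persian prompt as above: "compose an epic poem about Mount Damavand"
llm_chain.run("یک شعر حماسی در مورد کوه دماوند بگو")
```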
|
For more information, refer to the [LangChain LlamaCpp documentation](https://python.langchain.com/docs/modules/model_io/models/llms/integrations/llamacpp).
|
|
|
|
|
|