
Model Details

The Dorna models are a family of decoder-only models, specifically trained/fine-tuned on Persian data, developed by Part AI. As an initial release from this family, Dorna-Llama3-8B-Instruct is an 8B instruct model built on the Meta Llama 3 Instruct model.

In this repo, we provide the bf16 model and quantized models in the GGUF format, including Q2_K, Q3_K_S, Q3_K_M, Q3_K_L, Q4_0, Q4_1, Q4_K_S, Q4_K_M, Q5_0, Q5_1, Q5_K_S, Q5_K_M, Q6_K, and Q8_0.

An in-depth report that includes several performance charts is available here. Check it out.

| Name | Quant Method | Bits | Memory |
|------|--------------|------|--------|
| dorna-llama3-8b-instruct.Q2_K.gguf | Q2_K | 2 | 3.2 GB |
| dorna-llama3-8b-instruct.Q3_K_L.gguf | Q3_K_L | 3 | 4.3 GB |
| dorna-llama3-8b-instruct.Q3_K_M.gguf | Q3_K_M | 3 | 4.1 GB |
| dorna-llama3-8b-instruct.Q3_K_S.gguf | Q3_K_S | 3 | 3.7 GB |
| dorna-llama3-8b-instruct.Q4_0.gguf | Q4_0 | 4 | 4.7 GB |
| dorna-llama3-8b-instruct.Q4_1.gguf | Q4_1 | 4 | 5.2 GB |
| dorna-llama3-8b-instruct.Q4_K_M.gguf | Q4_K_M | 4 | 4.9 GB |
| dorna-llama3-8b-instruct.Q4_K_S.gguf | Q4_K_S | 4 | 4.7 GB |
| dorna-llama3-8b-instruct.Q5_0.gguf | Q5_0 | 5 | 5.6 GB |
| dorna-llama3-8b-instruct.Q5_1.gguf | Q5_1 | 5 | 6.1 GB |
| dorna-llama3-8b-instruct.Q5_K_M.gguf | Q5_K_M | 5 | 5.73 GB |
| dorna-llama3-8b-instruct.Q5_K_S.gguf | Q5_K_S | 5 | 5.6 GB |
| dorna-llama3-8b-instruct.Q6_K.gguf | Q6_K | 6 | 6.6 GB |
| dorna-llama3-8b-instruct.Q8_0.gguf (Recommended) | Q8_0 | 8 | 8.5 GB |
| dorna-llama3-8b-instruct.bf16.gguf | None | 16 | 16.2 GB |
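
If you'd rather enumerate the available GGUF variants programmatically, here is a minimal sketch using huggingface_hub (assuming the package is installed and the repo layout matches the table above):

from huggingface_hub import list_repo_files

# List every GGUF file published in this repo.
repo_id = "PartAI/Dorna-Llama3-8B-Instruct-GGUF"
gguf_files = [f for f in list_repo_files(repo_id) if f.endswith(".gguf")]
print("\n".join(sorted(gguf_files)))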

Requirements

We recommend using llama-cpp-python, the Python bindings for llama.cpp, and installing it with the following command:

!pip install https://github.com/abetlen/llama-cpp-python/releases/download/v0.2.78/llama_cpp_python-0.2.78-cp310-cp310-linux_x86_64.whl
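
The wheel above appears to be a prebuilt CPU-only build for Python 3.10 on Linux x86_64. If you want GPU offloading (used via n_gpu_layers below), you can instead build llama-cpp-python with CUDA support. A hedged sketch, since the exact CMake flag depends on your llama-cpp-python version:

# Build from source with CUDA enabled (older releases use -DLLAMA_CUBLAS=on
# instead of -DGGML_CUDA=on; check the llama-cpp-python docs for your version).
!CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python --no-cache-dir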

How to use

Instead of cloning the whole repository, which is inefficient, you can manually download only the GGUF file you need, or use huggingface-cli (pip install huggingface_hub) as demonstrated below:

!huggingface-cli login --token $HUGGING_FACE_HUB_TOKEN
!huggingface-cli download PartAI/Dorna-Llama3-8B-Instruct-GGUF dorna-llama3-8b-instruct.Q8_0.gguf --local-dir . --local-dir-use-symlinks False
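
Equivalently, you can fetch the file from Python with huggingface_hub. A minimal sketch, assuming the Q8_0 filename from the table above:

from huggingface_hub import hf_hub_download

# Download a single GGUF file into the current directory.
model_path = hf_hub_download(
    repo_id="PartAI/Dorna-Llama3-8B-Instruct-GGUF",
    filename="dorna-llama3-8b-instruct.Q8_0.gguf",
    local_dir=".",
)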
Then load the model and run a chat completion in Python:

from llama_cpp import Llama

llm = Llama(
    model_path="dorna-llama3-8b-instruct.Q8_0.gguf",
    chat_format="llama-3",   # use the Llama 3 chat template
    n_gpu_layers=-1,         # offload all layers to the GPU (requires a GPU-enabled build)
    n_ctx=2048,              # context window size
)

messages = [
    {"role": "system", "content": "You are a helpful Persian assistant. Please answer questions in the asked language."},
    # "Is A4 paper larger, or A5?"
    {"role": "user", "content": "کاغذ A4 بزرگ تر است یا A5؟"},
]

result = llm.create_chat_completion(
    messages=messages,
    top_p=0.85,
    temperature=0.1,
)

print(result)
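
The return value is an OpenAI-style completion dict; to print only the assistant's reply, a sketch assuming the standard llama-cpp-python response shape:

# Extract just the generated text from the completion dict.
print(result["choices"][0]["message"]["content"])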

Contact us

If you have any questions regarding this model, you can reach us via the Community tab on Hugging Face.
