metadata
language:
- en
- fa
Model description
Example output:
Example 1:
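- Input: "سلام، خوبی؟" (English: "Hi, how are you?")
- Output: "سلام، خوشحالم که با شما صحبت می کنم. چطور می توانم به شما کمک کنم؟" (English: "Hello, I am glad to be talking with you. How can I help you?")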
Benchmark results
| model | dataset | max_token | prompt | score |
|---|---|---|---|---|
| base-model-7b | ARC-easy-dev | 2 | en-1 | 0.41929 |
| base-model-7b | ARC-easy-dev | 80 | en-2 | 0.39122 |
| base-model-7b | ARC-easy-dev | 300 | en-1 | 0.34448 |

| model | dataset | max_token | prompt | score |
|---|---|---|---|---|
| fa-model-7b | ARC-easy-dev | 80 | en-1 | 0.37894 |
| fa-model-7b | ARC-easy-dev | 80 | en-2 | 0.33333 |
| fa-model-7b | ARC-easy-dev | 80 | fa-2 | 0.28771 |
| fa-model-7b | ARC-easy-dev | 300 | fa-1 | 0.25752 |
| fa-model-7b | ARC-easy-dev | 2 | fa-1 | 0.24035 |

| model | dataset | max_token | prompt | score |
|---|---|---|---|---|
| base-model-7b | ARC-challenge-dev | 80 | en-2 | 0.37123 |
| base-model-7b | ARC-challenge-dev | 2 | en-2 | 0.36789 |
| base-model-7b | ARC-challenge-dev | 2 | en-1 | 0.35451 |
| base-model-7b | ARC-challenge-dev | 80 | en-1 | 0.33779 |

| model | dataset | max_token | prompt | score |
|---|---|---|---|---|
| fa-model-7b | ARC-challenge-dev | 2 | en-1 | 0.39298 |
| fa-model-7b | ARC-challenge-dev | 80 | en-1 | 0.38421 |
| fa-model-7b | ARC-challenge-dev | 2 | en-2 | 0.31929 |
| fa-model-7b | ARC-challenge-dev | 80 | en-2 | 0.31754 |
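
The four benchmark tables are easier to compare once the best configuration per model and dataset is pulled out. The following is a minimal sketch of that arithmetic: the rows are copied verbatim from the tables above, while the names `ROWS` and `best_by_group` are hypothetical helpers introduced only for this illustration and are not part of the model's tooling.

```python
# Rows copied from the benchmark tables above:
# (model, dataset, max_token, prompt, score)
ROWS = [
    ("base-model-7b", "ARC-easy-dev", 2, "en-1", 0.41929),
    ("base-model-7b", "ARC-easy-dev", 80, "en-2", 0.39122),
    ("base-model-7b", "ARC-easy-dev", 300, "en-1", 0.34448),
    ("fa-model-7b", "ARC-easy-dev", 80, "en-1", 0.37894),
    ("fa-model-7b", "ARC-easy-dev", 80, "en-2", 0.33333),
    ("fa-model-7b", "ARC-easy-dev", 80, "fa-2", 0.28771),
    ("fa-model-7b", "ARC-easy-dev", 300, "fa-1", 0.25752),
    ("fa-model-7b", "ARC-easy-dev", 2, "fa-1", 0.24035),
    ("base-model-7b", "ARC-challenge-dev", 80, "en-2", 0.37123),
    ("base-model-7b", "ARC-challenge-dev", 2, "en-2", 0.36789),
    ("base-model-7b", "ARC-challenge-dev", 2, "en-1", 0.35451),
    ("base-model-7b", "ARC-challenge-dev", 80, "en-1", 0.33779),
    ("fa-model-7b", "ARC-challenge-dev", 2, "en-1", 0.39298),
    ("fa-model-7b", "ARC-challenge-dev", 80, "en-1", 0.38421),
    ("fa-model-7b", "ARC-challenge-dev", 2, "en-2", 0.31929),
    ("fa-model-7b", "ARC-challenge-dev", 80, "en-2", 0.31754),
]


def best_by_group(rows):
    """Return the highest-scoring row for each (model, dataset) pair."""
    best = {}
    for row in rows:
        key = (row[0], row[1])
        # Keep the row only if it beats the best score seen so far for this pair
        if key not in best or row[4] > best[key][4]:
            best[key] = row
    return best


for (model, dataset), row in best_by_group(ROWS).items():
    print(f"{model:14} {dataset:18} max_token={row[2]:<3} "
          f"prompt={row[3]} score={row[4]:.5f}")
```

On these rows, the best ARC-easy-dev score belongs to base-model-7b (0.41929 against 0.37894), while the best ARC-challenge-dev score belongs to fa-model-7b (0.39298 against 0.37123). The best rows do not always share the same `max_token` and prompt settings, so this is a summary of the reported numbers rather than a like-for-like comparison.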
How to use
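from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("aidal/Persian-Mistral-7B")
model = AutoModelForCausalLM.from_pretrained("aidal/Persian-Mistral-7B")

# Persian prompt, in English: "What is the capital of Iran?"
input_text = "پایتخت ایران کجاست؟"
inputs = tokenizer(input_text, return_tensors="pt")

# Generate a continuation and decode it, dropping special tokens such as <s> and </s>
outputs = model.generate(**inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))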
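
For a causal language model, `generate` typically returns the prompt tokens followed by the newly generated ones, so the snippet above prints the prompt together with the answer. The sketch below shows one way to return only the new text. `generate_reply` is a hypothetical helper, not part of `transformers`, and the default of 80 new tokens is an arbitrary assumption that simply mirrors one of the `max_token` settings in the benchmark tables.

```python
def generate_reply(model, tokenizer, prompt, max_new_tokens=80):
    """Generate a completion for `prompt` and return only the new text.

    Hypothetical helper: assumes a decoder-only model whose generate()
    output starts with the prompt tokens.
    """
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Number of prompt tokens; everything after this index is newly generated
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
```

With the `model` and `tokenizer` objects loaded as above, `generate_reply(model, tokenizer, "پایتخت ایران کجاست؟")` would return the model's continuation without echoing the question.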
Training and finetuning