Gemma-2-9b-it-tr
Gemma-2-9b-it-tr is a fine-tuned version of google/gemma-2-9b-it, trained on a carefully curated and manually filtered dataset of 55k question-answering and conversational samples in Turkish.
Training Details
Base model: google/gemma-2-9b-it
Training data: A filtered version of metedb/turkish_llm_datasets and a small private dataset of 8k conversational samples on various topics.
Training setup: We performed supervised fine-tuning with LoRA, using rank=128 and lora_alpha=64. Training took 4 days on a single RTX 6000 Ada.
Compared to the base model, we find that Gemma-2-9b-it-tr has superior conversational and reasoning skills in Turkish.
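To make the LoRA hyperparameters above concrete, here is a minimal sketch of what rank and lora_alpha control. It assumes the standard LoRA formulation, in which the low-rank update B·A is scaled by lora_alpha / rank; the card does not say whether a variant such as rank-stabilized scaling was used. The function names and the toy matrices are hypothetical and are not part of the training code.

```python
# Illustrative only: plain-Python LoRA arithmetic under the standard
# scaling assumption (scale = lora_alpha / rank). Not the authors' code.

def lora_scaling(rank: int, lora_alpha: int) -> float:
    """Scale applied to the low-rank update in standard LoRA."""
    return lora_alpha / rank


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added to one d_out x d_in weight matrix.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(W, A, B, rank: int, lora_alpha: int):
    """Return W + (lora_alpha / rank) * (B @ A) for list-of-rows matrices."""
    scale = lora_scaling(rank, lora_alpha)
    delta = matmul(B, A)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]


if __name__ == "__main__":
    # With the values from this card, the update is scaled by 64 / 128.
    print(lora_scaling(rank=128, lora_alpha=64))
```

With rank=128 and lora_alpha=64 the scaling factor comes out below 1, so under this assumption the learned update is damped relative to a setup where lora_alpha equals the rank.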
Usage
You can load and use neuralwork/gemma-2-9b-it-tr as follows.
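import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model in bfloat16 and place it on the available device(s)
model = AutoModelForCausalLM.from_pretrained(
    "neuralwork/gemma-2-9b-it-tr",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("neuralwork/gemma-2-9b-it-tr")

# Prompt (Turkish): "How can I check whether an item occurs in a list in Python?"
messages = [
    {"role": "user", "content": "Python'da bir öğenin bir listede geçip geçmediğini nasıl kontrol edebilirim?"},
]

# Render the conversation with the model's chat template
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

# The template already inserts the special tokens, so do not add them again
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)

# Decode only the newly generated tokens. Slicing the decoded string by
# len(prompt) is unreliable because the prompt contains special tokens
# that skip_special_tokens=True removes from the decoded text.
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print(response)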
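For multi-turn use, the messages list grows by one entry per turn. The sketch below is a small helper for building that list; it assumes, as is our understanding of the Gemma 2 chat template, that roles must alternate between "user" and "assistant", starting with "user", and that a system role is not accepted. The helper names validate_turns and add_turn are hypothetical and are not part of transformers.

```python
# Illustrative helper for building a multi-turn `messages` list.
# Assumption: the chat template expects strictly alternating
# user/assistant roles, beginning with a user turn.

def validate_turns(messages):
    """Raise ValueError unless roles alternate user/assistant, starting with user."""
    for i, message in enumerate(messages):
        expected = "user" if i % 2 == 0 else "assistant"
        if message.get("role") != expected:
            raise ValueError(
                f"turn {i}: expected role {expected!r}, got {message.get('role')!r}"
            )
    return messages


def add_turn(messages, role, content):
    """Return a new, validated list with one more turn appended."""
    return validate_turns(messages + [{"role": role, "content": content}])


if __name__ == "__main__":
    history = add_turn([], "user", "Merhaba!")
    history = add_turn(history, "assistant", "Merhaba! Size nasıl yardımcı olabilirim?")
    history = add_turn(history, "user", "Python'da liste nasıl sıralanır?")
    print(len(history))
```

The resulting list can be passed to tokenizer.apply_chat_template in the same way as the single-turn messages list above.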
Evaluation results
- 5-shot on MMLU_TR_V0.2: 0.612 (self-reported)
- 0-shot on Truthful_QA_V0.2: 0.558 (self-reported)
- 25-shot on ARC_TR_V0.2: 0.564 (self-reported)
- 10-shot on HellaSwag_TR_V0.2: 0.565 (self-reported)
- 5-shot on GSM8K_TR_V0.2: 0.621 (self-reported)
- 5-shot on Winogrande_TR_V0.2: 0.621 (self-reported)