---
library_name: transformers
license: gemma
language:
- tr
base_model:
- google/gemma-2-9b-it
pipeline_tag: text-generation
model-index:
- name: neuralwork/gemma-2-9b-it-tr
  results:
  - task:
      type: multiple-choice
    dataset:
      type: multiple-choice
      name: MMLU_TR_V0.2
    metrics:
    - name: 5-shot
      type: 5-shot
      value: 0.6117
      verified: true
  - task:
      type: multiple-choice
    dataset:
      type: multiple-choice
      name: Truthful_QA_V0.2
    metrics:
    - name: 0-shot
      type: 0-shot
      value: 0.5583
      verified: true
  - task:
      type: multiple-choice
    dataset:
      type: multiple-choice
      name: ARC_TR_V0.2
    metrics:
    - name: 25-shot
      type: 25-shot
      value: 0.5640
      verified: true
  - task:
      type: multiple-choice
    dataset:
      type: multiple-choice
      name: HellaSwag_TR_V0.2
    metrics:
    - name: 10-shot
      type: 10-shot
      value: 0.5646
      verified: true
  - task:
      type: multiple-choice
    dataset:
      type: multiple-choice
      name: GSM8K_TR_V0.2
    metrics:
    - name: 5-shot
      type: 5-shot
      value: 0.6211
      verified: true
  - task:
      type: multiple-choice
    dataset:
      type: multiple-choice
      name: Winogrande_TR_V0.2
    metrics:
    - name: 5-shot
      type: 5-shot
      value: 0.6209
      verified: true
---

# Gemma-2-9b-it-tr

Gemma-2-9b-it-tr is a fine-tuned version of [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it), trained on a carefully curated and manually filtered dataset of 55k question-answering and conversational samples in Turkish.

## Training Details

**Base model:** [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it)

**Training data:** A filtered version of [metedb/turkish_llm_datasets](https://huggingface.co/datasets/metedb/turkish_llm_datasets/) and a small private dataset of 8k conversational samples on various topics.

**Training setup:** We performed supervised fine-tuning with LoRA (`rank=128`, `lora_alpha=64`); a sketch of this configuration is shown after the usage example below. Training took 4 days on a single RTX 6000 Ada. Compared to the base model, we find Gemma-2-9b-it-tr has superior conversational and reasoning skills.

## Usage

You can load and use `neuralwork/gemma-2-9b-it-tr` as follows:

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model in bfloat16 and spread it across available devices
model = AutoModelForCausalLM.from_pretrained(
    "neuralwork/gemma-2-9b-it-tr",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("neuralwork/gemma-2-9b-it-tr")

messages = [
    # "How can I check whether an item occurs in a list in Python?"
    {"role": "user", "content": "Python'da bir öğenin bir listede geçip geçmediğini nasıl kontrol edebilirim?"},
]

# Format the conversation with Gemma's chat template
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# The chat template already prepends <bos>, so don't add special tokens again
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
    top_p=0.9
)

# Decode only the newly generated tokens, i.e. skip the prompt portion
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```
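The training code is not part of this release, but the LoRA setup described under Training Details can be sketched with the [`peft`](https://github.com/huggingface/peft) library roughly as follows. This is a minimal illustration, not the authors' script: only `r=128` and `lora_alpha=64` come from this card, while the target modules, dropout, and bias settings are assumptions.

```py
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    r=128,              # LoRA rank, as reported above
    lora_alpha=64,      # LoRA scaling factor, as reported above
    lora_dropout=0.05,  # assumption, not reported in this card
    bias="none",        # assumption, not reported in this card
    task_type="CAUSAL_LM",
    # Assumed attention projections; the actual target modules are not reported
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-9b-it")
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()  # only the adapter weights are trainable
```

The wrapped model can then be trained with a standard supervised fine-tuning loop (for example, `SFTTrainer` from `trl`) on the chat-formatted samples.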