Gemma 3 270M Turkish Instructions Fine-tuned

This model is a fine-tuned version of Google's Gemma 3 270M IT, trained on the SoAp9035/turkish_instructions dataset using direct fine-tuning.

Model Details

  • Base model: google/gemma-3-270m-it-qat-q4_0-unquantized
  • Fine-tune dataset: SoAp9035/turkish_instructions (Turkish instruction-format dataset), formatted with the google/gemma-3-270m-it chat template
  • Fine-tune type: Direct fine-tuning (Causal LM)
  • Precision: BF16 when the GPU supports it, otherwise full precision (FP32)
  • Max token length: 256
  • Batch size: 2 (effective batch size = 8 with gradient accumulation)
  • Number of epochs: 2
  • Optimizer: AdamW
  • Scheduler: Cosine learning-rate schedule
  • Evaluation: Every 100 steps; best checkpoint selected by eval_loss
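
The chat-template formatting mentioned above is not shown in the card; a minimal sketch of the Gemma 3 turn format follows. The field names `instruction` and `response` are assumptions about the dataset schema, not confirmed column names:

```python
# Sketch of Gemma 3 chat-template formatting for one training example.
# The field names ("instruction", "response") are assumptions; adjust
# them to the actual columns of SoAp9035/turkish_instructions.
def format_example(instruction: str, response: str) -> str:
    return (
        "<bos><start_of_turn>user\n"
        f"{instruction}<end_of_turn>\n"
        "<start_of_turn>model\n"
        f"{response}<end_of_turn>"
    )

text = format_example(
    "Türkiye'nin başkenti neresidir?",
    "Türkiye'nin başkenti Ankara'dır.",
)
print(text)
```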

Usage Example


import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_NAME = "Dbmaxwell/gemma3-270m-turkish-instructions"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
# Ensure a pad token exists (some tokenizers do not define one)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.pad_token_id = tokenizer.eos_token_id
tokenizer.padding_side = "right"

model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

def generate_response(prompt, max_new_tokens=200):
    # Wrap the prompt in the Gemma 3 chat template; add_special_tokens=False
    # avoids a duplicate <bos>, since the template already includes one
    formatted_prompt = f"<bos><start_of_turn>user\n{prompt}<end_of_turn>\n<start_of_turn>model\n"
    inputs = tokenizer(formatted_prompt, return_tensors="pt", add_special_tokens=False).to(device)
    with torch.no_grad():
        outputs = model.generate(
            inputs.input_ids,
            max_new_tokens=max_new_tokens,
            temperature=0.3,
            top_p=0.8,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
            repetition_penalty=1.2,
            no_repeat_ngram_size=3,
        )
    # Decode only the newly generated tokens, then trim at the end-of-turn marker
    response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
    return response.split("<end_of_turn>")[0].strip()

test_prompts = [
    "Merhaba! Ben bir AI asistanım. Sana nasıl yardımcı olabilirim?",  # "Hello! I'm an AI assistant. How can I help you?"
    "Python'da for döngüsü nasıl yazılır?",  # "How do you write a for loop in Python?"
    "İstanbul Türkiye'nin en büyük şehridir. Kısa bilgi ver.",  # "Istanbul is Turkey's largest city. Give brief information."
    "Makine öğrenmesi nedir? Basit açıklama yap.",  # "What is machine learning? Explain simply."
    "5 artı 3 çarpı 2 kaçtır?",  # "What is 5 plus 3 times 2?"
    "Türkiye'nin başkenti neresidir?"  # "What is the capital of Turkey?"
]

for i, prompt in enumerate(test_prompts, 1):
    print(f"\n{i} Question: {prompt}")
    print(f"Answer: {generate_response(prompt, max_new_tokens=100)}")
    print("-" * 60)
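
The prompt formatting and response trimming used in `generate_response` can be checked without loading the model. A small self-contained sketch of just that string handling:

```python
# Standalone versions of the prompt/response string handling used above,
# testable without downloading the model or a GPU.
def format_prompt(prompt: str) -> str:
    # Gemma 3 single-turn chat template
    return f"<bos><start_of_turn>user\n{prompt}<end_of_turn>\n<start_of_turn>model\n"

def extract_response(decoded: str) -> str:
    # Keep only the text before the model's end-of-turn marker
    return decoded.split("<end_of_turn>")[0].strip()

print(extract_response("Ankara.<end_of_turn>trailing text"))  # → Ankara.
```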
Model Stats

  • Format: Safetensors
  • Model size: 268M params
  • Tensor type: BF16
