# Gemma 3 270M Turkish Instructions Fine-tuned

This model is a fine-tuned version of Google's Gemma 3 270M IT, trained via direct fine-tuning on the `SoAp9035/turkish_instructions` dataset.
## Model Details
- Base model: `google/gemma-3-270m-it-qat-q4_0-unquantized`
- Fine-tune dataset: Turkish instruction-format dataset (`SoAp9035/turkish_instructions`), formatted with the chat template of `google/gemma-3-270m-it-qat-q4_0-unquantized`
- Fine-tune type: Direct fine-tuning (causal LM)
- Precision: BF16 when the GPU supports it, otherwise full precision (FP32)
- Max token length: 256
- Batch size: 2 (effective batch size 8 via gradient accumulation)
- Number of epochs: 2
- Optimizer: AdamW
- Scheduler: Cosine learning rate
- Evaluation: every 100 steps; best model selected by `eval_loss`
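The batch-size and scheduler settings above can be sketched numerically. This is an illustrative sketch only: the peak learning rate (`2e-5`) and step count (`1000`) are assumed values not stated in this card, and the accumulation factor of 4 is inferred from batch size 2 and effective batch size 8.

```python
import math

# Per-device batch 2 accumulated over 4 steps gives the effective
# batch size of 8 stated above (4 is inferred, not stated).
PER_DEVICE_BATCH = 2
GRAD_ACCUM_STEPS = 4
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS  # 8

def cosine_lr(step, total_steps, peak_lr, min_lr=0.0):
    """Cosine-annealed learning rate (no warmup): starts at peak_lr,
    decays smoothly to min_lr by total_steps."""
    progress = step / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

# Shape of the schedule over a hypothetical 1000-step run:
print(cosine_lr(0, 1000, 2e-5))     # peak at the start
print(cosine_lr(500, 1000, 2e-5))   # ~half the peak midway
print(cosine_lr(1000, 1000, 2e-5))  # ~0 at the end
```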
## Usage Example

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_NAME = "Dbmaxwell/gemma3-270m-turkish-instructions"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.pad_token_id = tokenizer.eos_token_id
tokenizer.padding_side = "right"

model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

def generate_response(prompt, max_new_tokens=200):
    # Wrap the prompt in Gemma's chat turn format.
    formatted_prompt = f"<bos><start_of_turn>user\n{prompt}<end_of_turn>\n<start_of_turn>model\n"
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        outputs = model.generate(
            inputs.input_ids,
            max_new_tokens=max_new_tokens,
            temperature=0.3,
            top_p=0.8,
            do_sample=True,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
            repetition_penalty=1.2,
            no_repeat_ngram_size=3,
        )
    # Decode only the newly generated tokens, then trim at the end-of-turn marker.
    response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
    return response.split("<end_of_turn>")[0].strip()

test_prompts = [
    "Merhaba! Ben bir AI asistanım. Sana nasıl yardımcı olabilirim?",
    "Python'da for döngüsü nasıl yazılır?",
    "İstanbul Türkiye'nin en büyük şehridir. Kısa bilgi ver.",
    "Makine öğrenmesi nedir? Basit açıklama yap.",
    "5 artı 3 çarpı 2 kaçtır?",
    "Türkiye'nin başkenti neresidir?"
]

for i, prompt in enumerate(test_prompts, 1):
    print(f"\n{i}. Question: {prompt}")
    print(f"Answer: {generate_response(prompt, max_new_tokens=100)}")
    print("-" * 60)
```
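The hand-built `formatted_prompt` above follows Gemma's turn-based chat format. The same string can be produced by a small standalone helper that also extends to multi-turn conversations; `build_gemma_prompt` is an illustrative name, not part of the model's API:

```python
def build_gemma_prompt(turns):
    """Build a Gemma-format prompt from (role, text) pairs.

    Roles are "user" or "model"; the trailing "<start_of_turn>model\n"
    cues the model to generate the next assistant turn.
    """
    parts = ["<bos>"]
    for role, text in turns:
        parts.append(f"<start_of_turn>{role}\n{text}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")
    return "".join(parts)

prompt = build_gemma_prompt([("user", "Türkiye'nin başkenti neresidir?")])
```

For a single user turn this reproduces exactly the string built inside `generate_response`; if the tokenizer ships a chat template, `tokenizer.apply_chat_template(...)` can produce it as well.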
## Model Tree for Dbmaxwell/gemma3-270m-turkish-instructions

- Base model: `google/gemma-3-270m`
- Fine-tuned from: `google/gemma-3-270m-it`