---
library_name: transformers
license: gemma
language:
- tr
base_model:
- google/gemma-2-9b-it
pipeline_tag: text-generation
model-index:
- name: neuralwork/gemma-2-9b-it-tr
  results:
  - task:
      type: multiple-choice
    dataset:
      type: multiple-choice
      name: MMLU_TR_V0.2
    metrics:
    - name: 5-shot
      type: 5-shot
      value: 0.6117
      verified: true
  - task:
      type: multiple-choice
    dataset:
      type: multiple-choice
      name: Truthful_QA_V0.2
    metrics:
    - name: 0-shot
      type: 0-shot
      value: 0.5583
      verified: true
  - task:
      type: multiple-choice
    dataset:
      type: multiple-choice
      name: ARC_TR_V0.2
    metrics:
    - name: 25-shot
      type: 25-shot
      value: 0.5640
      verified: true
  - task:
      type: multiple-choice
    dataset:
      type: multiple-choice
      name: HellaSwag_TR_V0.2
    metrics:
    - name: 10-shot
      type: 10-shot
      value: 0.5646
      verified: true
  - task:
      type: multiple-choice
    dataset:
      type: multiple-choice
      name: GSM8K_TR_V0.2
    metrics:
    - name: 5-shot
      type: 5-shot
      value: 0.6211
      verified: true
  - task:
      type: multiple-choice
    dataset:
      type: multiple-choice
      name: Winogrande_TR_V0.2
    metrics:
    - name: 5-shot
      type: 5-shot
      value: 0.6209
      verified: true
---

# Gemma-2-9b-it-tr

Gemma-2-9b-it-tr is a fine-tuned version of [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it), trained on a carefully curated and manually filtered dataset of 55k Turkish question-answering and conversational samples.


## Training Details
- Base model: [google/gemma-2-9b-it](https://huggingface.co/google/gemma-2-9b-it)
- Training data: A filtered version of [metedb/turkish_llm_datasets](https://huggingface.co/datasets/metedb/turkish_llm_datasets/) and a small private dataset of 8k conversational samples on various topics.
- Training setup: Supervised fine-tuning with LoRA (`rank=128`, `lora_alpha=64`). Training took 4 days on a single RTX 6000 Ada.
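
To make the LoRA settings concrete, here is a minimal sketch of the arithmetic behind them. It assumes the standard LoRA formulation, where the low-rank update is scaled by `lora_alpha / rank`. The helper names and the 4096x4096 layer shape are hypothetical and chosen for illustration; they are not read from the model's configuration or from the actual training script.

```python
# Illustrative LoRA arithmetic. The layer shape below is hypothetical,
# not taken from the model's config.
RANK = 128        # `rank` from the training setup
LORA_ALPHA = 64   # `lora_alpha` from the training setup


def lora_scaling(rank: int, alpha: int) -> float:
    """Scaling factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) weight matrix."""
    # A has shape (rank, d_in) and B has shape (d_out, rank).
    return rank * (d_in + d_out)


scaling = lora_scaling(RANK, LORA_ALPHA)
extra = lora_param_count(d_in=4096, d_out=4096, rank=RANK)
full = 4096 * 4096

print(f"scaling = {scaling}")
print(f"adapter params per layer = {extra:,} ({extra / full:.2%} of the full matrix)")
```

With these settings the update is scaled by 0.5, and each adapted matrix of the assumed size gains about one million trainable parameters.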

Compared to the base model, we find that Gemma-2-9b-it-tr has superior conversational and reasoning skills.
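
The benchmark scores listed in this card's metadata can be summarized with a single number. The sketch below takes an unweighted mean, which is an assumption on our part: the tasks use different shot counts and the card itself does not report an average. The `scores` dictionary is a hypothetical name that simply restates the metadata values.

```python
# Scores copied from the model-index metadata of this card.
scores = {
    "MMLU_TR_V0.2 (5-shot)": 0.6117,
    "Truthful_QA_V0.2 (0-shot)": 0.5583,
    "ARC_TR_V0.2 (25-shot)": 0.5640,
    "HellaSwag_TR_V0.2 (10-shot)": 0.5646,
    "GSM8K_TR_V0.2 (5-shot)": 0.6211,
    "Winogrande_TR_V0.2 (5-shot)": 0.6209,
}

# Unweighted mean across the six benchmarks (an assumption, see above).
mean_score = sum(scores.values()) / len(scores)

for name, value in scores.items():
    print(f"{name:<30} {value:.4f}")
print(f"{'Unweighted mean':<30} {mean_score:.4f}")
```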

## Usage
You can load and use `neuralwork/gemma-2-9b-it-tr` as follows:

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "neuralwork/gemma-2-9b-it-tr",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained("neuralwork/gemma-2-9b-it-tr")

messages = [
    # "How can I check whether an item is in a list in Python?"
    {"role": "user", "content": "Python'da bir öğenin bir listede geçip geçmediğini nasıl kontrol edebilirim?"},
]

# Apply the chat template and tokenize in one step, so the <bos> token
# that the template already adds is not inserted a second time.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens. Slicing the decoded string by
# len(prompt) is unreliable because special tokens are stripped on decode.
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)
```
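
For readers curious what `apply_chat_template` produces, the following pure-Python sketch approximates the Gemma turn format. It is an illustration under assumptions, not the tokenizer's implementation: `format_gemma_prompt` is a hypothetical helper, and the Jinja template shipped with the tokenizer remains the authoritative source.

```python
def format_gemma_prompt(messages, add_generation_prompt=True):
    """Approximate the Gemma chat format (hypothetical helper, for illustration)."""
    parts = ["<bos>"]
    for message in messages:
        # Gemma uses the role name "model" for assistant turns.
        role = "model" if message["role"] == "assistant" else message["role"]
        content = message["content"].strip()
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")
    if add_generation_prompt:
        # Open a model turn so that generation continues as the assistant.
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


messages = [{"role": "user", "content": "Merhaba!"}]
prompt = format_gemma_prompt(messages)
print(prompt)
```

Because the formatted prompt already begins with `<bos>` and contains turn markers that are treated as special tokens, tokenizing it again with default settings or slicing decoded text by the prompt's character length can give misleading results. Working with token IDs avoids both issues.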