Model Card for Geerath/google-gemma-7b-it-finetuned-web-questions

This model card describes a fine-tuned version of the 7B instruction-tuned Gemma model (google/gemma-7b-it).

Model Details

This is a general question-answer model finetuned on the web_questions dataset.

Model Description

This is a general question-answering LLM, obtained by fine-tuning Gemma on the web_questions dataset. Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants. Gemma models are well suited to a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources, such as a laptop, a desktop, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.

  • Developed by: Geerath Bhat
  • Model type: Fine-tuned Instruct LLM.
  • Language(s) (NLP): English
  • License: Not specified
  • Finetuned from model: google/gemma-7b-it

Usage

The Gemma authors at Google have shared code snippets for getting started quickly with the model. First make sure to pip install -U transformers, then adapt the snippet below to your use case. Loading with device_map="auto" and a quantization config additionally requires the accelerate and bitsandbytes packages.

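import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

hf_model_repo = "Geerath/google-gemma-7b-it-finetuned-web-questions"

# Get the tokenizer
tokenizer = AutoTokenizer.from_pretrained(hf_model_repo)

# Quantization config. The original snippet used bnb_config without defining it;
# the 4-bit settings below are an assumption, adjust them to your hardware.
bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_compute_dtype=torch.float16)

# Load the model
model = AutoModelForCausalLM.from_pretrained(hf_model_repo,
                                             quantization_config=bnb_config,
                                             device_map="auto")

prompt = ["Question: Tell me something about IISc\n\nAnswer:\n"]

# Generate response (in a notebook, %%time may be placed on the first line of the cell)
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.to(model.device)
outputs = model.generate(input_ids=input_ids,
                         max_new_tokens=200,
                         do_sample=True,
                         temperature=0.2)

result = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]

# Keep only the first question/answer pair, dropping any later "Question:" the model generates
result = "Question:" + result.split("Question:")[1]

# Print the result
print(f"Generated response:\n{result}")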
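The prompt in the snippet above follows a "Question: ... Answer:" layout. As a minimal sketch, the two helpers below build a prompt in that layout and pull the answer text out of a decoded generation. The function names build_prompt and extract_answer are hypothetical and are not part of this repository or of transformers.

```python
def build_prompt(question: str) -> str:
    """Format a question in the "Question: ... Answer:" layout used above."""
    return f"Question: {question}\n\nAnswer:\n"


def extract_answer(generated: str) -> str:
    """Return the text after the first "Answer:" marker.

    Anything from a later "Question:" onward is dropped, since the model
    may keep generating further question/answer pairs.
    """
    _, _, tail = generated.partition("Answer:")
    answer, _, _ = tail.partition("Question:")
    return answer.strip()


if __name__ == "__main__":
    prompt = build_prompt("Tell me something about IISc")
    # Stand-in for a decoded model output; not a real generation.
    decoded = prompt + "A placeholder answer.\n\nQuestion: something else"
    print(extract_answer(decoded))
```

The helpers only manipulate strings, so they can be used with any decoding call that returns the prompt followed by the generated text.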

Fine-tuning the model

You can find fine-tuning scripts and a notebook under the examples/ directory of the google/gemma-7b repository. To adapt them to this model, change the model ID to google/gemma-7b-it. That repository provides:

  • A script to perform Supervised Fine-Tuning (SFT) on UltraChat dataset using QLoRA
  • A script to perform SFT using FSDP on TPU devices
  • A notebook that you can run on a free-tier Google Colab instance to perform SFT on English quotes dataset

How to Get Started with the Model

Use the code provided by google/gemma-7b-it to get started with this finetuned model.

Training Details

Training Data

web_questions

Training Procedure

The model was trained with SFTTrainer using the following TrainingArguments:

num_train_epochs=1, # adjust based on the data size
per_device_train_batch_size=4, # use 2 or 4 if you have less GPU RAM
per_device_eval_batch_size=4,
optim="paged_adamw_32bit",
#gradient_accumulation_steps=2,
save_strategy="epoch", 
evaluation_strategy="epoch",
learning_rate=2e-4,
logging_steps=1,
fp16=True,
weight_decay=0.01,
lr_scheduler_type="cosine",
seed=42,
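To illustrate what learning_rate=2e-4 combined with lr_scheduler_type="cosine" implies, the sketch below computes a half-cosine decay from the base learning rate down to zero. It assumes no warmup steps, and total_steps is a hypothetical value, since the card does not state the number of optimizer steps.

```python
import math


def cosine_lr(step: int, total_steps: int, base_lr: float = 2e-4) -> float:
    """Half-cosine decay from base_lr at step 0 to 0 at total_steps (no warmup assumed)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total_steps = 100  # hypothetical; depends on dataset size and batch size
    for step in (0, 25, 50, 75, 100):
        print(step, cosine_lr(step, total_steps))
```

The rate starts at the full 2e-4, reaches half of it at the midpoint, and falls to zero at the final step.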

Evaluation

Evaluated on the test set of the web_questions dataset.

Testing Data

Currently tested on the test set of the web_questions dataset. Results on other datasets will be added later.

Metrics

  • Perplexity
  • Accuracy
  • F1 Score
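The card does not say how accuracy and F1 were computed. One common convention for question answering, shown here purely as an assumption, is exact-match accuracy against any of the reference answers plus a token-overlap F1 score. The function names and the simple lowercase/whitespace normalization are hypothetical choices for this sketch.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase and split on whitespace (a deliberately simple normalization)."""
    return text.lower().split()


def exact_match(prediction: str, references: list[str]) -> bool:
    """True if the prediction matches any reference answer after normalization."""
    return any(normalize(prediction) == normalize(ref) for ref in references)


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("Some Answer", ["some answer", "another answer"]))
    print(token_f1("a b", "b c"))
```

Accuracy over a test set would then be the fraction of examples for which exact_match is true, and the F1 score the mean of the best token_f1 over each example's references.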

Results

After 2 epochs, the training loss was 1.114500 and the validation loss was 1.592121.

Perplexity on test data from web_questions dataset: 5.13
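Perplexity is conventionally the exponential of the mean per-token cross-entropy loss. The sketch below assumes that definition (the card does not state how the figure was computed) and uses hypothetical function names. Under that assumption, the reported validation loss of 1.592121 corresponds to a perplexity of roughly 4.91, close to but not the same as the 5.13 measured on the test split.

```python
import math


def perplexity_from_loss(mean_nll: float) -> float:
    """Perplexity as exp of the mean negative log-likelihood per token."""
    return math.exp(mean_nll)


def perplexity_from_token_losses(token_losses: list[float]) -> float:
    """Average per-token losses first, then exponentiate."""
    return math.exp(sum(token_losses) / len(token_losses))


if __name__ == "__main__":
    # Validation loss reported in this card
    print(round(perplexity_from_loss(1.592121), 2))
```

Note that averaging must happen over tokens before exponentiating; averaging per-example perplexities gives a different number.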

Model size: 8.54B params (Safetensors, tensor type FP16)