Model Card for Geerath/google-gemma-7b-it-finetuned-web-questions
This model card describes a fine-tuned version of the 7B instruction-tuned Gemma model (google/gemma-7b-it).
Model Details
This is a general question-answering model fine-tuned on the web_questions dataset.
Model Description
This is a general question-answering LLM, fine-tuned from Gemma on the web_questions dataset.
Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants. Gemma models are well suited to a variety of text generation tasks, including question answering, summarization, and reasoning. Their relatively small size makes it possible to deploy them in environments with limited resources, such as a laptop, a desktop, or your own cloud infrastructure, democratizing access to state-of-the-art AI models and helping foster innovation for everyone.
- Developed by: Geerath Bhat
- Model type: Fine-tuned Instruct LLM.
- Language(s) (NLP): English
- License: Not specified
- Finetuned from model: google/gemma-7b-it
Usage
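Google has shared code snippets for getting started quickly with Gemma models, and the snippet below follows the same pattern. First make sure to pip install -U transformers (the snippet also needs accelerate and bitsandbytes for device_map="auto" and the quantization config), then copy the snippet and adapt it to your use case.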
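import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

hf_model_repo = "Geerath/google-gemma-7b-it-finetuned-web-questions"

# Assumption: this card does not show bnb_config; a 4-bit NF4 setup is one common choice
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Get the tokenizer
tokenizer = AutoTokenizer.from_pretrained(hf_model_repo)
# Load the model
model = AutoModelForCausalLM.from_pretrained(hf_model_repo,
                                             quantization_config=bnb_config,
                                             device_map="auto")
prompt = ["Question: Tell me something about IISc\n\nAnswer:\n"]
# Generate response (in a notebook, %%time must be the first line of the cell)
input_ids = tokenizer(prompt, return_tensors="pt", truncation=True).input_ids.to(model.device)
outputs = model.generate(input_ids=input_ids,
                         max_new_tokens=200,
                         do_sample=True,
                         temperature=0.2)
result = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
# Keep the text from the first "Question:" marker onward
result = "Question:" + result.split("Question:")[1]
# Print the result
print(f"Generated response:\n{result}")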
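The prompt layout and the post-processing used above can be wrapped in two small helpers. This is a minimal sketch: build_prompt and extract_answer are hypothetical names, not part of the model repository, and the template is simply the "Question: ... Answer:" layout from the snippet.

```python
# Prompt layout taken from the usage snippet in this card.
PROMPT_TEMPLATE = "Question: {question}\n\nAnswer:\n"


def build_prompt(question: str) -> str:
    """Wrap a raw question in the Question/Answer layout (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(question=question.strip())


def extract_answer(generated: str) -> str:
    """Return the text after the first 'Answer:' marker.

    Falls back to the whole stripped string if the marker is missing.
    """
    _, marker, tail = generated.partition("Answer:")
    return tail.strip() if marker else generated.strip()
```

For example, build_prompt("Tell me something about IISc") reproduces the prompt string used in the snippet, and extract_answer can be applied to the decoded output to drop the echoed question.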
Fine-tuning the model
You can find fine-tuning scripts and a notebook under the examples/ directory of the google/gemma-7b repository. To adapt them to this model, change the model ID to google/gemma-7b-it.
That repository provides:
- A script to perform Supervised Fine-Tuning (SFT) on the UltraChat dataset using QLoRA
- A script to perform SFT using FSDP on TPU devices
- A notebook that you can run on a free-tier Google Colab instance to perform SFT on the English quotes dataset
How to Get Started with the Model
Use the code provided by google/gemma-7b-it to get started with this finetuned model.
Training Details
Training Data
web_questions
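As an illustration, web_questions records could be turned into training text along the following lines. The card does not state the exact training format, so the template here (the same Question/Answer layout as the usage prompt) is an assumption, and format_example and load_formatted_split are hypothetical names. The question and answers fields are assumed to match the dataset as published on the Hugging Face Hub.

```python
from typing import Dict, List


def format_example(example: Dict[str, object]) -> str:
    """Render one record as prompt plus target text (assumed layout)."""
    question = str(example["question"]).strip()
    answers: List[str] = list(example["answers"])
    # Multiple reference answers are joined into one comma-separated target.
    return f"Question: {question}\n\nAnswer:\n{', '.join(answers)}"


def load_formatted_split(split: str = "train") -> List[str]:
    """Download a split and format every record (needs network access)."""
    # Imported lazily so format_example works without `datasets` installed.
    from datasets import load_dataset

    dataset = load_dataset("web_questions", split=split)
    return [format_example(row) for row in dataset]
```

Joining multiple answers with commas is only one option; training on the first answer alone would be an equally plausible choice.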
Training Procedure
The model was trained with SFTTrainer using the following TrainingArguments:
num_train_epochs=1, # adjust based on the data size
per_device_train_batch_size=4, # use 2 or 4 if you have less GPU RAM
per_device_eval_batch_size=4,
optim="paged_adamw_32bit",
#gradient_accumulation_steps=2,
save_strategy="epoch",
evaluation_strategy="epoch",
learning_rate=2e-4,
logging_steps=1,
fp16=True,
weight_decay=0.01,
lr_scheduler_type="cosine",
seed=42,
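To make the learning_rate=2e-4 and lr_scheduler_type="cosine" settings concrete, here is a minimal sketch of a cosine decay. It illustrates the shape of the schedule and is not the Trainer's internal code. cosine_lr and total_steps are hypothetical names, and zero warmup is an assumption, since no warmup setting is listed above.

```python
import math


def cosine_lr(step: int, total_steps: int, base_lr: float = 2e-4) -> float:
    """Learning rate at `step` for a half-cosine decay from base_lr to 0."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the rate starts at 2e-4, passes through 1e-4 halfway through training, and reaches 0 at the final step.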
Evaluation
The model was evaluated on the test set of the web_questions dataset.
Testing Data
The model has so far been tested only on the test set of the web_questions dataset. Results on other datasets will be added later.
Metrics
- Perplexity
- Accuracy
- F1 Score
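The card does not say how accuracy and F1 are computed. One common convention for question answering is exact-match accuracy and token-overlap F1 after light normalization, taking the best score over the reference answers. The sketch below assumes that convention; all function names are hypothetical.

```python
import re
import string
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction: str, references: List[str]) -> float:
    """1.0 if the normalized prediction equals any normalized reference."""
    pred = normalize(prediction)
    return float(any(pred == normalize(ref) for ref in references))


def token_f1(prediction: str, references: List[str]) -> float:
    """Best token-overlap F1 between the prediction and any reference."""
    pred = normalize(prediction)
    best = 0.0
    for ref in references:
        ref_tokens = normalize(ref)
        # Multiset intersection counts each shared token at most min(count) times.
        overlap = sum((Counter(pred) & Counter(ref_tokens)).values())
        if overlap == 0:
            continue
        precision = overlap / len(pred)
        recall = overlap / len(ref_tokens)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best
```

Averaging exact_match and token_f1 over the test set would give dataset-level accuracy and F1 under this convention.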
Results
After 2 epochs, the training loss was 1.114500 and the validation loss was 1.592121.
Perplexity on test data from web_questions dataset: 5.13
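Perplexity is conventionally the exponential of the mean per-token cross-entropy loss (in nats). As a sanity check under that assumption, exponentiating the reported validation loss gives roughly 4.91, which is close to the reported test-set perplexity of 5.13 (measured on different data). perplexity_from_loss is a hypothetical helper name.

```python
import math


def perplexity_from_loss(mean_cross_entropy: float) -> float:
    """Convert a mean per-token cross-entropy loss (nats) to perplexity."""
    return math.exp(mean_cross_entropy)


# Validation loss reported in this card.
validation_perplexity = perplexity_from_loss(1.592121)
print(f"Validation perplexity: {validation_perplexity:.2f}")
```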