

Fine-Tuned LLaMA 3.1 Model on Stack Exchange Math Dataset

This repository contains a LLaMA 3.1 model fine-tuned with LoRA on a dataset collected from Stack Exchange Math. The model is designed to answer mathematical questions in a style similar to Stack Exchange answers.

Model Details

Data Preparation

The dataset used for fine-tuning includes 1000 samples collected from Stack Exchange Math. Each sample consists of a question and its accepted answer.
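For reference, each raw sample exposes the question and its accepted answer as separate fields (the same field names used in the training code below). A quick way to inspect one, assuming the dataset referenced later in the card:

from datasets import load_dataset

sample = load_dataset("blesspearl/stackexchange-math-sample", split="all")[0]
print(sample["question_body"])    # the Stack Exchange question
print(sample["accepted_answer"])  # its accepted answer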

Preprocessing

The data was preprocessed using the following steps:

  1. Loading the dataset from Hugging Face.
  2. Shuffling the dataset and selecting 1000 samples.
  3. Formatting the data into a chat template suitable for training.

Training Details

Libraries and Dependencies

from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments, pipeline, logging
from peft import LoraConfig, PeftModel, prepare_model_for_kbit_training, get_peft_model
from google.colab import drive, userdata
import os, torch, wandb
from trl import SFTTrainer, setup_chat_format
from huggingface_hub import login
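
The card does not pin package versions; a typical Colab setup for this stack might be installed along these lines (the package list is an assumption, add version pins as needed):

# Assumed setup cell, not part of the original card
!pip install -q transformers datasets peft trl bitsandbytes accelerate wandb huggingface_hub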

Loading Data and Model

model_name = "meta-llama/Meta-Llama-3.1-8B"
dataset_name = "blesspearl/stackexchange-math-sample"

torch_dtype = torch.float16
attn_implementation = "eager"
# Log in to Weights & Biases (API key stored in Colab secrets) and start a tracking run
wandb.login(key=userdata.get("WANDB_API_KEY"))
run = wandb.init(
    project='Fine-tuning LLama-3.1-8b on math-stack-exchange',
    job_type="training",
    anonymous="allow"
)

# 4-bit NF4 quantization so the 8B base model fits in limited GPU memory
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch_dtype,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    attn_implementation=attn_implementation
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
# Attach a ChatML-style chat template to both the model and the tokenizer
model, tokenizer = setup_chat_format(model, tokenizer)

LoRA Configuration

# LoRA adapters on the attention and MLP projection layers
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=['up_proj', 'down_proj', 'gate_proj', 'k_proj', 'q_proj', 'v_proj', 'o_proj']
)
model = get_peft_model(model, peft_config)
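
As a quick sanity check (not in the original card), PEFT can report how small the trainable fraction is under this configuration:

# Prints trainable vs. total parameter counts for the LoRA-wrapped model
model.print_trainable_parameters()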

Data Preparation

dataset = load_dataset(dataset_name, split="all")
dataset = dataset.shuffle(seed=65).select(range(1000))

def format_chat_template(row):
    # Render each question/answer pair as a single chat-formatted training string
    row_json = [{"role": "user", "content": row["question_body"]},
                {"role": "assistant", "content": row["accepted_answer"]}]
    row["text"] = tokenizer.apply_chat_template(row_json, tokenize=False)
    return row

dataset = dataset.map(format_chat_template, num_proc=4)
dataset = dataset.train_test_split(test_size=0.2)
dataset = dataset.remove_columns(["question_body", "accepted_answer"])
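
Since setup_chat_format applies a ChatML-style template, each formatted "text" entry should look roughly like the sketch below (the exact special tokens depend on the trl version in use):

print(dataset["train"][0]["text"])
# Roughly:
# <|im_start|>user
# ...question body...<|im_end|>
# <|im_start|>assistant
# ...accepted answer...<|im_end|>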

Training Configuration

training_arguments = TrainingArguments(
    output_dir="math-stackexchange",
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,
    optim="paged_adamw_32bit",
    num_train_epochs=1,
    evaluation_strategy="steps",
    eval_steps=0.2,
    logging_steps=1,
    warmup_steps=10,
    logging_strategy="steps",
    learning_rate=2e-4,
    fp16=False,
    bf16=False,
    group_by_length=True,
    report_to="wandb"
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    peft_config=peft_config,
    max_seq_length=512,
    dataset_text_field="text",
    tokenizer=tokenizer,
    args=training_arguments,
    packing=False,
)
trainer.train()
wandb.finish()
model.config.use_cache = True
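
The card does not show how the trained weights were exported. One common pattern, sketched here as an assumption (the target repository name is taken from the usage section below), is to save the LoRA adapter, merge it into an FP16 copy of the base model, and push the result to the Hub:

# Sketch of one possible export flow, not necessarily the author's exact procedure
trainer.model.save_pretrained("math-stackexchange-adapter")  # adapter weights only

# Reload the base model in fp16 (unquantized) and re-apply the chat format so the
# embedding size matches the one used during training
base_model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
base_tokenizer = AutoTokenizer.from_pretrained(model_name)
base_model, base_tokenizer = setup_chat_format(base_model, base_tokenizer)

# Merge the adapter into the base weights and upload
merged = PeftModel.from_pretrained(base_model, "math-stackexchange-adapter")
merged = merged.merge_and_unload()
merged.push_to_hub("blesspearl/math-stackexchange")         # assumed target repo
base_tokenizer.push_to_hub("blesspearl/math-stackexchange")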

Model and Dataset

The fine-tuned model is available on Hugging Face as blesspearl/math-stackexchange, and the training dataset as blesspearl/stackexchange-math-sample.

Usage

To run inference with the fine-tuned model, load it with the Hugging Face Transformers library and pass it your question.

Example Code

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "blesspearl/math-stackexchange"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

def answer_question(question):
    # The model was fine-tuned on chat-formatted data, so build the prompt with the chat template
    messages = [{"role": "user", "content": question}]
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, skipping the prompt
    answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    return answer

question = "What is the derivative of sin(x)?"
answer = answer_question(question)
print(answer)

Conclusion

This documentation provides an overview of the fine-tuning process of the LLaMA 3.1 model using LoRA on the Stack Exchange Math dataset. The model and dataset are available on Hugging Face for further use and exploration.

For any questions or issues, feel free to open an issue on the model repository.
