---
license: apache-2.0
datasets:
- stanfordnlp/sst2
metrics:
- precision
- f1
- recall
- accuracy
base_model:
- meta-llama/Llama-3.2-1B
library_name: transformers
tags:
- llama
- fine-tuned
- sst2
---
## Overview

This repository showcases the fine-tuning of the Llama-3.2-1B model on the SST-2 dataset. The objective is binary sentiment classification: identifying whether a given sentence expresses a positive or negative sentiment. The fine-tuning process focuses on task-specific optimization, adapting the pre-trained Llama model into an effective sentiment classifier.
## Model Information

- **Model used:** meta-llama/Llama-3.2-1B
- **Pre-trained parameters:** approximately 1.03 billion, confirmed through code inspection and consistent with the official documentation.
- **Fine-tuned parameters:** the parameter count is unchanged after fine-tuning, since the task updates existing model weights without adding new layers or parameters.
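The ~1.03 B figure can be confirmed by summing the element counts of the model's weight tensors (in PyTorch, `sum(p.numel() for p in model.parameters())`). A minimal sketch of the same arithmetic in plain Python, using made-up tensor shapes rather than the real Llama-3.2-1B layout:

```python
from math import prod

def count_parameters(shapes):
    """Total number of elements across a list of tensor shapes."""
    return sum(prod(shape) for shape in shapes)

# Hypothetical shapes for illustration only -- not the real checkpoint layout.
toy_shapes = [
    (32000, 2048),   # embedding table: vocab x hidden
    (2048, 2048),    # one attention projection
    (2048,),         # a layer-norm weight
]
print(count_parameters(toy_shapes))  # 32000*2048 + 2048*2048 + 2048 = 69732352
```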
## Dataset and Task Details

### Dataset: SST-2
The Stanford Sentiment Treebank (SST-2) dataset is widely used for binary sentiment classification. It consists of sentences labeled as expressing either positive or negative sentiment.

### Task Objective
Train the model to classify sentences into the appropriate sentiment category based on contextual cues.
## Fine-Tuning Approach

- **Train-test split:** the dataset was split 80:20 using stratified sampling to keep the sentiment classes balanced.
- **Tokenization:** input text was tokenized with padding and truncation to a maximum length of 128 tokens.
- **Model training:** fine-tuning updated task-specific weights over three epochs with a learning rate of 2e-5.
- **Hardware:** training was performed on GPU-enabled hardware for accelerated computation.
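The stratified 80:20 split mentioned above is usually done with `train_test_split(..., stratify=labels)` from scikit-learn or the `datasets` library; the exact splitting code is not part of this card. As an illustration of what "stratified" means, here is a minimal pure-Python sketch that preserves the label ratio in both parts:

```python
import random
from collections import defaultdict

def stratified_split(examples, labels, test_frac=0.2, seed=42):
    """Split examples 80:20 while preserving the label ratio in each part."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for ex, y in zip(examples, labels):
        by_label[y].append(ex)
    train, test = [], []
    for y, group in by_label.items():
        rng.shuffle(group)
        cut = int(len(group) * test_frac)   # 20% of each class goes to test
        test.extend((ex, y) for ex in group[:cut])
        train.extend((ex, y) for ex in group[cut:])
    return train, test

sentences = [f"sent{i}" for i in range(10)]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]  # balanced toy labels
train, test = stratified_split(sentences, labels)
print(len(train), len(test))  # 8 2
```

Because each class is split separately, the test set here contains exactly one example of each label.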
## Results and Observations

- **Zero-shot vs. fine-tuned performance:** in its zero-shot state, the pre-trained Llama model showed only moderate performance on SST-2; after fine-tuning, it classified sentiment markedly more accurately.
- **Fine-tuning benefits:** task-specific training helped the model capture contextual nuances in the data, improving its sentiment classification.
- **Model parameters:** the total number of parameters did not change during fine-tuning, so the performance gains are attributable solely to the updated weights.
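The card's metadata lists accuracy, precision, recall, and F1 as the evaluation metrics, though no scores are reported here. For reference, these binary-classification metrics can be computed from predictions and gold labels as follows (a plain-Python sketch; in practice the `evaluate` library or `sklearn.metrics` is typically used):

```python
def binary_metrics(preds, labels, positive=1):
    """Accuracy, precision, recall, and F1 for binary predictions."""
    tp = sum(p == positive and y == positive for p, y in zip(preds, labels))
    fp = sum(p == positive and y != positive for p, y in zip(preds, labels))
    fn = sum(p != positive and y == positive for p, y in zip(preds, labels))
    accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy predictions vs. gold labels, for illustration only.
m = binary_metrics([1, 0, 1, 1], [1, 0, 0, 1])
print(m)  # accuracy 0.75, precision 2/3, recall 1.0, f1 0.8
```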
## How to Use the Fine-Tuned Model

Install the necessary libraries:

```bash
pip install transformers datasets
```

Load the fine-tuned model:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "/sst2-llama-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
```

Make a prediction:

```python
text = "The movie was absolutely fantastic!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
outputs = model(**inputs)
sentiment = "Positive" if outputs.logits.argmax(dim=-1).item() == 1 else "Negative"
print(f"Predicted Sentiment: {sentiment}")
```
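The prediction step picks the higher-scoring class with argmax; to report a confidence score alongside the label, the logits can be passed through a softmax. A plain-Python sketch of that last step (the two logit values are made up for illustration):

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

LABELS = ["Negative", "Positive"]
logits = [-1.2, 2.3]  # hypothetical model output for one sentence
probs = softmax(logits)
idx = probs.index(max(probs))
print(f"{LABELS[idx]} ({probs[idx]:.2f})")
```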
## Key Takeaways

- Fine-tuning the Llama model on SST-2 significantly enhances its performance on binary sentiment classification.
- The parameter count remains constant during fine-tuning, demonstrating that improvements are achieved by optimizing existing weights.
- This work highlights the adaptability of Llama for downstream NLP tasks when fine-tuned on task-specific datasets.
## Acknowledgments

- Hugging Face Transformers library for facilitating model fine-tuning.
- Stanford Sentiment Treebank for providing a robust dataset for sentiment classification.