|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- stanfordnlp/sst2 |
|
metrics: |
|
- precision |
|
- f1 |
|
- recall |
|
- accuracy |
|
base_model: |
|
- meta-llama/Llama-3.2-1B |
|
library_name: transformers |
|
tags: |
|
- llama |
|
- fine-tuned
|
- sst |
|
--- |
|
|
|
1. Overview |
|
This repository showcases the fine-tuning of the Llama-3.2-1B model on the SST-2 dataset. The objective is binary sentiment classification: deciding whether a given sentence expresses positive or negative sentiment. Fine-tuning focuses on task-specific optimization, adapting the pre-trained Llama model into a dedicated sentiment classifier.
|
|
|
2. Model Information |
|
Model Used: meta-llama/Llama-3.2-1B |
|
Pre-trained Parameters: The model comprises approximately 1.24 billion parameters, confirmed through code inspection (see the snippet below) and consistent with the official documentation.
|
Fine-tuned Parameters: The parameter count remains unchanged after fine-tuning, since training updates the existing weights rather than adding new layers or parameters.
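
One way to reproduce the parameter count mentioned above is to load the base checkpoint and sum its tensor sizes (this assumes access to the gated meta-llama/Llama-3.2-1B repository has been granted on the Hub):

from transformers import AutoModelForCausalLM

# Load the base checkpoint and count its parameters.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
print(f"Total parameters: {sum(p.numel() for p in base.parameters()):,}")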
|
|
|
3. Dataset and Task Details |
|
Dataset: SST-2 |
|
The Stanford Sentiment Treebank (SST-2) dataset is widely used for binary sentiment classification tasks. |
|
The dataset consists of sentences labeled as either positive or negative sentiment. |
|
Task Objective |
|
Train the model to classify sentences into the appropriate sentiment category based on contextual cues. |
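
For reference, the dataset can be loaded and inspected directly from the Hugging Face Hub:

from datasets import load_dataset

sst2 = load_dataset("stanfordnlp/sst2")
print(sst2)               # DatasetDict with train / validation / test splits
print(sst2["train"][0])   # one record with 'idx', 'sentence', and a 0/1 'label'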
|
|
|
4. Fine-Tuning Approach |
|
Train-Test Split: The dataset was split into an 80:20 ratio using stratified sampling to ensure balanced representation of sentiment classes. |
|
Tokenization: Input text was tokenized with padding and truncation to a maximum length of 128 tokens. |
|
Model Training: Fine-tuning updated the model weights over three epochs with a learning rate of 2e-5 (a sketch of the setup follows below).
|
Hardware: Training was performed on GPU-enabled hardware for accelerated computations. |
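
The sketch below illustrates this setup end to end. It is a minimal illustration, not the exact training script: the Trainer API, the per-device batch size of 8, and the random seed are assumptions, while the split ratio, maximum length, epoch count, and learning rate follow the description above.

import torch
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

# 80:20 stratified split of the SST-2 training sentences.
raw = load_dataset("stanfordnlp/sst2", split="train")
splits = raw.train_test_split(test_size=0.2, seed=42, stratify_by_column="label")

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")
tokenizer.pad_token = tokenizer.eos_token  # Llama has no padding token by default

def tokenize(batch):
    # Pad/truncate every sentence to the 128-token maximum described above.
    return tokenizer(batch["sentence"], padding="max_length", truncation=True, max_length=128)

tokenized = splits.map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained("meta-llama/Llama-3.2-1B", num_labels=2)
model.config.pad_token_id = tokenizer.pad_token_id

args = TrainingArguments(
    output_dir="sst2-llama-finetuned",
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=8,   # assumption; the batch size is not stated above
    fp16=torch.cuda.is_available(),
)

trainer = Trainer(model=model, args=args,
                  train_dataset=tokenized["train"],
                  eval_dataset=tokenized["test"])
trainer.train()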
|
|
|
5. Results and Observations |
|
Zero-shot vs. Fine-tuned Performance: In its zero-shot state, the pre-trained Llama model showed only moderate performance on SST-2. After fine-tuning, it classified sentiments substantially more accurately.
|
|
|
Fine-tuning Benefits: Task-specific training allowed the model to better understand contextual nuances in the data, resulting in enhanced sentiment classification capabilities. |
|
|
|
Model Parameters: The total number of parameters did not change during fine-tuning, indicating that the performance gains come solely from updating the existing weights.
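
For concreteness, the header metrics (accuracy, precision, recall, F1) can be computed on the held-out 20% split roughly as follows. This continues the training sketch from section 4 (the trainer object and tokenized splits are assumed) and additionally requires scikit-learn:

import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Run the fine-tuned model on the held-out split from section 4.
predictions = trainer.predict(tokenized["test"])
preds = np.argmax(predictions.predictions, axis=-1)
labels = predictions.label_ids

precision, recall, f1, _ = precision_recall_fscore_support(labels, preds, average="binary")
print(f"accuracy:  {accuracy_score(labels, preds):.4f}")
print(f"precision: {precision:.4f}  recall: {recall:.4f}  f1: {f1:.4f}")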
|
|
|
6. How to Use the Fine-Tuned Model |
|
Install Necessary Libraries: |
|
|
|
pip install transformers datasets torch
|
Load the Fine-Tuned Model: |
|
|
|
from transformers import AutoTokenizer, AutoModelForSequenceClassification |
|
|
|
model_name = "<your-huggingface-repo>/sst2-llama-finetuned" |
|
tokenizer = AutoTokenizer.from_pretrained(model_name) |
|
model = AutoModelForSequenceClassification.from_pretrained(model_name) |
|
Make Predictions: |
|
|
|
text = "The movie was absolutely fantastic!" |
|
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True) |
|
outputs = model(**inputs) |
|
sentiment = "Positive" if outputs.logits.argmax() == 1 else "Negative" |
|
print(f"Predicted Sentiment: {sentiment}") |
|
|
|
7. Key Takeaways |
|
Fine-tuning the Llama model for SST-2 significantly enhances its performance on binary sentiment classification tasks. |
|
The parameter count of the model remains constant during fine-tuning, demonstrating that improvements are achieved by optimizing existing weights. |
|
This work highlights the adaptability of Llama for downstream NLP tasks when fine-tuned on task-specific datasets. |
|
|
|
8. Acknowledgments |
|
Hugging Face Transformers library for facilitating model fine-tuning. |
|
Stanford Sentiment Treebank for providing a robust dataset for sentiment classification. |
|
|