---
license: apache-2.0
datasets:
- stanfordnlp/sst2
metrics:
- precision
- f1
- recall
- accuracy
base_model:
- meta-llama/Llama-3.2-1B
library_name: transformers
tags:
- llama
- gemma
- fine-tuned
- sst
---

1. Overview
This repository showcases the fine-tuning of the Llama-3.2-1B model on the SST-2 dataset. The objective is to perform binary sentiment classification, identifying whether a given sentence expresses a positive or negative sentiment. The fine-tuning process focuses on task-specific optimization, transforming the pre-trained Llama model into a powerful sentiment analysis tool.

2. Model Information
- Model Used: meta-llama/Llama-3.2-1B
- Pre-trained Parameters: The model comprises approximately 1.03 billion parameters, confirmed through code inspection and consistent with the official documentation.
- Fine-tuned Parameters: The parameter count remains unchanged after fine-tuning, since fine-tuning updates the existing weights without adding new layers or parameters.
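
The parameter count above was verified through code inspection; the snippet below is a minimal sketch of how such a check can be reproduced. It loads the base checkpoint with a binary classification head and sums the sizes of all weight tensors; no fine-tuned weights are assumed.

```python
from transformers import AutoModelForSequenceClassification

# Load the base checkpoint with a 2-way classification head, mirroring the
# SST-2 setup described in this card.
model = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Llama-3.2-1B", num_labels=2
)

# Sum the element counts of all weight tensors to get the total parameter count.
total = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total:,}")
```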

3. Dataset and Task Details
Dataset: SST-2
- The Stanford Sentiment Treebank (SST-2) dataset is widely used for binary sentiment classification.
- Each example is a single sentence labeled as expressing either positive or negative sentiment.

Task Objective
- Train the model to classify sentences into the appropriate sentiment category based on contextual cues.
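
For reference, SST-2 can be pulled directly from the Hub with the datasets library. This is a minimal sketch; the column names shown (sentence, label) are those of the stanfordnlp/sst2 dataset, where label 0 is negative and 1 is positive.

```python
from datasets import load_dataset

# Load the SST-2 splits hosted on the Hugging Face Hub.
dataset = load_dataset("stanfordnlp/sst2")

# Inspect one labeled training example: {'idx': ..., 'sentence': ..., 'label': 0 or 1}
print(dataset["train"][0])
```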

4. Fine-Tuning Approach
- Train-Test Split: The dataset was split 80:20 using stratified sampling to ensure balanced representation of the sentiment classes.
- Tokenization: Input text was tokenized with padding and truncation to a maximum length of 128 tokens.
- Model Training: Fine-tuning updated the task-specific weights over three epochs with a learning rate of 2e-5.
- Hardware: Training was performed on GPU-enabled hardware for accelerated computation.
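
The following is a minimal sketch of the setup listed above (80:20 stratified split of the labeled training sentences, 128-token truncation, three epochs at a 2e-5 learning rate). Values not stated in this card, such as the batch size and random seed, are illustrative assumptions rather than the exact configuration used.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base_model = "meta-llama/Llama-3.2-1B"
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token

model = AutoModelForSequenceClassification.from_pretrained(base_model, num_labels=2)
model.config.pad_token_id = tokenizer.pad_token_id

# 80:20 stratified split of the labeled SST-2 training sentences.
raw = load_dataset("stanfordnlp/sst2")["train"]
split = raw.train_test_split(test_size=0.2, stratify_by_column="label", seed=42)

def tokenize(batch):
    # Pad and truncate to the 128-token maximum described above.
    return tokenizer(
        batch["sentence"], padding="max_length", truncation=True, max_length=128
    )

tokenized = split.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="sst2-llama-finetuned",
    num_train_epochs=3,
    learning_rate=2e-5,
    per_device_train_batch_size=16,  # assumption: batch size is not stated in the card
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
)
trainer.train()
```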

5. Results and Observations
Zero-shot vs. Fine-tuned Performance: The pre-trained Llama model in its zero-shot state exhibited moderate performance on SST-2. After fine-tuning, the model achieved significant improvements in its ability to classify sentiments accurately.

Fine-tuning Benefits: Task-specific training allowed the model to better understand contextual nuances in the data, resulting in enhanced sentiment classification capabilities.

Model Parameters: The total number of parameters did not change during fine-tuning, indicating that the performance improvement is attributable solely to the updated weights.
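
The metrics listed in the card metadata (accuracy, precision, recall, F1) can be computed on the 20% held-out split. Below is a minimal sketch using scikit-learn, written as a compute_metrics callback for the Trainer; the function name and the use of scikit-learn are assumptions, not necessarily the exact evaluation code used here.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    # The Trainer passes (logits, labels) for the evaluation split.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="binary"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Pass compute_metrics=compute_metrics to the Trainer above so that
# trainer.evaluate() reports all four metrics on the held-out split.
```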

6. How to Use the Fine-Tuned Model
Install Necessary Libraries:

```bash
pip install transformers datasets
```

Load the Fine-Tuned Model:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "<your-huggingface-repo>/sst2-llama-finetuned"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
```

Make Predictions:

```python
import torch

text = "The movie was absolutely fantastic!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
# Forward pass without gradient tracking; label 1 corresponds to positive sentiment.
with torch.no_grad():
    outputs = model(**inputs)
sentiment = "Positive" if outputs.logits.argmax(dim=-1).item() == 1 else "Negative"
print(f"Predicted Sentiment: {sentiment}")
```

7. Key Takeaways
- Fine-tuning the Llama model for SST-2 significantly enhances its performance on binary sentiment classification tasks.
- The parameter count of the model remains constant during fine-tuning, demonstrating that improvements are achieved by optimizing existing weights.
- This work highlights the adaptability of Llama for downstream NLP tasks when fine-tuned on task-specific datasets.

8. Acknowledgments
- Hugging Face Transformers library for facilitating model fine-tuning.
- Stanford Sentiment Treebank for providing a robust dataset for sentiment classification.