---
title: Submission Template
emoji: 🔥
colorFrom: yellow
colorTo: green
sdk: docker
pinned: false
---

Fine-tuned Emotion Model Checkpoint for Climate Disinformation Classification

Model Description

This is a lightweight RoBERTa model fine-tuned for the Frugal AI Challenge 2024, specifically for the text classification task of identifying climate disinformation. It starts from a checkpoint originally trained for emotion classification (michellejieli/emotion_text_classifier) and is further fine-tuned to recognize eight categories of climate disinformation claims.

Intended Use

  • Primary intended uses: Baseline comparison for climate disinformation classification models
  • Primary intended users: Researchers and developers participating in the Frugal AI Challenge
  • Out-of-scope use cases: Not intended for production use or real-world classification tasks

Training Data

The model uses the QuotaClimat/frugalaichallenge-text-train dataset:

  • Size: ~6000 examples
  • Split: 80% train, 20% test
  • 8 categories of climate disinformation claims
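
As a rough illustration of how the dataset can be loaded and split the same way, a minimal sketch using the Hugging Face `datasets` library is shown below; the assumption that the data lives in a single `train` split and the choice of split seed are guesses, not confirmed details of the actual training script.

```python
from datasets import load_dataset

# Load the competition dataset from the Hugging Face Hub.
# (Assumes the data is exposed as a single "train" split.)
dataset = load_dataset("QuotaClimat/frugalaichallenge-text-train", split="train")

# Reproduce the 80% train / 20% test split described above.
# seed=42 mirrors the training seed reported later in this card; this is an assumption.
splits = dataset.train_test_split(test_size=0.2, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]

print(len(train_ds), len(eval_ds))  # roughly 4,800 / 1,200 examples for ~6,000 rows
```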

Labels

  1. No relevant claim detected
  2. Global warming is not happening
  3. Not caused by humans
  4. Not bad or beneficial
  5. Solutions harmful/unnecessary
  6. Science is unreliable
  7. Proponents are biased
  8. Fossil fuels are needed
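
For illustration, the eight categories can be kept as an explicit id-to-label mapping, e.g. for configuring the classification head; the numeric ids and label strings below are assumptions, and the strings stored in the dataset itself may be formatted differently.

```python
# Illustrative mapping between class indices and the eight categories above.
# The exact label strings used in the dataset may differ.
ID2LABEL = {
    0: "No relevant claim detected",
    1: "Global warming is not happening",
    2: "Not caused by humans",
    3: "Not bad or beneficial",
    4: "Solutions harmful/unnecessary",
    5: "Science is unreliable",
    6: "Proponents are biased",
    7: "Fossil fuels are needed",
}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}
```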

Performance

This model is a fine-tuned version of michellejieli/emotion_text_classifier on the provided dataset for the competition. It achieves the following results on the evaluation set:

  • Loss: 0.2828
  • F1: 0.7879
  • ROC AUC: nan
  • Hamming loss: 0.1039

This model uses a lightweight RoBERTa checkpoint that was first fine-tuned for emotion classification and then further trained to recognize climate disinformation.
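
The reported numbers can be computed with standard scikit-learn metrics. The sketch below assumes a multi-label formulation with sigmoid outputs thresholded at 0.5 and micro-averaged scores, which is consistent with the Hamming loss and ROC AUC above but is an assumption rather than the confirmed evaluation code.

```python
import numpy as np
from sklearn.metrics import f1_score, hamming_loss, roc_auc_score

def compute_metrics(eval_pred):
    """Trainer-style metrics callback; assumes multi-label logits of shape (n, 8)."""
    logits, labels = eval_pred
    probs = 1 / (1 + np.exp(-logits))   # sigmoid over the 8 label logits
    preds = (probs >= 0.5).astype(int)  # threshold at 0.5
    metrics = {
        "f1": f1_score(labels, preds, average="micro"),
        "hamming": hamming_loss(labels, preds),
    }
    try:
        metrics["roc_auc"] = roc_auc_score(labels, probs, average="micro")
    except ValueError:
        # ROC AUC is undefined when a label column has no positive (or no negative)
        # examples, which would explain the `nan` reported above.
        metrics["roc_auc"] = float("nan")
    return metrics
```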

Training procedure

The labels were encoded with a binarizer, the text was tokenized with the base checkpoint's tokenizer, and michellejieli/emotion_text_classifier was chosen as a seemingly suitable checkpoint to start from.
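
A sketch of what that preprocessing could look like, reusing `train_ds` from the loading sketch above and assuming scikit-learn's `MultiLabelBinarizer` for the labels; the column names `quote` and `label` are assumptions about the dataset schema.

```python
from sklearn.preprocessing import MultiLabelBinarizer
from transformers import AutoTokenizer

# Turn each example's label into an 8-dimensional 0/1 vector.
# (Column names and one label per example are assumptions.)
mlb = MultiLabelBinarizer(classes=list(range(8)))
train_labels = mlb.fit_transform([[ex["label"]] for ex in train_ds])

# Tokenize the text with the tokenizer of the starting checkpoint.
tokenizer = AutoTokenizer.from_pretrained("michellejieli/emotion_text_classifier")
train_encodings = tokenizer(
    [ex["quote"] for ex in train_ds],
    truncation=True,
    padding=True,
)
```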

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 4

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.21.0
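
Taken together, the hyperparameters and framework versions above correspond to a fairly standard `Trainer` setup. The sketch below shows one way that configuration could be expressed; the output directory, the multi-label `problem_type`, and the dataset variables (reused from the sketches above) are assumptions, not the confirmed training script.

```python
from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

# Start from the emotion checkpoint and replace its head with an 8-label head.
model = AutoModelForSequenceClassification.from_pretrained(
    "michellejieli/emotion_text_classifier",
    num_labels=8,
    problem_type="multi_label_classification",  # assumption, consistent with binarized labels
    ignore_mismatched_sizes=True,               # the emotion head has a different label count
)

training_args = TrainingArguments(
    output_dir="climate-disinfo-roberta",  # hypothetical output directory
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=4,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    seed=42,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,           # assumed to hold tokenized inputs and binarized labels
    eval_dataset=eval_ds,
    compute_metrics=compute_metrics,  # metrics callback from the Performance section sketch
)
trainer.train()
```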

Metrics

  • Classification quality: F1, ROC AUC, and Hamming loss on the held-out split (see the evaluation results under Performance above)
  • Environmental Impact:
    • Emissions tracked in gCO2eq
    • Energy consumption tracked in Wh

Model Architecture

The model is a lightweight RoBERTa-based sequence classifier with an eight-way classification head, initialized from the michellejieli/emotion_text_classifier checkpoint and fine-tuned on the climate disinformation dataset described above.
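
Once fine-tuned, the checkpoint can be queried with the standard `text-classification` pipeline. The model id below is a placeholder rather than the actual repository name, and returning scores for every label assumes the multi-label setup sketched earlier.

```python
from transformers import pipeline

# "your-username/climate-disinfo-roberta" is a placeholder for the fine-tuned checkpoint.
classifier = pipeline(
    "text-classification",
    model="your-username/climate-disinfo-roberta",
    top_k=None,  # return a score for every one of the 8 labels
)

scores = classifier("Wind turbines are the main reason electricity prices keep rising.")
print(scores)
```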

Environmental Impact

Environmental impact is tracked using CodeCarbon, measuring:

  • Carbon emissions during inference
  • Energy consumption during inference

This tracking helps establish a baseline for the environmental impact of model deployment and inference.
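
A minimal sketch of how CodeCarbon can wrap the inference loop is shown below; the exact integration used by the submission code may differ.

```python
from codecarbon import EmissionsTracker

tracker = EmissionsTracker()  # CodeCarbon reports emissions in kg CO2eq
tracker.start()

# ... run model inference over the evaluation set here ...

emissions_kg = tracker.stop()
print(f"Estimated emissions: {emissions_kg * 1000:.2f} gCO2eq")
```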

Limitations

  • Fine-tuned on a relatively small dataset (~6,000 examples)
  • Evaluated only on the held-out 20% split of the competition dataset
  • Reported metrics (F1 ≈ 0.79) leave meaningful room for misclassification
  • Not suitable for production use or real-world applications without further validation

Ethical Considerations

  • Dataset contains sensitive topics related to climate disinformation
  • Model predictions are imperfect and should not be used as the sole basis for labelling content as disinformation
  • Environmental impact is tracked to promote awareness of AI's carbon footprint