---
license: mit
base_model: roberta-base
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: storyseeker
  results: []
---
# 🔭StorySeeker
This model is a fine-tuned version of roberta-base on the 🔭StorySeeker dataset. It achieves the following results on the evaluation set:
- Loss: 0.4343
- Accuracy: 0.8416
## Citation

If you use our data, codebook, or models, please cite the following preprint:

> Maria Antoniak, Joel Mire, Maarten Sap, Elliott Ash, and Andrew Piper. "Where Do People Tell Stories Online? Story Detection Across Online Communities."
## Model description
This model can be used to predict whether a text contains or does not contain a story.
For our definition of "story," please refer to our codebook.
## Quick Start with Colab
You can view a demonstration of how to load our annotations, fetch the texts, load our fine-tuned model from Hugging Face, and run predictions. If you use the Colab link, you don't need to download or set up anything on your local machine; everything runs in your browser.
- Colab: link
- GitHub: link
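For local use outside Colab, the sketch below shows one way to load the classifier and run predictions. The Hub model id and the label mapping are assumptions; substitute the actual repository id and the label order from this model's config.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: replace with this repository's actual Hugging Face Hub id.
MODEL_ID = "storyseeker"


def load_model(model_id=MODEL_ID):
    """Download the tokenizer and fine-tuned classifier from the Hub."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    return tokenizer, model


def predict_story(texts, tokenizer, model):
    """Assign each text a binary document label: contains a story or not."""
    inputs = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    # Assumption: class 1 = "contains a story", class 0 = "no story";
    # check id2label in the model config to confirm.
    return ["story" if i == 1 else "not story" for i in logits.argmax(dim=-1).tolist()]
```

Typical usage: `tokenizer, model = load_model()` followed by `predict_story(["..."], tokenizer, model)`.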
## Intended uses & limitations
This model is intended for researchers interested in measuring storytelling in online communities, though it can be applied to other kinds of datasets (see generalization results in our preprint).
## Training and evaluation data
The model was fine-tuned on the training split of the 🔭StorySeeker dataset, which contains 301 Reddit posts and comments annotated with story and event spans. This model was fine-tuned using binary document labels (the document contains a story or does not contain a story).
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 20
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 20
- num_epochs: 3
### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 0.6969        | 0.53  | 10   | 0.7059          | 0.4158   |
| 0.6942        | 1.05  | 20   | 0.6674          | 0.6139   |
| 0.602         | 1.58  | 30   | 0.4691          | 0.7921   |
| 0.4826        | 2.11  | 40   | 0.4711          | 0.7921   |
| 0.2398        | 2.63  | 50   | 0.4685          | 0.8119   |
### Framework versions

- Transformers 4.35.2
- PyTorch 2.1.0+cu121
- Tokenizers 0.15.2