Bloom-1b7-creative-writing

This model is a fine-tuned version of bigscience/bloom-1b7 on the adambjorn/UnrelatedForgettingOverhead creative writing dataset.

Model description

Bloom-1b7-creative-writing is bigscience/bloom-1b7 instruction-tuned to write a short story given a story title; see the training details below.

Intended uses & limitations

Intended for use in a student group project for Portland State University's Winter 2024 LLMs course.

Training and evaluation data

Instruction-tuned on the creative writing subset of the dataset here: https://huggingface.co/datasets/adambjorn/UnrelatedForgettingOverhead/viewer/creative
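
As a minimal loading sketch, the data can be pulled with the datasets library. The config name "creative" is inferred from the viewer URL, and the title/selftext column names are inferred from the concatenation snippet below; verify both against the dataset card.

from datasets import load_dataset

# "creative" config name inferred from the viewer URL above
dataset = load_dataset("adambjorn/UnrelatedForgettingOverhead", "creative")

# Column names assumed from the concatenation snippet in this card
titles = dataset["train"]["title"]
selftexts = dataset["train"]["selftext"]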

Training procedure

Trained on a single RTX 3090 card.

Training examples were built from a set of prompt templates:

import random

prompts = [
    "Write a creative short story based on the following title:",
    "Here is a title for a story. Craft a short narrative around it:",
    "Using the title given, develop a short story:",
    "Imagine a short story that starts with this title:",
    "Create a brief story with the following title:",
]

For each example, a randomly chosen prompt is concatenated with the title and the story like so:

concatenated_texts = [
    random.choice(prompts) + " " + title + "</s>" + "Story: " + selftext
    for title, selftext in zip(titles, selftexts)
]
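
The card does not show the tokenization step. A minimal sketch for causal language modeling, assuming the bigscience/bloom-1b7 tokenizer and an assumed maximum sequence length of 512, would look like:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-1b7")

# For causal LM fine-tuning the labels are simply a copy of the input ids.
encodings = tokenizer(
    concatenated_texts,
    truncation=True,
    max_length=512,          # assumed; the card does not state a sequence length
    padding="max_length",
    return_tensors="pt",
)
encodings["labels"] = encodings["input_ids"].clone()

Note that "</s>" is BLOOM's end-of-sequence token string, so the separator between title and story tokenizes to the model's EOS id.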

Training hyperparameters

The following hyperparameters were used during training (a matching TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
  • mixed_precision_training: Native AMP
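
As a minimal sketch (not the authors' exact script), these settings map onto Hugging Face TrainingArguments as follows; the output directory is a placeholder:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./bloom-1b7-creative-writing",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,  # gives the effective total train batch size of 4
    num_train_epochs=5,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,  # native AMP mixed precision
)
# The default AdamW optimizer already uses betas=(0.9, 0.999) and epsilon=1e-08,
# matching the values listed above.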

Training results

Final logged step: {'loss': 0.0472, 'learning_rate': 1.4893617021276598e-06, 'epoch': 4.95}

Aggregate results (train_loss is averaged over all training steps): {'train_runtime': 563.2707, 'train_samples_per_second': 1.687, 'train_steps_per_second': 0.417, 'train_loss': 0.8475136074614018, 'epoch': 4.95}

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.17.0
  • Tokenizers 0.15.2
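
Example use

A minimal generation sketch with the published checkpoint (repository ID taken from this model card); the prompt mirrors the training format, and the title and sampling parameters are illustrative assumptions:

from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="alonzogarbanzo/Bloom-1b7-creative-writing",
)

# Prompt format mirrors the training concatenation; the title is a made-up example.
prompt = "Write a creative short story based on the following title: The Last Lighthouse</s>Story: "
result = generator(prompt, max_new_tokens=200, do_sample=True, temperature=0.8)
print(result[0]["generated_text"])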