metadata
license: bigscience-bloom-rail-1.0
tags:
  - text generation
  - generated_from_trainer
  - email generation
  - email
  - emailgen
datasets:
  - aeslc
  - postbot/multi-emails-100k
widget:
  - text: >-
      Good Morning Professor Beans,

      Hope you are doing well. I just wanted to reach out and ask if
      differential calculus will be on the exam
    example_title: email to prof
  - text: |-
      嘿<NAME>

      感谢你注册我的每周通讯。在我们开始之前,你必须确认你的电子邮件地址。
    example_title: 通讯
  - text: >-
      Hi <NAME>,


      I hope this email finds you well. I wanted to reach out and ask about
      office hours
    example_title: office hours
  - text: >-
      Grüße <NAME>,


      Ich hoffe, du hattest einen schönen Abend beim Wurstessen der Firma. Ich
      melde mich, weil
    example_title: Wurstessen festival
  - text: |-
      Guten Morgen Harold,

      ich habe mich gefragt, wann die nächste
    example_title: event
  - text: URGENT - I need the TPS reports
    example_title: URGENT
  - text: |-
      Hoi Archibald,

      ik hoop dat deze e-mail je goed doet.
    example_title: e-mails die je vinden
  - text: |-
      Hello there.

      I just wanted to reach out and check in to
    example_title: checking in
  - text: >-
      Hello <NAME>,


      I hope this email finds you well. I wanted to reach out and see if you've
      enjoyed your time with us
    example_title: work well
  - text: >-
      Hi <NAME>,


      I hope this email finds you well. I wanted to reach out and see if we
      could catch up
    example_title: catch up
  - text: >-
      Jestem <NAME>,


      Właśnie wprowadziłem się do obszaru i chciałem dotrzeć i uzyskać kilka
      szczegółów na temat tego, gdzie mogę dostać artykuły spożywcze i
    example_title: zakupy spożywcze
parameters:
  min_length: 32
  max_length: 128
  no_repeat_ngram_size: 2
  do_sample: true
  temperature: 0.2
  top_k: 20
  top_p: 0.95
  repetition_penalty: 3.5
  length_penalty: 0.9

bloom-1b1-emailgen - v1

This model is a fine-tuned version of bigscience/bloom-1b1 on the postbot/multi-emails-100k dataset.

It achieves the following results on the evaluation set:

  • Loss: 1.7397
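A minimal sketch of how the model might be loaded for inference, reusing the sampling settings from the `parameters` block in the metadata above. The Hub ID `postbot/bloom-1b1-emailgen` is an assumption (this card only gives the name `bloom-1b1-emailgen`), and `generate_email` is a hypothetical helper. It assumes `transformers` and `torch` are installed.

```python
# Sampling settings copied from the `parameters` block in this card's metadata.
GENERATION_KWARGS = {
    "min_length": 32,
    "max_length": 128,
    "no_repeat_ngram_size": 2,
    "do_sample": True,
    "temperature": 0.2,
    "top_k": 20,
    "top_p": 0.95,
    "repetition_penalty": 3.5,
    "length_penalty": 0.9,
}

# Assumed Hub ID; the card itself only names the model "bloom-1b1-emailgen".
MODEL_ID = "postbot/bloom-1b1-emailgen"


def generate_email(prompt: str, model_id: str = MODEL_ID) -> str:
    """Hypothetical helper: complete an email that starts with `prompt`."""
    # Imported lazily so the settings above can be inspected without transformers.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, **GENERATION_KWARGS)
    # The text-generation pipeline returns a list of dicts keyed by "generated_text".
    return outputs[0]["generated_text"]
```

Calling `generate_email("Hello there.\n\nI just wanted to reach out and check in to")` would download the weights on first use and return the prompt followed by the sampled continuation. Because `do_sample` is enabled, repeated calls can return different text.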

Model description

More information needed

Intended uses & limitations

  • None of the original model's layers were frozen during fine-tuning.
    • This is still under investigation, but some layers likely need to be frozen during fine-tuning so that the model retains its multilingual capabilities while learning to write emails.
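As a rough illustration of what freezing some layers could look like, the sketch below disables gradients for the embeddings and the lowest transformer blocks by parameter name. It is not the procedure used for this model, which froze nothing. The helper `freeze_lower_layers` is hypothetical, and the `transformer.h.<index>.` and `transformer.word_embeddings` name prefixes are an assumption about the Hugging Face BLOOM implementation that should be checked against `model.named_parameters()`.

```python
import re
from types import SimpleNamespace

# Assumption: BLOOM block parameters are named "transformer.h.<index>.<...>".
BLOCK_PATTERN = re.compile(r"^transformer\.h\.(\d+)\.")


def freeze_lower_layers(named_parameters, num_frozen_blocks, freeze_embeddings=True):
    """Hypothetical helper: set requires_grad=False on the lowest blocks.

    Returns the names of the parameters that were frozen.
    """
    frozen = []
    for name, param in named_parameters:
        match = BLOCK_PATTERN.match(name)
        in_frozen_block = match is not None and int(match.group(1)) < num_frozen_blocks
        # This prefix also covers the embedding layer norm, if present.
        is_embedding = freeze_embeddings and name.startswith("transformer.word_embeddings")
        if in_frozen_block or is_embedding:
            param.requires_grad = False
            frozen.append(name)
    return frozen


# Dry run on stand-in parameters, so the logic can be checked without the weights.
demo_params = [
    (name, SimpleNamespace(requires_grad=True))
    for name in (
        "transformer.word_embeddings.weight",
        "transformer.h.0.mlp.dense_h_to_4h.weight",
        "transformer.h.1.mlp.dense_h_to_4h.weight",
        "transformer.h.2.mlp.dense_h_to_4h.weight",
        "transformer.ln_f.weight",
    )
]
frozen_names = freeze_lower_layers(demo_params, num_frozen_blocks=2)
print(frozen_names)
```

With a loaded model, the same call would be `freeze_lower_layers(model.named_parameters(), num_frozen_blocks=N)` before training. How many blocks to freeze, and whether doing so actually preserves multilingual ability, is the open question noted above.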

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 7e-05
  • train_batch_size: 2
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 64
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 2.0
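The arithmetic behind these settings can be sketched in a few lines: how the total batch size follows from the per-device batch size and gradient accumulation, and how a cosine schedule with linear warmup would move the learning rate. The device count of 1, the rounding-up of warmup steps, and the exact schedule formula are assumptions chosen to match the listed values, not details confirmed by this card. The 512 total steps come from the training results table.

```python
import math

LEARNING_RATE = 7e-05
TRAIN_BATCH_SIZE = 2               # per device
GRADIENT_ACCUMULATION_STEPS = 64
WARMUP_RATIO = 0.03
TOTAL_STEPS = 512                  # last step in the training results table

# Assumption: one device per optimizer step, since 2 * 64 already equals 128.
NUM_DEVICES = 1
total_train_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES

# Assumption: the warmup ratio is converted to steps by rounding up.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def learning_rate_at(step: int) -> float:
    """Linear warmup to LEARNING_RATE, then half-cosine decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, TOTAL_STEPS - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


print("total_train_batch_size:", total_train_batch_size)
print("warmup_steps:", warmup_steps)
for step in (0, warmup_steps, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, learning_rate_at(step))
```

Under these assumptions the learning rate starts at zero, peaks at 7e-05 once warmup ends, and decays back to zero by the final step.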

Training results

Training Loss   Epoch   Step   Validation Loss
1.8465          1.0     256    1.8656
1.4903          2.0     512    1.7396

details

***** eval metrics *****  

  epoch                   =        2.0  
  eval_loss               =     1.7397
  eval_runtime            = 0:04:27.41
  eval_samples            =       4216
  eval_samples_per_second =     15.766
  eval_steps_per_second   =     15.766
  perplexity              =     5.6956
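The reported perplexity is consistent with the exponential of the evaluation loss, and the throughput with samples divided by runtime. A short check, assuming the loss is the mean per-token cross-entropy in nats:

```python
import math

eval_loss = 1.7397
eval_samples = 4216
eval_runtime_seconds = 4 * 60 + 27.41   # 0:04:27.41

# Perplexity is exp(loss) when the loss is mean cross-entropy in nats.
perplexity = math.exp(eval_loss)

# Throughput is the number of evaluated samples divided by wall-clock time.
samples_per_second = eval_samples / eval_runtime_seconds

print(f"perplexity         = {perplexity:.4f}")
print(f"samples_per_second = {samples_per_second:.3f}")
```

Both values agree with the metrics above to the printed precision. Samples per second equals steps per second because the evaluation batch size is 1.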

Framework versions

  • Transformers 4.25.0.dev0
  • PyTorch 1.13.0+cu117
  • Datasets 2.6.1
  • Tokenizers 0.13.1