metadata
license: cc0-1.0
tags:
  - generated_from_trainer
datasets:
  - squad_v2
model-index:
  - name: bluebert_pubmed_mimic_uncased_squadv2
    results: []

bluebert_pubmed_mimic_uncased_squadv2

This model is a fine-tuned version of bionlp/bluebert_pubmed_mimic_uncased_L-12_H-768_A-12 on the squad_v2 dataset.

Intended uses & limitations

At the time of publication, this was the first model on Hugging Face to combine pre-training on MIMIC clinical data (https://mimic.mit.edu/) with fine-tuning on SQuAD v2 (https://huggingface.co/datasets/squad_v2) for question answering.
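For readers who want to try the model, here is a minimal sketch of extractive question answering with the Transformers pipeline API. The repository id used for MODEL_ID is an assumption inferred from the model name and is not stated in this card; the helper name answer_question and the sample text are hypothetical. Because the model was fine-tuned on squad_v2, the sketch passes handle_impossible_answer=True so the pipeline may return an empty answer when the context does not contain one.

```python
# Assumed Hub repository id (not confirmed by this model card); adjust as needed.
MODEL_ID = "trevorkwan/bluebert_pubmed_mimic_uncased_squadv2"


def answer_question(question, context, model_id=MODEL_ID):
    """Run extractive QA and return the pipeline's result dict."""
    # Imported lazily so that importing this module does not require transformers.
    from transformers import pipeline

    qa = pipeline("question-answering", model=model_id, tokenizer=model_id)
    # SQuAD v2 contains unanswerable questions, so allow an empty answer.
    return qa(question=question, context=context, handle_impossible_answer=True)


if __name__ == "__main__":
    # Hypothetical sample text, for illustration only.
    sample_context = "The biopsy of the left breast showed invasive ductal carcinoma."
    result = answer_question("What did the biopsy show?", sample_context)
    print(result["answer"], result["score"])
```

The first call downloads the model weights, so it needs network access and the transformers and torch packages installed.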

Training and evaluation data

The model was fine-tuned on the squad_v2 training split and evaluated on its validation split (the tuning script below passes both --do_train and --do_eval).

Training procedure

Tuning script used (.bat file):

@echo off

set BASE_MODEL=bionlp/bluebert_pubmed_mimic_uncased_L-12_H-768_A-12
set OUTPUT_DIR=U:\Documents\Breast_Non_Synoptic\results\pretrained\bluebert_pubmed_mimic_uncased_squadv2\

python run_qa.py ^
  --model_name_or_path %BASE_MODEL% ^
  --dataset_name squad_v2 ^
  --do_train ^
  --do_eval ^
  --version_2_with_negative ^
  --per_device_train_batch_size 16 ^
  --learning_rate 2e-5 ^
  --num_train_epochs 3 ^
  --max_seq_length 480 ^
  --doc_stride 64 ^
  --weight_decay 0.01 ^
  --output_dir %OUTPUT_DIR%

You may need to adapt this script for non-Windows operating systems.

The run_qa.py example script is part of the Hugging Face Transformers repository, under examples/pytorch/question-answering.
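One way to avoid maintaining separate scripts per operating system is a small Python launcher that builds the same argument list and runs it. The following is a sketch under stated assumptions: run_qa.py is assumed to sit in the current working directory, and the function name build_command and the relative output path are hypothetical stand-ins for the Windows path in the .bat file.

```python
import subprocess
import sys
from pathlib import Path

BASE_MODEL = "bionlp/bluebert_pubmed_mimic_uncased_L-12_H-768_A-12"


def build_command(output_dir, script="run_qa.py"):
    """Return the argument list that mirrors the .bat tuning script."""
    return [
        sys.executable, script,  # run with the current Python interpreter
        "--model_name_or_path", BASE_MODEL,
        "--dataset_name", "squad_v2",
        "--do_train",
        "--do_eval",
        "--version_2_with_negative",
        "--per_device_train_batch_size", "16",
        "--learning_rate", "2e-5",
        "--num_train_epochs", "3",
        "--max_seq_length", "480",
        "--doc_stride", "64",
        "--weight_decay", "0.01",
        "--output_dir", str(Path(output_dir)),
    ]


if __name__ == "__main__":
    # Hypothetical output location; replace with your own path.
    command = build_command("results/bluebert_pubmed_mimic_uncased_squadv2")
    subprocess.run(command, check=True)
```

Passing the arguments as a list sidesteps shell-specific quoting and line-continuation characters such as ^ on Windows or \ on Unix-like systems.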

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
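
To reproduce these settings with the Trainer API directly, the values above can be mapped onto TrainingArguments fields roughly as follows. This is a sketch, not the author's code: the mapping of train_batch_size and eval_batch_size to the per-device fields assumes a single device, weight_decay is taken from the tuning script rather than from this list, and the names TRAINING_KWARGS and make_training_arguments are hypothetical.

```python
# Keyword arguments for transformers.TrainingArguments that mirror the list above.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes a single training device
    "per_device_eval_batch_size": 8,    # assumes a single evaluation device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3.0,
    "weight_decay": 0.01,  # from the tuning script, not the hyperparameter list
}


def make_training_arguments(output_dir):
    """Build a TrainingArguments object from the settings above."""
    # Imported lazily so the settings can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```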

Framework versions

  • Transformers 4.29.2
  • PyTorch 2.0.1
  • Datasets 2.14.4
  • Tokenizers 0.13.2