license: cc-by-nc-4.0
datasets:
  - starmpcc/Asclepius-Synthetic-Clinical-Notes
language:
  - en
pipeline_tag: text2text-generation
tags:
  - medical

Model Card for Asclepius-Llama2-13B

This is an official model checkpoint for Asclepius-Llama2-13B (arxiv). It is an enhanced version of Asclepius-13B, obtained by replacing the base model with Llama-2 and increasing the maximum sequence length to 4096.
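
With a maximum sequence length of 4096, the prompt and the generated response presumably need to fit into that budget together. The check below is a minimal sketch of that arithmetic; the helper name `fits_context` and the token counts are illustrative assumptions, not part of the released code.

```python
MAX_SEQ_LEN = 4096  # maximum sequence length stated in this model card


def fits_context(prompt_tokens: int, max_new_tokens: int, limit: int = MAX_SEQ_LEN) -> bool:
    """Return True if the prompt plus the requested generation stays within the limit."""
    return prompt_tokens + max_new_tokens <= limit


# Hypothetical token counts, for illustration only.
print(fits_context(3500, 512))  # 3500 + 512 = 4012, within the limit
print(fits_context(3900, 512))  # 3900 + 512 = 4412, over the limit
```

In practice, `prompt_tokens` would be the length of the `input_ids` the tokenizer returns for the formatted prompt.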

Model Details

Model Description

  • Model type: Clinical LLM (Large Language Model)
  • Language(s) (NLP): English
  • License: CC-BY-NC-SA 4.0
  • Finetuned from model: Llama2-13B

Model Sources

Uses

This model can perform the following eight clinical NLP tasks on clinical notes:

  • Named Entity Recognition
  • Abbreviation Expansion
  • Relation Extraction
  • Temporal Information Extraction
  • Coreference Resolution
  • Paraphrasing
  • Summarization
  • Question Answering

Direct Use

[More Information Needed]

Downstream Use

[More Information Needed]

Out-of-Scope Use

ONLY USE THIS MODEL FOR RESEARCH PURPOSES!

How to Get Started with the Model

prompt = """You are an intelligent clinical languge model.
Below is a snippet of patient's discharge summary and a following instruction from healthcare professional.
Write a response that appropriately completes the instruction.
The response should provide the accurate answer to the instruction, while being concise.

[Discharge Summary Begin]
{note}
[Discharge Summary End]

[Instruction Begin]
{question}
[Instruction End] 
"""

from transformers import AutoTokenizer, AutoModelForCausalLM

# AutoModelForCausalLM loads the language-modeling head that generate() needs;
# the bare AutoModel class does not include it.
tokenizer = AutoTokenizer.from_pretrained("starmpcc/Asclepius-Llama2-13B")
model = AutoModelForCausalLM.from_pretrained("starmpcc/Asclepius-Llama2-13B")

note = "This is a sample note"
question = "What is the diagnosis?"

# Fill the template, tokenize it, and generate a response.
model_input = prompt.format(note=note, question=question)
input_ids = tokenizer(model_input, return_tensors="pt").input_ids
output = model.generate(input_ids)
print(tokenizer.decode(output[0]))
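
For a decoder-only model such as Llama-2, `generate` returns the prompt tokens followed by the newly generated tokens, so decoding `output[0]` prints the prompt again before the answer. A minimal sketch of keeping only the answer follows; `strip_prompt` is a hypothetical helper, and the token ids are stand-ins so the snippet runs without downloading the model.

```python
def strip_prompt(output_ids, prompt_length):
    """Return only the token ids that were generated after the prompt."""
    return output_ids[prompt_length:]


# Stand-in token ids, for illustration only.
prompt_ids = [1, 887, 526]
generated = prompt_ids + [263, 24899, 2]

print(strip_prompt(generated, len(prompt_ids)))  # [263, 24899, 2]
```

With the objects from the snippet above, the equivalent would be `tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True)`.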

Training Details

Training Data

https://huggingface.co/datasets/starmpcc/Asclepius-Synthetic-Clinical-Notes

Training Procedure

  • Initial training was conducted using causal language modeling on synthetic clinical notes.
  • It was then fine-tuned with clinical instruction-response pairs.
  • For a comprehensive overview of our methods, see our paper (listed under Citation).

Training Hyperparameters

Speeds, Sizes, Times

  • Pre-Training (1 epoch): 1h 58m with 8x A100 80G
  • Instruction Fine-Tuning (3 epochs): 12h 39m with 8x A100 80G
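
For a rough sense of compute cost, the wall-clock times above can be converted to GPU-hours. This is a sketch under the assumption that the reported times are wall-clock durations with all 8 GPUs in use throughout; `gpu_hours` is an illustrative helper name.

```python
def gpu_hours(hours: int, minutes: int, num_gpus: int = 8) -> float:
    """Convert a wall-clock duration on num_gpus devices into GPU-hours."""
    return num_gpus * (hours + minutes / 60)


pretraining = gpu_hours(1, 58)    # 1h 58m on 8 GPUs
fine_tuning = gpu_hours(12, 39)   # 12h 39m on 8 GPUs

print(f"Pre-training: {pretraining:.1f} GPU-hours")
print(f"Instruction fine-tuning: {fine_tuning:.1f} GPU-hours")
print(f"Total: {pretraining + fine_tuning:.1f} GPU-hours")
```

Under those assumptions the two stages come to roughly 15.7 and 101.2 GPU-hours, about 117 in total.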

Citation

BibTeX:

@misc{kweon2023publicly,
    title={Publicly Shareable Clinical Large Language Model Built on Synthetic Clinical Notes},
    author={Sunjun Kweon and Junu Kim and Jiyoun Kim and Sujeong Im and Eunbyeol Cho and Seongsu Bae and Jungwoo Oh and Gyubok Lee and Jong Hak Moon and Seng Chan You and Seungjin Baek and Chang Hoon Han and Yoon Bin Jung and Yohan Jo and Edward Choi},
    year={2023},
    eprint={2309.00237},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 44.85 |
| ARC (25-shot) | 55.89 |
| HellaSwag (10-shot) | 79.66 |
| MMLU (5-shot) | 52.38 |
| TruthfulQA (0-shot) | 40.76 |
| Winogrande (5-shot) | 72.69 |
| GSM8K (5-shot) | 0.15 |
| DROP (3-shot) | 12.42 |
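
The reported average appears to be the unweighted mean of the seven benchmark scores. The short check below recomputes it from the values listed above; the equal weighting is inferred from the numbers rather than stated in this card.

```python
# Benchmark scores copied from the results above.
scores = {
    "ARC (25-shot)": 55.89,
    "HellaSwag (10-shot)": 79.66,
    "MMLU (5-shot)": 52.38,
    "TruthfulQA (0-shot)": 40.76,
    "Winogrande (5-shot)": 72.69,
    "GSM8K (5-shot)": 0.15,
    "DROP (3-shot)": 12.42,
}

# Unweighted mean over all seven benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 44.85
```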