dmis-lab/ANGEL_bc5cdr

dmis-lab committed on Sep 11

Commit 882c896 (1 parent: 9fda577)

Update README.md

README.md +103 -3

@@ -1,3 +1,103 @@
----
-license: gpl-3.0
----

+---
+license: gpl-3.0
+language:
+- en
+metrics:
+- accuracy
+base_model: dmis-lab/ANGEL_pretrained
+---
+# Model Card for ANGEL_bc5cdr
+This model card describes ANGEL_bc5cdr, a generative model for biomedical entity linking.
+# Model Details
+#### Model Description
+- **Developed by:** Chanhwi Kim, Hyunjae Kim, Sihyeon Park, Jiwoo Lee, Mujeen Sung, Jaewoo Kang
+- **Model type:** Generative Biomedical Entity Linking Model
+- **Language(s):** English
+- **License:** GPL-3.0
+- **Finetuned from model:** dmis-lab/ANGEL_pretrained (BART-large architecture)
+#### Model Sources
+- **Github Repository:** https://github.com/dmis-lab/ANGEL
+- **Paper:** https://arxiv.org/pdf/2408.16493
+# Direct Use
+ANGEL_bc5cdr is designed for biomedical entity linking, with a focus on identifying and linking disease mentions in the BC5CDR dataset.
+To use this model, you need to set up a virtual environment and the inference code.
+Start by cloning our [ANGEL GitHub repository](https://github.com/dmis-lab/ANGEL).
+Then, run the following script to set up the environment:
+```bash
+bash script/environment/set_environment.sh
+```
+No preprocessing is required to run the model on a single sample.
+Simply execute the `run_sample.sh` script:
+```bash
+bash script/inference/run_sample.sh bc5cdr
+```
+To modify the sample with your own example, refer to the [Direct Use](https://github.com/dmis-lab/ANGEL?tab=readme-ov-file#direct-use) section in our GitHub repository.
+If you're interested in training or evaluating the model, check out the [Fine-tuning](https://github.com/dmis-lab/ANGEL?tab=readme-ov-file#fine-tuning) section and [Evaluation](https://github.com/dmis-lab/ANGEL?tab=readme-ov-file#evaluation) section.
+# Training
+#### Training Data
+The model was trained on the BC5CDR dataset, which includes annotated disease entities.
+#### Training Procedure
+- **Positive-only Pre-training:** The model is first trained using only positive examples, following the standard approach.
+- **Negative-aware Training:** The model is then trained with negative examples as well, to improve its discriminative capabilities.
+# Evaluation
+### Testing Data
+The model was evaluated on the BC5CDR dataset.
+### Metrics
+**Accuracy at Top-1 (Acc@1):** The percentage of mentions for which the model's top-ranked prediction matches the correct entity.
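+As a rough illustration of how Acc@1 can be computed, the sketch below scores ranked predictions against gold entity IDs. The function name `accuracy_at_1`, the data layout, and the placeholder IDs are assumptions made for this example and are not part of the ANGEL codebase; see the Evaluation section of the GitHub repository for the actual procedure.
+```python
+def accuracy_at_1(predictions, gold):
+    """Return the percentage of mentions whose top-ranked prediction is correct.
+
+    predictions: list of ranked candidate-ID lists, one list per mention.
+    gold: list of sets of acceptable entity IDs, one set per mention.
+    """
+    if len(predictions) != len(gold):
+        raise ValueError("predictions and gold must have the same length")
+    if not predictions:
+        return 0.0
+    # A mention counts as a hit only if its first candidate is a gold ID.
+    hits = sum(
+        1
+        for ranked, answers in zip(predictions, gold)
+        if ranked and ranked[0] in answers
+    )
+    return 100.0 * hits / len(predictions)
+
+
+# Toy data with placeholder IDs (hypothetical, not taken from BC5CDR).
+preds = [["E1", "E2"], ["E3"], ["E4", "E5"], ["E6"]]
+gold = [{"E1"}, {"E3"}, {"E5"}, {"E6"}]
+print(accuracy_at_1(preds, gold))  # 3 of 4 top predictions are correct: 75.0
+```
+Only the first candidate of each ranked list is checked, so the third mention above counts as a miss even though its gold ID appears in second place.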
+### Scores
+<table border="1" cellspacing="0" cellpadding="5" style="width: 100%; text-align: center; border-collapse: collapse; margin-left: 0;">
+  <thead>
+    <tr>
+      <th><b>Dataset</b></th>
+      <th><b>BioSYN</b><br>(Sung et al., 2020)</th>
+      <th><b>SapBERT</b><br>(Liu et al., 2021)</th>
+      <th><b>GenBioEL</b><br>(Yuan et al., 2022b)</th>
+      <th><b>ANGEL<br>(Ours)</b></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><b>BC5CDR</b></td>
+      <td>-</td>
+      <td>-</td>
+      <td>93.1</td>
+      <td><b>94.5</b></td>
+    </tr>
+  </tbody>
+</table>
+The GenBioEL score is a reproduced result.
+We omit the scores of BioSYN and SapBERT because they were evaluated separately on the chemical and disease subsets, which differs from our setting.
+# Citation
+If you use the ANGEL_bc5cdr model, please cite:
+```bibtex
+@article{kim2024learning,
+  title={Learning from Negative Samples in Generative Biomedical Entity Linking},
+  author={Kim, Chanhwi and Kim, Hyunjae and Park, Sihyeon and Lee, Jiwoo and Sung, Mujeen and Kang, Jaewoo},
+  journal={arXiv preprint arXiv:2408.16493},
+  year={2024}
+}
+```
+# Contact
+For questions or issues, please contact chanhwi_kim@korea.ac.kr.