The main objective of this model is to enhance performance in tasks related to medical dialogue and question-answering.

- **Developed by:** [Writer](https://writer.com/)
- **Model type:** Causal decoder-only
- **Language(s) (NLP):** English
- **License:** Apache 2.0
- **Finetuned from model:** [Palmyra-20B](https://huggingface.co/Writer/palmyra-large)

### Model Source

[Palmyra-Med: Instruction-Based Fine-Tuning of LLMs Enhancing Medical Domain Performance](https://dev.writer.com/docs/palmyra-med-instruction-based-fine-tuning-of-llms-enhancing-medical-domain-performance)

## Uses

### Out-of-Scope Use

Production use without adequate assessment of risks and mitigation; any use cases which may be considered irresponsible or harmful.

## Bias, Risks, and Limitations

Palmyra-Med-20B is trained mostly on English data and will not generalize appropriately to other languages. Furthermore, because it is trained on large-scale corpora representative of the web, it carries the stereotypes and biases commonly encountered online.

### Recommendations

We recommend that users of Palmyra-Med-20B develop guardrails and take appropriate precautions for any production use.

## Usage

The model is compatible with the Hugging Face `AutoModelForCausalLM` API and can easily be run on a single 40 GB A100, for example:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Writer/palmyra-med-20b"  # Hugging Face Hub repository id

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    torch_dtype=torch.float16,
)

# Example prompt; replace with your own medical question.
prompt = "Has the use of vaccines reduced the incidence of disease?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

generated = model.generate(**inputs, max_new_tokens=256)
output = tokenizer.decode(
    generated[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True,
)

print(output)
# The use of vaccines has led to a significant reduction in the incidence and severity of many diseases, including measles, mumps, rubella, and polio.
```

## Dataset

For the fine-tuning of our LLMs, we used a custom-curated medical dataset that combines data from two publicly available sources: PubMedQA (Jin et al. 2019) and MedQA (Zhang et al. 2018). The PubMedQA dataset, which originated from the PubMed abstract database, consists of biomedical articles accompanied by corresponding question-answer pairs. In contrast, the MedQA dataset features medical questions and answers that are designed to assess the reasoning capabilities of medical question-answering systems.

We prepared our custom dataset by merging and processing data from the aforementioned sources, maintaining the mixture ratios detailed in the table below. These ratios were kept consistent for fine-tuning both the Palmyra-20B and Palmyra-40B models. Upon fine-tuning the models with this dataset, we refer to the resulting models as Palmyra-Med-20B and Palmyra-Med-40B, respectively.

| Dataset  | Ratio | Count   |
| -------- | ----- | ------- |
| PubMedQA | 75%   | 150,000 |
| MedQA    | 25%   | 10,178  |
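The mixture above can be reproduced at a sketch level by sampling from the two processed datasets according to their ratios. The `mix` helper and the in-memory stand-ins for the loaded datasets below are illustrative assumptions, not part of the released training code:

```python
import random

# Stand-ins for the processed question-answer pairs; in practice these
# would be loaded from the prepared PubMedQA / MedQA files.
pubmedqa = [{"source": "pubmedqa", "id": i} for i in range(150_000)]
medqa = [{"source": "medqa", "id": i} for i in range(10_178)]

def mix(datasets_with_ratios, n_examples, seed=0):
    """Sample a fine-tuning mixture according to per-dataset ratios."""
    rng = random.Random(seed)
    mixture = []
    for data, ratio in datasets_with_ratios:
        k = round(n_examples * ratio)
        # Sample with replacement so the smaller dataset (MedQA) can
        # still fill its share of a large mixture.
        mixture.extend(rng.choices(data, k=k))
    rng.shuffle(mixture)
    return mixture

# 75% PubMedQA / 25% MedQA, as in the table above.
blend = mix([(pubmedqa, 0.75), (medqa, 0.25)], n_examples=20_000)
```

Sampling with replacement is one simple way to honor a fixed ratio when the source datasets differ in size; the actual pipeline may instead repeat or truncate each source.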

## Evaluation

We present the findings of our experiments, beginning with the evaluation outcomes of the fine-tuned models, followed by a discussion of the base models’ performance on each of the evaluation datasets. Additionally, we report the progressive improvement of the Palmyra-Med-40B model throughout the training process on the PubMedQA dataset.

| Model           | PubMedQA | MedQA |
| --------------- | -------- | ----- |
| Palmyra-20B     | 49.8     | 31.2  |
| Palmyra-40B     | 64.8     | 43.1  |
| Palmyra-Med-20B | 75.6     | 44.6  |
| Palmyra-Med-40B | 81.1     | 72.4  |
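The numbers above are accuracy scores in percent. The card does not include the evaluation harness; as a minimal illustration, scores of this kind can be computed as case-insensitive exact match against the gold answers (PubMedQA labels are yes / no / maybe):

```python
def accuracy(predictions, references):
    """Percentage of case-insensitive exact matches against gold answers."""
    assert len(predictions) == len(references)
    correct = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return 100.0 * correct / len(references)

# Toy example with PubMedQA-style labels.
preds = ["yes", "no", "maybe", "Yes"]
golds = ["yes", "no", "no", "yes"]
print(accuracy(preds, golds))  # 75.0
```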

## Limitation

The model may not operate efficiently outside the healthcare field. Because it has not been tested in practical scenarios, its real-time efficacy and precision remain undetermined. Under no circumstances should it replace the advice of a medical professional, and it must be regarded solely as a research tool.

## Citation and Related Information

To cite this model:

```bibtex
@misc{Palmyra-Med-20B,
  author       = {Writer Engineering team},
  title        = {{Palmyra-Large Parameter Autoregressive Language Model}},
  howpublished = {\url{https://dev.writer.com}},
  year         = 2023,
  month        = March
}
```

## Contact

Hello@writer.com