YiDuo1999 committed
Commit c90d0c7 · verified · 1 Parent(s): 1cd211f

Update README.md

Files changed (1)
  1. README.md +26 -14
README.md CHANGED
@@ -31,21 +31,33 @@ def askme(question):
  response_text = tokenizer.batch_decode(outputs)[0].strip()
  answer = response_text.split('<|im_start|>assistant')[-1].strip()
  return answer
+ ```
  ## 🏆 Evaluation
  For question-answering tasks, we have

  | Model | MMLU-Medical | PubMedQA | MedMCQA | MedQA-4-Option | Avg |
- |:--------------------------------|:--------------|:----------|:---------|:----------------|:------|
- | Mistral-7B-instruct | 55.8 | 17.8 | 40.2 | 41.1 | 37.5 |
- | Zephyr-7B-instruct-β | 63.3 | 46.0 | 43.0 | 48.5 | 48.7 |
- | PMC-Llama-7B | 59.7 | 59.2 | 57.6 | 49.2 | 53.6 |
- | Medalpaca-13B | 55.2 | 50.4 | 21.2 | 20.2 | 36.7 |
- | AlpaCare-13B | 60.2 | 53.8 | 38.5 | 30.4 | 45.7 |
- | BioMedGPT-LM 7B | 52.0 | 58.6 | 34.9 | 39.3 | 46.2 |
- | Me-Llama-13B | - | 70.0 | 44.9 | 42.7 | - |
- | Llama-3-8B instruct | 82.0 | 74.6 | 57.1 | 60.3 | 68.5 |
- | JSL-Med-Sft-Llama-3-8B | 83.0 | 75.4 | 57.5 | 74.8 | 72.7 |
- | GPT-3.5-turbo-1106 | 74.0 | 72.6 | 34.9 | 39.3 | 60.6 |
- | GPT-4 | 85.5 | 69.2 | 69.5 | 83.9 | 77.0 |
- | Gemma-2-9b-int | 75.0 | 76.0 | 40.3 | 48.9 | 60.q |
- | Llama-3-physician-8B instruct | 80.0 | 76.0 | 80.2 | 60.3 | 74.1 |
+ |:-------------------------------|:-------------|:---------|:--------|:---------------|:-----|
+ | Mistral-7B-instruct | 55.8 | 17.8 | 40.2 | 41.1 | 37.5 |
+ | Zephyr-7B-instruct-β | 63.3 | 46.0 | 43.0 | 48.5 | 48.7 |
+ | PMC-Llama-7B | 59.7 | 59.2 | 57.6 | 49.2 | 53.6 |
+ | Medalpaca-13B | 55.2 | 50.4 | 21.2 | 20.2 | 36.7 |
+ | AlpaCare-13B | 60.2 | 53.8 | 38.5 | 30.4 | 45.7 |
+ | BioMedGPT-LM 7B | 52.0 | 58.6 | 34.9 | 39.3 | 46.2 |
+ | Me-Llama-13B | - | 70.0 | 44.9 | 42.7 | - |
+ | Llama-3-8B instruct | 82.0 | 74.6 | 57.1 | 60.3 | 68.5 |
+ | JSL-Med-Sft-Llama-3-8B | 83.0 | 75.4 | 57.5 | 74.8 | 72.7 |
+ | GPT-3.5-turbo-1106 | 74.0 | 72.6 | 34.9 | 39.3 | 60.6 |
+ | GPT-4 | 85.5 | 69.2 | 69.5 | 83.9 | 77.0 |
+ | Gemma-2-9b-int | 75.0 | 76.0 | 40.3 | 48.9 | 60.0 |
+ | Gemma-2-9b-Medical | 75.0 | 76.0 | 61.3 | 59.7 | 68.0 |
+ | Llama-3-physician-8B instruct | 80.0 | 76.0 | 80.2 | 60.3 | 74.1 |
+
+ ## Citation
+ ```
+ @inproceedings{Guo2024EfficientCP,
+ title={Efficient Continual Pre-training by Mitigating the Stability Gap},
+ author={Yiduo Guo and Jie Fu and Huishuai Zhang and Dongyan Zhao and Yikang Shen},
+ year={2024},
+ url={https://api.semanticscholar.org/CorpusID:270688100}
+ }
+ ```
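
The added ``` simply closes the fenced Python block around the README's `askme()` helper, whose last three decoding lines appear as context at the top of the hunk. For orientation, here is a minimal sketch of how that snippet plausibly fits together end to end, assuming the standard Hugging Face `transformers` API; the model id, chat-template call, and generation settings are placeholder assumptions for illustration and are not part of the commit:

```python
# Hedged sketch only: a self-contained guess at the full snippet whose last three
# lines appear as context in the hunk above. The model id and generation settings
# are placeholders, not taken from the commit.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "YiDuo1999/Llama-3-Physician-8B-Instruct"  # assumed repo id; adjust to the model card

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

def askme(question):
    # Build a chat-formatted prompt; the '<|im_start|>assistant' split below
    # implies a ChatML-style template on the tokenizer.
    messages = [{"role": "user", "content": question}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(input_ids, max_new_tokens=256, do_sample=False)
    # These three lines mirror the context lines shown in the diff.
    response_text = tokenizer.batch_decode(outputs)[0].strip()
    answer = response_text.split('<|im_start|>assistant')[-1].strip()
    return answer

print(askme("What are the common symptoms of iron-deficiency anemia?"))
```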