---
base_model:
- Undi95/Llama-3-Unholy-8B
- Locutusque/llama-3-neural-chat-v1-8b
- ruslanmv/Medical-Llama3-8B-16bit
library_name: transformers
tags:
- mergekit
- merge
license: other
datasets:
- mlabonne/orpo-dpo-mix-40k
- Open-Orca/SlimOrca-Dedup
- jondurbin/airoboros-3.2
- microsoft/orca-math-word-problems-200k
- m-a-p/Code-Feedback
- MaziyarPanahi/WizardLM_evol_instruct_V2_196k
- ruslanmv/ai-medical-chatbot
model-index:
  - name: Medichat-Llama3-8B
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 59.13
            name: normalized accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/Medichat-Llama3-8B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 82.90
            name: normalized accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/Medichat-Llama3-8B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 60.35
            name: accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/Medichat-Llama3-8B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 49.65
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/Medichat-Llama3-8B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 78.93
            name: accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/Medichat-Llama3-8B
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 60.35
            name: accuracy
        source:
          url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=sethuiyer/Medichat-Llama3-8B
          name: Open LLM Leaderboard

---

### Medichat-Llama3-8B

Built upon the powerful Llama-3 architecture and fine-tuned on an extensive dataset of health information, this model leverages its vast medical knowledge to offer clear, comprehensive answers.


The following mergekit YAML configuration was used to produce this model:

```yaml
models:
  - model: Undi95/Llama-3-Unholy-8B
    parameters:
      weight: [0.25, 0.35, 0.45, 0.35, 0.25]  # per-layer-range merge weights
      density: [0.1, 0.25, 0.5, 0.25, 0.1]    # fraction of delta parameters retained (DARE)
  - model: Locutusque/llama-3-neural-chat-v1-8b
  - model: ruslanmv/Medical-Llama3-8B-16bit
    parameters:
      weight: [0.55, 0.45, 0.35, 0.45, 0.55]
      density: [0.1, 0.25, 0.5, 0.25, 0.1]
merge_method: dare_ties
base_model: Locutusque/llama-3-neural-chat-v1-8b
parameters:
  int8_mask: true
dtype: bfloat16
```
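
Assuming mergekit is installed (`pip install mergekit`) and the recipe above is saved as `config.yaml` (both the filename and the output path below are illustrative), the merge can be reproduced with mergekit's library API. This is a sketch; check it against the mergekit version you have installed:

```python
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Parse the DARE-TIES recipe shown above
with open("config.yaml", "r", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

# Run the merge and write the merged weights to the output directory
run_merge(
    merge_config,
    out_path="./Medichat-Llama3-8B",
    options=MergeOptions(cuda=True),  # cuda=False to merge on CPU (slower)
)
```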

### Comparison Against Dr. Samantha 7B

| Subject                 | Medichat-Llama3-8B Accuracy (%) | Dr. Samantha Accuracy (%) |
|-------------------------|---------------------------------|---------------------------|
| Clinical Knowledge      | 71.70                           | 52.83                     |
| Medical Genetics        | 78.00                           | 49.00                     |
| Human Aging             | 70.40                           | 58.29                     |
| Human Sexuality         | 73.28                           | 55.73                     |
| College Medicine        | 62.43                           | 38.73                     |
| Anatomy                 | 64.44                           | 41.48                     |
| College Biology         | 72.22                           | 52.08                     |
| High School Biology     | 77.10                           | 53.23                     |
| Professional Medicine   | 63.97                           | 38.73                     |
| Nutrition               | 73.86                           | 50.33                     |
| Professional Psychology | 68.95                           | 46.57                     |
| Virology                | 54.22                           | 41.57                     |
| High School Psychology  | 83.67                           | 66.60                     |
| **Average**             | **70.33**                       | **48.85**                 |


The current model demonstrates a substantial improvement over the previous [Dr. Samantha](https://huggingface.co/sethuiyer/Dr_Samantha-7b) model in terms of subject-specific knowledge and accuracy.
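
The subject scores above correspond to MMLU subsets. One way to reproduce per-subject numbers like these is EleutherAI's lm-evaluation-harness; the sketch below assumes lm-eval v0.4-style task names and is illustrative, not necessarily the exact harness used for the table:

```python
from lm_eval import simple_evaluate

# Evaluate a few of the medical MMLU subsets (5-shot, as in the table)
results = simple_evaluate(
    model="hf",
    model_args="pretrained=sethuiyer/Medichat-Llama3-8B,dtype=bfloat16",
    tasks=["mmlu_clinical_knowledge", "mmlu_medical_genetics", "mmlu_anatomy"],
    num_fewshot=5,
)
for task, metrics in results["results"].items():
    print(task, metrics)
```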

### Usage:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("sethuiyer/Medichat-Llama3-8B")
model = AutoModelForCausalLM.from_pretrained("sethuiyer/Medichat-Llama3-8B").to("cuda")

# Format the question with the model's chat template and generate a response
def askme(question):
    sys_message = ''' 
    You are an AI Medical Assistant trained on a vast dataset of health information. Please be thorough and
    provide an informative answer. If you don't know the answer to a specific medical inquiry, advise seeking professional help.
    '''

    # Create messages structured for the chat template
    messages = [{"role": "system", "content": sys_message}, {"role": "user", "content": question}]

    # Applying chat template
    prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=512, use_cache=True)  # Adjust max_new_tokens for longer responses

    # Decode only the newly generated tokens, skipping the prompt and special tokens
    answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True).strip()
    return answer

# Example usage
question = '''
Symptoms:
Dizziness, headache and nausea.

What is the differential diagnosis?
'''
print(askme(question))
```
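
For GPUs with limited memory, the model can also be loaded in 4-bit via transformers' bitsandbytes integration. This is a sketch assuming `bitsandbytes` and `accelerate` are installed; the `askme` function above works unchanged:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization with bf16 compute to cut memory use roughly in half vs. fp16
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("sethuiyer/Medichat-Llama3-8B")
model = AutoModelForCausalLM.from_pretrained(
    "sethuiyer/Medichat-Llama3-8B",
    quantization_config=quant_config,
    device_map="auto",  # let accelerate place layers on available devices
)
```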