🧠 Korean Medical LLM (QA-Finetuned) by Healthcare AI Research Institute of Seoul National University Hospital

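Jun-y00/hari-q3-bnb-4bit is a BitsAndBytes 4-bit quantized version of the Korean-language medical LLM developed by the Healthcare AI Research Institute (HARI) of Seoul National University Hospital. Its main purpose is medical question answering (QA) and clinical reasoning support.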
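
To make the 4-bit quantization concrete, here is a minimal sketch of how a full-precision checkpoint can be quantized with BitsAndBytes while it is loaded through transformers. The helper names `approx_weight_gb` and `load_4bit` are hypothetical, and the `nf4` quantization type and bfloat16 compute dtype are assumptions, not necessarily the settings used to produce this repository. The size estimate assumes roughly 14 billion parameters (suggested by the Qwen3-14B-Base name) and counts weights only, ignoring quantization constants, unquantized layers, activations, and the KV cache.

```python
def approx_weight_gb(n_params: float, bits_per_weight: int) -> float:
    """Rough weight-only footprint in GB (1 GB = 10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def load_4bit(repo_id: str = "snuh/hari-q3"):
    """Quantize a full-precision checkpoint to 4 bits while loading it.

    Needs torch, transformers, accelerate and bitsandbytes, plus a CUDA GPU.
    """
    # Imported lazily so approx_weight_gb works without these packages.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",              # assumed; "fp4" is the alternative
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
    )
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        quantization_config=bnb_config,
        device_map="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    return model, tokenizer
```

Under these assumptions, `approx_weight_gb(14e9, 16)` gives about 28 GB for 16-bit weights and `approx_weight_gb(14e9, 4)` about 7 GB for 4-bit weights. A repository that already stores 4-bit weights normally carries its quantization settings in its config, so it can usually be loaded with a plain `from_pretrained` call once bitsandbytes is installed.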

🚀 Model Overview

Model Name: Jun-y00/hari-q3-bnb-4bit
Architecture: Large Language Model (LLM)
Fine-tuning Objective: Medical QA (Question–Answer) style generation
Primary Language: English, Korean
Domain: Clinical Medicine
Performance: Achieves 84.14% accuracy on the Doctor subset of KorMedMCQA, a Korean Medical Licensing Examination (KMLE)-style benchmark, in reasoning mode
Key Applications:
- Clinical decision support (QA-style)
- Medical education and self-assessment tools
- Automated medical reasoning and documentation aid

📊 Training Data & Benchmark

This model was fine-tuned using a curated corpus of Korean medical QA-style data derived from publicly available, de-identified sources. The training data includes clinical guidelines, academic publications, exam-style questions, and synthetic prompts reflecting real-world clinical reasoning.

Training Data Characteristics:
- Focused on Korean-language question–answering formats relevant to clinical settings.
- Includes guideline-derived questions, de-identified case descriptions, and physician-crafted synthetic queries.
- Designed to reflect realistic diagnostic, therapeutic, and decision-making scenarios.

Benchmark Evaluation:
- KMLE-style QA benchmark (KorMedMCQA), accuracy by subset
- Non-reasoning mode
  - Doctor: 70.57%
  - Nurse: 81.66%
  - Pharm: 76.61%
  - Dentist: 62.27%
- Reasoning mode
  - Doctor: 84.14%
  - Nurse: 88.50%
  - Pharm: 85.42%
  - Dentist: 68.56%
- All evaluations were conducted on de-identified, non-clinical test sets, with no real patient data involved.
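
The effect of reasoning mode is easier to see as arithmetic. The sketch below copies the accuracies listed above and computes the per-subset gain and an unweighted average over the four subsets. The average is an illustrative derived number, not a figure reported by the authors, and the names `SCORES`, `macro_average`, and `reasoning_gain` are hypothetical.

```python
# Accuracy (%) per KorMedMCQA subset, copied from the list above.
SCORES = {
    "non-reasoning": {"Doctor": 70.57, "Nurse": 81.66, "Pharm": 76.61, "Dentist": 62.27},
    "reasoning": {"Doctor": 84.14, "Nurse": 88.50, "Pharm": 85.42, "Dentist": 68.56},
}


def macro_average(subset_scores: dict) -> float:
    """Unweighted mean over subsets (ignores how many questions each has)."""
    return sum(subset_scores.values()) / len(subset_scores)


def reasoning_gain(scores: dict = SCORES) -> dict:
    """Percentage-point difference between reasoning and non-reasoning mode."""
    base, reasoning = scores["non-reasoning"], scores["reasoning"]
    return {name: round(reasoning[name] - base[name], 2) for name in base}


if __name__ == "__main__":
    for mode, subset_scores in SCORES.items():
        print(f"{mode}: macro average {macro_average(subset_scores):.2f}%")
    print(reasoning_gain())
```

On these figures the largest gain is on the Doctor subset (13.57 points) and the smallest on the Dentist subset (6.29 points).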

⚠️ These benchmarks are provided for research purposes only and do not imply clinical safety or efficacy.

🔐 Privacy & Ethical Compliance

We strictly adhere to ethical AI development and privacy protection:

✅ The model was trained exclusively on publicly available and de-identified data.
🔒 It does not include any real patient data or personally identifiable information (PII).
⚖️ Designed for safe, responsible, and research-oriented use in healthcare AI.

⚠️ This model is intended for research and educational purposes only and should not be used to make clinical decisions.

🏥 About HARI – Healthcare AI Research Institute

The Healthcare AI Research Institute (HARI) is a pioneering research group within Seoul National University Hospital, driving innovation in medical AI.

🌍 Vision & Mission

Vision: Shaping a sustainable and healthy future through pioneering AI research.
Mission:
- Develop clinically useful, trustworthy AI technologies.
- Foster cross-disciplinary collaboration in medicine and AI.
- Lead global healthcare AI commercialization and policy frameworks.
- Educate the next generation of AI-powered medical professionals.

🧪 Research Platforms & Infrastructure

Platforms: SUPREME, SNUHUB, DeView, VitalDB, NSTRI Global Data Platform
Computing: NVIDIA H100 / A100 GPUs, Quantum AI Infrastructure
Projects:
- Clinical note summarization
- AI-powered diagnostics
- EHR automation
- Real-time monitoring via AI pipelines

🎓 AI Education Programs

Basic AI for Healthcare: Designed for clinicians and students
Advanced AI Research: Targeting senior researchers and specialists in clinical AI validation and deep learning

🤝 Collaborate with Us

We welcome collaboration with:

- AI research institutions and medical universities
- Healthcare startups and technology partners
- Policymakers shaping AI regulation in medicine

📧 Contact: help-ds@snuh.org
🌐 Website: Seoul National University Hospital

🤗 Model Usage Example

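from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load tokenizer and model.
# "snuh/hari-q3" is the original fine-tuned checkpoint. To run this 4-bit build,
# set model_name to this repository's ID instead (requires the bitsandbytes package).
model_name = "snuh/hari-q3"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",  # use the dtype stored in the checkpoint
    device_map="auto"    # place the weights on the available devices
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# The prompt is kept in Korean, the language the model was fine-tuned on.
# Instruction (English gloss): You are a capable, trustworthy Korean-language
#   medical assistant with clinical knowledge. Give the possible diagnoses based
#   on accurate, careful clinical reasoning. Weigh every clue (age, symptoms,
#   test results, pain location) and present both the reasoning and the diagnosis.
#   Use precise medical terms, adding lay explanations where helpful.
# Question (English gloss): A 60-year-old man presents with abdominal pain and
#   fever. Blood tests show an elevated white blood cell count, and there is
#   right lower quadrant tenderness. What is the most likely diagnosis?
prompt = '''
### Instruction:
당신은 임상 지식을 갖춘 유능하고 신뢰할 수 있는 한국어 기반 의료 어시스턴트입니다.
사용자의 질문에 대해 정확하고 신중한 임상 추론을 바탕으로 진단 가능성을 제시해 주세요.
반드시 환자의 연령, 증상, 검사 결과, 통증 부위 등 모든 단서를 종합적으로 고려하여 추론 과정과 진단명을 제시해야 합니다.
의학적으로 정확한 용어를 사용하되, 필요하다면 일반인이 이해하기 쉬운 용어도 병행해 설명해 주세요.
### Question:
60세 남성이 복통과 발열을 호소하며 내원하였습니다.
혈액 검사 결과 백혈구 수치가 상승했고, 우측 하복부 압통이 확인되었습니다.
가장 가능성이 높은 진단명은 무엇인가요?
'''.strip()

# Wrap the prompt as a single user turn and render it with the chat template.
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # passed to the chat template; Qwen3-style templates use it to enable the reasoning trace
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate, then drop the prompt tokens so only the new tokens are decoded.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096
)
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)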
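
With `enable_thinking=True`, the decoded response may contain the reasoning trace followed by the final answer. The sketch below separates the two, assuming the text keeps Qwen3-style `<think>...</think>` markers after decoding. That assumption has not been verified for this checkpoint, and `split_thinking` is a hypothetical helper, not part of transformers.

```python
def split_thinking(response: str, end_tag: str = "</think>"):
    """Return (reasoning, answer) from a decoded model response.

    Assumes the reasoning trace, if present, ends with `end_tag`.
    """
    reasoning, sep, answer = response.partition(end_tag)
    if not sep:
        # No closing tag found: treat the whole text as the answer.
        return "", response.strip()
    # Drop the opening tag if it survived decoding.
    reasoning = reasoning.replace("<think>", "", 1).strip()
    return reasoning, answer.strip()


if __name__ == "__main__":
    demo = "<think>\nRight lower quadrant tenderness with fever.\n</think>\n\nAcute appendicitis"
    reasoning, answer = split_thinking(demo)
    print("Reasoning:", reasoning)
    print("Answer:", answer)
```

Applied to the example above, `split_thinking(response)` would let an application log the reasoning separately and show only the answer.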

📄 License

Apache 2.0 License – Free for research and commercial use with attribution.

📢 Citation

If you use this model in your work, please cite:

@misc{hari-q3,
    title  = {hari-q3},
    url    = {https://huggingface.co/snuh/hari-q3},
    author = {Healthcare AI Research Institute(HARI) of Seoul National University Hospital(SNUH)},
    month  = {May},
    year   = {2025}
}

🚀 Together, we are shaping the future of AI-driven healthcare.

Safetensors model size: 15B params
Tensor type: F32, F16

Model tree for Jun-y00/hari-q3-bnb-4bit-hf

Base model: Qwen/Qwen3-14B-Base
Finetuned: snuh/hari-q3
Quantized: this model