File size: 1,812 Bytes
6bba74a
 
 
 
 
 
 
 
 
3cc44b6
 
 
a24a287
 
6bba74a
3cc44b6
 
6bba74a
a24a287
3cc44b6
a24a287
3cc44b6
 
 
a24a287
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
aa2e8ac
a24a287
 
 
 
 
 
 
6bba74a
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
---
license: apache-2.0
datasets:
- harishnair04/mtsamples
language:
- en
base_model:
- google/gemma-2-2b
pipeline_tag: text-generation
tags:
- trl
- sft
- quantization
- 4bit
- LoRA
---


# Model Card for Medical Transcription Model (Gemma-MedTr)

This model is a fine-tuned variant of `Gemma-2-2b`, optimized for medical transcription tasks with efficient 4-bit quantization and Low-Rank Adaptation (LoRA). It handles transcription processing, keyword extraction, and medical specialty classification.

## Model Details

- **Developed by:** Harish Nair
- **Organization:** University of Ottawa
- **License:** Apache 2.0
- **Fine-tuned from:** [Gemma-2-2b](https://huggingface.co/google/gemma-2-2b)
- **Model type:** Transformer-based language model for medical transcription processing
- **Language(s):** English

### Training Details

- **Training Loss:** Final training loss at step 10: 1.4791
- **Training Configuration:** 
  - LoRA with `r=8`, targeting specific transformer modules for adaptation.
  - 4-bit quantization using `nf4` quantization type and `bfloat16` compute precision.
- **Training Runtime:** 20.85 seconds, with approximately 1.92 samples processed per second.

## How to Use

To load and use this model, initialize it with the following configuration:
```python
import pandas as pd
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, PeftModel

model_id = "harishnair04/Gemma-medtr-2b-sft"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

tokenizer = AutoTokenizer.from_pretrained(model_id, token=access_token_read)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map='auto', token=access_token_read)