---
library_name: transformers
tags:
- trl
- sft
- quantization
- 4bit
- lora
---

# Model Card for Medical Transcription Model (Gemma-MedTr)

This model is a fine-tuned variant of `Gemma-2-2b` for medical transcription tasks. It was trained with Low-Rank Adaptation (LoRA) on a 4-bit quantized base model, which keeps memory requirements low. It handles transcription processing, keyword extraction, and medical specialty classification.

## Model Details

- **Developed by:** Harish Nair
- **Organization:** University of Ottawa
- **License:** Apache 2.0
- **Fine-tuned from:** [Gemma-2-2b](https://huggingface.co/google/gemma-2-2b)
- **Model type:** Transformer-based language model for medical transcription processing
- **Language(s):** English

### Training Details

- **Training Loss:** 1.4791 at step 10, the final training step
- **Training Configuration:**
  - LoRA with `r=8`, targeting specific transformer modules for adaptation.
  - 4-bit quantization using `nf4` quantization type and `bfloat16` compute precision.
- **Training Runtime:** 20.85 seconds, with approximately 1.92 samples processed per second.
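
As a rough illustration of the figures above, the sketch below shows why a rank of `r=8` keeps the number of trainable parameters small, and what the reported runtime and throughput imply about the amount of data seen. The layer shape is a hypothetical example chosen for illustration, not a value taken from this model card, and the helper name `lora_added_params` is likewise made up here.

```python
def lora_added_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) linear layer.

    LoRA learns two low-rank factors, A (r x d_in) and B (d_out x r),
    while the original weight stays frozen, so the count is r * (d_in + d_out).
    """
    return r * (d_in + d_out)


# Hypothetical layer shape, used only for illustration (assumption).
D_IN, D_OUT = 2304, 2048
RANK = 8  # r=8, as stated in the training configuration

frozen = D_IN * D_OUT  # parameters in the frozen base weight
added = lora_added_params(D_IN, D_OUT, RANK)
print(f"frozen: {frozen:,}  LoRA: {added:,}  share: {added / frozen:.2%}")

# Figures reported in the training details above.
runtime_s = 20.85
samples_per_s = 1.92

# Throughput times runtime gives the approximate number of samples processed.
approx_samples = round(runtime_s * samples_per_s)
print(f"approx. samples processed: {approx_samples}")
```

Under these assumptions the adapter for such a layer is well under 1% of the frozen weight's size, and the reported runtime corresponds to roughly 40 processed samples, which indicates a very short training run.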

## How to Use

To use this model, first load the base model and tokenizer with the same 4-bit quantization configuration described above:
```python
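import os

import pandas as pd
import torch  # needed for torch.bfloat16 below
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, PeftModel

# Gemma is a gated model, so loading it requires a Hugging Face read token.
# Assumption: the token is stored in the HF_TOKEN environment variable.
access_token_read = os.environ.get("HF_TOKEN")

# Base model that this checkpoint was fine-tuned from
model_id = "google/gemma-2-2b"

# 4-bit NF4 quantization with bfloat16 compute precision
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

tokenizer = AutoTokenizer.from_pretrained(model_id, token=access_token_read)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map='auto', token=access_token_read)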