BiomedBERT Hash Nano
This is a 970K parameter BERT encoder-only model trained on data from PubMed. The raw data was transformed using PaperETL with the results stored as a local dataset via the Hugging Face Datasets library.
biomedbert-hash-nano is built with the BERT Hash architecture as described in the following links.
Usage
biomedbert-hash-nano can be loaded using Hugging Face Transformers as follows. Because this is a custom architecture, trust_remote_code must be set.
```python
from transformers import AutoModel

# trust_remote_code is required since this model uses a custom architecture
model = AutoModel.from_pretrained("neuml/biomedbert-hash-nano", trust_remote_code=True)
```
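As a quick check, the loaded model can be used to encode text. The snippet below is a minimal sketch, assuming the repository ships a compatible tokenizer and that the model returns standard BERT-style outputs; the example sentence is only illustrative.

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("neuml/biomedbert-hash-nano", trust_remote_code=True)
model = AutoModel.from_pretrained("neuml/biomedbert-hash-nano", trust_remote_code=True)

# Encode an example sentence and run a forward pass
inputs = tokenizer("Aspirin reduces the risk of myocardial infarction.", return_tensors="pt")
outputs = model(**inputs)

# Token-level hidden states: (batch size, sequence length, hidden size)
print(outputs.last_hidden_state.shape)
```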
The model is intended to be further fine-tuned for a specific task such as Text Classification, Entity Extraction, Sentence Embeddings and so on.
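For example, a text classification head can be attached with Transformers. The sketch below is illustrative rather than part of the model card: the label count is a placeholder, and it assumes the custom architecture registers a sequence classification class (which the run_glue evaluation below relies on).

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# num_labels is a placeholder; set it to the number of classes in your dataset
model = AutoModelForSequenceClassification.from_pretrained(
    "neuml/biomedbert-hash-nano",
    num_labels=5,
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("neuml/biomedbert-hash-nano", trust_remote_code=True)

# From here, fine-tune with the standard Trainer API or a custom PyTorch training loop
```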
Evaluation Results
The Medical Abstracts Text Classification Dataset was used to evaluate the model's performance. A handful of biomedical and general-purpose models were selected for comparison.
Metrics were generated using Hugging Face's standard run_glue script as shown below.
```bash
python run_glue.py \
  --model_name_or_path neuml/biomedbert-hash-nano \
  --dataset_name medclassify \
  --do_train \
  --do_eval \
  --max_seq_length 128 \
  --per_device_train_batch_size 32 \
  --learning_rate 1e-4 \
  --num_train_epochs 4 \
  --output_dir outputs \
  --trust_remote_code True
```
Note: The original dataset was saved locally as medclassify, with the condition_label column renamed to label to work more easily with the GLUE script.
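A minimal sketch of that preparation step with the datasets library follows. The file names are hypothetical, and how run_glue loads the saved dataset back is not covered by the card.

```python
from datasets import load_dataset

# Hypothetical file names; the card does not specify the original download location
dataset = load_dataset("csv", data_files={"train": "train.csv", "validation": "test.csv"})

# Rename the label column so the GLUE script picks it up automatically
dataset = dataset.rename_column("condition_label", "label")

# Save locally under the name referenced by --dataset_name above
dataset.save_to_disk("medclassify")
```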
| Model | Parameters | Accuracy | Loss |
|---|---|---|---|
| biomedbert-hash-nano | 0.969M | 0.6195 | 0.9464 |
| bert-hash-nano | 0.969M | 0.5045 | 1.2192 |
| bert-base-uncased | 110M | 0.6118 | 0.9712 |
| biomedbert-base | 110M | 0.6195 | 0.9037 |
| ModernBERT-base | 149M | 0.5672 | 1.1079 |
| BioClinical-ModernBERT-base | 149M | 0.5679 | 1.0915 |
Despite having under 1M parameters, this model matches the accuracy of biomedbert-base (110M parameters) and exceeds the accuracy of every other model evaluated on this challenging dataset.