LoNAS Model Card: lonas-bert-base-glue

This card describes the super-networks obtained by fine-tuning BERT-base on the GLUE benchmark with LoNAS.

Model Details

Information

Adapter Configuration

  • LoRA rank: 8
  • LoRA alpha: 16
  • LoRA target modules: query, value
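
As a rough illustration of what this configuration implies, the sketch below counts the parameters that rank-8 adapters on the query and value projections would add. It assumes the usual BERT-base dimensions (hidden size 768, 12 layers) and a base model of about 110M parameters; none of these figures are stated in this card, and the count ignores any task-specific classification head.

```shell
# Rough LoRA parameter count for BERT-base.
# Assumed (not from this model card): hidden size 768, 12 layers, ~110M base parameters.
HIDDEN=768
LAYERS=12
RANK=8
ALPHA=16
TARGETS=2   # query and value projections

# Each adapted linear layer adds two matrices: A (rank x hidden) and B (hidden x rank).
lora_params=$(( LAYERS * TARGETS * RANK * (HIDDEN + HIDDEN) ))

# LoRA scales the low-rank update by alpha / rank.
scaling=$(( ALPHA / RANK ))

# Share of the assumed 110M base parameters, in percent.
ratio=$(awk -v p="$lora_params" 'BEGIN { printf "%.2f", 100 * p / 110000000 }')

echo "LoRA parameters: ${lora_params}"
echo "Scaling (alpha / rank): ${scaling}"
echo "Approximate trainable ratio: ${ratio}%"
```

Under these assumptions the estimate lands close to the 0.27% trainable parameter ratio reported in the evaluation results, although the authors' exact accounting may differ.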

Training and Evaluation

GLUE benchmark

Training Hyperparameters

Task            RTE    MRPC   STS-B  CoLA   SST-2  QNLI   QQP    MNLI
Epochs          80     35     60     80     60     80     60     40
Batch size      32     32     64     64     64     64     64     64
Learning rate   3e-4   5e-4   5e-4   3e-4   3e-4   4e-4   3e-4   4e-4
Max length      128    128    128    128    128    256    128    128
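
When scripting runs across tasks, the per-task settings in the table above can be kept in a small lookup. The helper below is a hypothetical convenience, not part of the LoNAS repository, and it spells task names as they appear in the table, which may differ from the identifiers the training script expects.

```shell
# Print "epochs batch_size learning_rate max_length" for a GLUE task,
# using the values from the training hyperparameter table.
# glue_hparams is a hypothetical helper introduced for illustration.
glue_hparams() {
    case "$1" in
        RTE)   echo "80 32 3e-4 128" ;;
        MRPC)  echo "35 32 5e-4 128" ;;
        STS-B) echo "60 64 5e-4 128" ;;
        CoLA)  echo "80 64 3e-4 128" ;;
        SST-2) echo "60 64 3e-4 128" ;;
        QNLI)  echo "80 64 4e-4 256" ;;
        QQP)   echo "60 64 3e-4 128" ;;
        MNLI)  echo "40 64 4e-4 128" ;;
        *)     echo "unknown task: $1" >&2; return 1 ;;
    esac
}

# Split the four fields into named variables for one task.
set -- $(glue_hparams QNLI)
EPOCHS=$1
BATCH_SIZE=$2
LEARNING_RATE=$3
MAX_LENGTH=$4
echo "QNLI: epochs=${EPOCHS} batch=${BATCH_SIZE} lr=${LEARNING_RATE} max_length=${MAX_LENGTH}"
```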

How to use

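See https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS/running_commands for the per-task commands. The example below evaluates the super-network for a given task and searches it for a sub-network: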

CUDA_VISIBLE_DEVICES=${DEVICES} python run_glue.py \
    --task_name ${TASK} \
    --model_name_or_path bert-base-uncased \
    --do_eval \
    --do_search \
    --per_device_eval_batch_size 64 \
    --max_seq_length ${MAX_LENGTH} \
    --lora \
    --lora_weights lonas-bert-base-glue/lonas-bert-base-${TASK} \
    --nncf_config nncf_config/glue/nncf_lonas_bert_base_${TASK}.json \
    --output_dir lonas-bert-base-glue/lonas-bert-base-${TASK}/results
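
To cover all eight tasks, the command above can be generated in a loop. The following is a dry-run sketch that only prints each command rather than running it. The helper names are hypothetical, and the task spellings are taken from the hyperparameter table; the identifiers actually accepted by run_glue.py and used in the weight and config file names are an assumption and should be checked against the repository.

```shell
# Dry-run sketch: print the search command for each GLUE task.
# max_length_for and build_search_cmd are hypothetical helpers.
DEVICES=0

max_length_for() {
    # QNLI uses 256 in the hyperparameter table; every other task uses 128.
    if [ "$1" = "QNLI" ]; then echo 256; else echo 128; fi
}

build_search_cmd() {
    task=$1
    echo "CUDA_VISIBLE_DEVICES=${DEVICES} python run_glue.py" \
        "--task_name ${task}" \
        "--model_name_or_path bert-base-uncased" \
        "--do_eval --do_search" \
        "--per_device_eval_batch_size 64" \
        "--max_seq_length $(max_length_for "$task")" \
        "--lora" \
        "--lora_weights lonas-bert-base-glue/lonas-bert-base-${task}" \
        "--nncf_config nncf_config/glue/nncf_lonas_bert_base_${task}.json" \
        "--output_dir lonas-bert-base-glue/lonas-bert-base-${task}/results"
}

for TASK in RTE MRPC STS-B CoLA SST-2 QNLI QQP MNLI; do
    build_search_cmd "$TASK"
done
```

Printing first makes it easy to inspect the paths before launching anything; piping the output to a shell would execute the commands once they look right.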

Evaluation Results

Results of the optimal sub-network discovered from the super-network:

Method  Trainable Parameter Ratio  GFLOPs  RTE    MRPC   STS-B  CoLA   SST-2  QNLI   QQP    MNLI   AVG
LoRA    0.27%                      11.2    65.85  84.46  88.73  57.58  92.06  90.62  89.41  83.00  81.46
LoNAS   0.27%                      8.0     70.76  88.97  88.28  61.12  93.23  91.21  88.55  82.00  83.02
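
The summary figures in this table can be rechecked with a few lines of arithmetic. The sketch below assumes the AVG column is the unweighted mean of the eight task scores and derives the relative GFLOPs reduction from the two reported values; the helper name mean is introduced here for illustration.

```shell
# Recompute the table's summary figures from its per-task scores.
mean() {
    # Arithmetic mean of the arguments, printed with three decimals.
    printf '%s\n' "$@" | awk '{ s += $1 } END { printf "%.3f", s / NR }'
}

lora_avg=$(mean 65.85 84.46 88.73 57.58 92.06 90.62 89.41 83.00)
lonas_avg=$(mean 70.76 88.97 88.28 61.12 93.23 91.21 88.55 82.00)

# Relative GFLOPs reduction of the LoNAS sub-network versus the LoRA baseline.
flops_cut=$(awk 'BEGIN { printf "%.1f", 100 * (11.2 - 8.0) / 11.2 }')

echo "LoRA mean score:  ${lora_avg}"
echo "LoNAS mean score: ${lonas_avg}"
echo "GFLOPs reduction: ${flops_cut}%"
```

Both means agree with the reported AVG values to two decimals, and the reported GFLOPs correspond to a reduction of roughly 29% relative to the LoRA baseline.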

Model Sources

Repository: https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS

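Paper: LoNAS: Elastic Low-Rank Adapters for Efficient Large Language Models (https://aclanthology.org/2024.lrec-main.940)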

Citation

@inproceedings{munoz-etal-2024-lonas,
    title = "{L}o{NAS}: Elastic Low-Rank Adapters for Efficient Large Language Models",
    author = "Munoz, Juan Pablo  and
      Yuan, Jinjie  and
      Zheng, Yi  and
      Jain, Nilesh",
    editor = "Calzolari, Nicoletta  and
      Kan, Min-Yen  and
      Hoste, Veronique  and
      Lenci, Alessandro  and
      Sakti, Sakriani  and
      Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.940",
    pages = "10760--10776",
}

License

Apache-2.0
