VMware
/

tinyroberta-mrqa

Teja-Gollapudi committed on Feb 23, 2023

Commit

c51d22f

•

1 Parent(s): 1477167

Update README.md

README.md +78 -3

@@ -1,3 +1,78 @@
----
-license: apache-2.0
----

+# tinyroberta-mrqa
+This is the *distilled* version of the [VMware/roberta-large-mrqa](https://huggingface.co/VMware/roberta-large-mrqa) model. Its prediction quality is comparable to that of the base model, and it runs twice as fast.
+## Overview
+**Language model:** tinyroberta-mrqa
+**Language:** English
+**Downstream-task:** Extractive QA
+**Training data:** MRQA
+**Eval data:** MRQA
+## Hyperparameters
+### Distillation Hyperparameters
+```
+batch_size = 96
+n_epochs = 4
+base_LM_model = "deepset/tinyroberta-squad2-step1"
+max_seq_len = 384
+learning_rate = 3e-5
+lr_schedule = LinearWarmup
+warmup_proportion = 0.2
+doc_stride = 128
+max_query_length = 64
+distillation_loss_weight = 0.75
+temperature = 1.5
+teacher = "VMware/roberta-large-mrqa"
+```
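The `max_seq_len` and `doc_stride` settings above control how a long context is split into overlapping windows. The following is a minimal sketch of that idea, not the training toolkit's actual code. It assumes that `doc_stride` is the step between window starts (some toolkits instead define it as the overlap) and that the context budget is `max_seq_len` minus `max_query_length` minus 4 special tokens; the function name `split_into_windows` and the budget formula are illustrative.

```python
def split_into_windows(n_tokens, window_len, stride):
    """Return (start, end) token offsets of overlapping windows that cover n_tokens."""
    if window_len <= 0 or stride <= 0:
        raise ValueError("window_len and stride must be positive")
    windows = []
    start = 0
    while True:
        end = min(start + window_len, n_tokens)
        windows.append((start, end))
        if end >= n_tokens:
            break  # the last window reaches the end of the context
        start += stride
    return windows


max_seq_len = 384
max_query_length = 64
doc_stride = 128

# Assumption: 4 special tokens around the question/context pair.
context_budget = max_seq_len - max_query_length - 4  # 316 tokens per window

# A hypothetical 700-token context is covered by four overlapping windows.
print(split_into_windows(700, context_budget, doc_stride))
```

With these assumptions every context token falls into at least one window, so an answer near a window boundary is still seen whole in a neighbouring window.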
+### Fine-tuning Hyperparameters
+We fine-tuned the model on the MRQA training set.
+```
+learning_rate = 1e-5
+num_train_epochs = 3
+weight_decay = 0.01
+per_device_train_batch_size = 16
+n_gpus = 3
+```
+## Distillation
+This model is inspired by deepset/tinyroberta-squad2.
+We start from the base checkpoint [deepset/roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2) and perform further task-specific prediction-layer distillation, using [VMware/roberta-large-mrqa](https://huggingface.co/VMware/roberta-large-mrqa) as the teacher.
+We then fine-tune the distilled model on MRQA.
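Prediction-layer distillation is commonly implemented as a weighted mix of a soft-target loss (student vs. teacher logits, softened by a temperature) and the usual hard-label loss. The sketch below shows that common formulation in plain Python using `distillation_loss_weight = 0.75` and `temperature = 1.5` from the hyperparameters. The exact loss used for this model is not stated in the card, so the KL divergence, the temperature-squared scaling, and all function names here are assumptions. For QA, such a loss would be applied to the start and end logits separately.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cross_entropy(logits, target_index):
    """Hard-label cross-entropy for a single example."""
    return -math.log(softmax(logits)[target_index])


def distillation_loss(student_logits, teacher_logits, target_index,
                      weight=0.75, temperature=1.5):
    """Weighted sum of soft-target KL loss and hard-label cross-entropy."""
    soft = kl_divergence(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    hard = cross_entropy(student_logits, target_index)
    # Assumption: scale the soft loss by temperature**2, a common convention.
    return weight * temperature ** 2 * soft + (1 - weight) * hard


# Hypothetical start-position logits over a five-token passage.
teacher = [0.2, 4.0, 0.5, 0.1, 0.3]
student = [0.5, 2.0, 1.0, 0.2, 0.4]
print(distillation_loss(student, teacher, target_index=1))
```

When the student matches the teacher exactly, the soft term vanishes and only the down-weighted hard-label term remains.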
+## Usage
+### In Transformers
+```python
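+from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
+model_name = "VMware/tinyroberta-mrqa"
+# a) Get predictions
+nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
+# Example inputs for illustration; replace them with your own question and context.
+QA_input = {
+    'question': 'Which model is tinyroberta-mrqa distilled from?',
+    'context': 'tinyroberta-mrqa is the distilled version of the VMware/roberta-large-mrqa model.'
+}
+res = nlp(QA_input)  # dict with 'score', 'start', 'end' and 'answer' keys
+# b) Load model & tokenizer
+model = AutoModelForQuestionAnswering.from_pretrained(model_name)
+tokenizer = AutoTokenizer.from_pretrained(model_name)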
+```
+## Performance
+We evaluated the model on the MRQA dev and test sets using SQuAD metrics (exact match and F1).
+```
+eval exact match: 69.2
+eval f1 score: 79.6
+test exact match: 52.8
+test f1 score: 63.4
+```