rongzhangibm
committed on
Commit
•
cdc8e58
1
Parent(s):
e84abea
Create README.md
README.md
ADDED
@@ -0,0 +1,93 @@
---
license: apache-2.0
tags:
- MRC
- TyDiQA
- Natural Questions
- SQuAD
- xlm-roberta-large
language:
- multilingual
---
*Task*: MRC (machine reading comprehension)

# Model description

An XLM-RoBERTa Large reading comprehension model trained on a combination of the TyDi QA, Natural Questions (NQ), and SQuAD v1 datasets, starting from a fine-tuned [TyDi xlm-roberta-large](https://huggingface.co/PrimeQA/tydiqa-primary-task-xlm-roberta-large) model.

## Intended uses & limitations

You can use the raw model for reading comprehension. Biases present in the underlying pretrained language model, xlm-roberta-large, may carry over to this fine-tuned model.

## Usage

You can use this model directly with the [PrimeQA](https://github.com/primeqa/primeqa) reading comprehension pipeline; see the [squad.ipynb](https://github.com/primeqa/primeqa/blob/main/notebooks/mrc/squad.ipynb) notebook for an example.

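The notebook is the authoritative reference for running the model. As a rough illustration of what an extractive reader typically does with its outputs, the sketch below picks the answer span whose start and end scores sum highest. It is not PrimeQA's implementation: the `best_span` helper, the toy tokens, and the logit values are all hypothetical, and it assumes the model emits one start score and one end score per token.

```python
from typing import List, Tuple


def best_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,
) -> Tuple[int, int, float]:
    """Return (start, end, score) for the highest-scoring span.

    A span is scored as start_logits[start] + end_logits[end], subject to
    start <= end and a span length of at most max_answer_len tokens.
    """
    if not start_logits or len(start_logits) != len(end_logits):
        raise ValueError("start_logits and end_logits must be non-empty and equal in length")

    best = (0, 0, float("-inf"))
    for start, start_score in enumerate(start_logits):
        # Only consider ends at or after the start, within the length limit.
        last_end = min(start + max_answer_len, len(end_logits))
        for end in range(start, last_end):
            score = start_score + end_logits[end]
            if score > best[2]:
                best = (start, end, score)
    return best


# Toy passage tokens and made-up per-token scores (hypothetical values,
# not produced by the model described in this card).
tokens = ["The", "reader", "is", "based", "on", "XLM-RoBERTa", "Large", "."]
start_logits = [0.1, -0.4, -1.0, -0.8, 0.0, 3.2, 0.5, -2.0]
end_logits = [-1.0, -0.6, -1.2, -0.9, -0.3, 1.1, 2.9, -1.5]

s, e, score = best_span(start_logits, end_logits)
answer = " ".join(tokens[s : e + 1])
print(answer)
```

The exhaustive double loop is fine for a single passage window; real pipelines usually restrict the search to the top-k start and end candidates and also handle subword tokens and unanswerable questions, which this sketch leaves out.
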
### BibTeX entry and citation info

```bibtex
@article{kwiatkowski-etal-2019-natural,
    title = "Natural Questions: A Benchmark for Question Answering Research",
    author = "Kwiatkowski, Tom and
      Palomaki, Jennimaria and
      Redfield, Olivia and
      Collins, Michael and
      Parikh, Ankur and
      Alberti, Chris and
      Epstein, Danielle and
      Polosukhin, Illia and
      Devlin, Jacob and
      Lee, Kenton and
      Toutanova, Kristina and
      Jones, Llion and
      Kelcey, Matthew and
      Chang, Ming-Wei and
      Dai, Andrew M. and
      Uszkoreit, Jakob and
      Le, Quoc and
      Petrov, Slav",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "7",
    year = "2019",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q19-1026",
    doi = "10.1162/tacl_a_00276",
    pages = "452--466",
}
```

```bibtex
@article{2016arXiv160605250R,
    author = {{Rajpurkar}, Pranav and {Zhang}, Jian and {Lopyrev},
      Konstantin and {Liang}, Percy},
    title = "{SQuAD: 100,000+ Questions for Machine Comprehension of Text}",
    journal = {arXiv e-prints},
    year = 2016,
    eid = {arXiv:1606.05250},
    pages = {arXiv:1606.05250},
    archivePrefix = {arXiv},
    eprint = {1606.05250},
}
```

```bibtex
@article{clark-etal-2020-tydi,
    title = "{T}y{D}i {QA}: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages",
    author = "Clark, Jonathan H. and
      Choi, Eunsol and
      Collins, Michael and
      Garrette, Dan and
      Kwiatkowski, Tom and
      Nikolaev, Vitaly and
      Palomaki, Jennimaria",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "8",
    year = "2020",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/2020.tacl-1.30",
    doi = "10.1162/tacl_a_00317",
    pages = "454--470",
}
```