Fractalego committed on
Commit
c67dfe8
0 Parent(s):

initial commit

.gitattributes ADDED
@@ -0,0 +1,17 @@
+ *.bin.* filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tar.gz filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
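As a rough illustration of which files these 17 patterns select, the sketch below matches file names against them with Python's `fnmatch`. It is only an approximation: real gitattributes matching follows Git's own rules, and the helper name `is_lfs_tracked` is made up for this example.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the .gitattributes file added in this commit.
LFS_PATTERNS = [
    "*.bin.*", "*.lfs.*", "*.bin", "*.h5", "*.tflite", "*.tar.gz",
    "*.ot", "*.onnx", "*.arrow", "*.ftz", "*.joblib", "*.model",
    "*.msgpack", "*.pb", "*.pt", "*.pth", "*tfevents*",
]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: slash-free patterns are matched against the basename."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)


# File names taken from this commit.
for candidate in ["pytorch_model.bin", "config.json", "vocab.txt"]:
    print(candidate, is_lfs_tracked(candidate))
```

Under this approximation only `pytorch_model.bin` is routed through LFS, which is consistent with that file appearing in the commit as a three-line pointer rather than as binary content.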
README.MD ADDED
@@ -0,0 +1,69 @@
+ ## Introduction
+ Code for the paper [Exploring the zero-shot limit of FewRel](https://www.aclweb.org/anthology/2020.coling-main.124). This repository implements a zero-shot relation extractor.
+
+ ## Dataset
+ The FewRel 1.0 dataset was introduced in the paper
+ [FewRel: A Large-Scale Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation](https://www.aclweb.org/anthology/D18-1514.pdf)
+ and is available [here](https://github.com/thunlp/FewRel).
+
+ ## Run the Extractor from the notebook
+ An example of relation extraction is in this [notebook](/notebooks/extractor_examples.ipynb).
+ The extractor needs a list of candidate relations in English:
+ ```python
+ relations = ['noble title', 'founding date', 'occupation of a person']
+ extractor = RelationExtractor(model, tokenizer, relations)
+ ```
+ The model then ranks the candidate surface forms by its confidence that each relation
+ connects the two entities in the text:
+ ```python
+ extractor.rank(text='John Smith received an OBE', head='John Smith', tail='OBE')
+
+ [('noble title', 0.9690611883997917),
+ ('occupation of a person', 0.0012609362602233887),
+ ('founding date', 0.00024014711380004883)]
+ ```
+
+ ## Training
+ This repository contains four training scripts, one for each of the four models in the paper:
+ ```bash
+ train_bert_large_with_squad.py
+ train_bert_large_without_squad.py
+ train_distillbert_with_squad.py
+ train_distillbert_without_squad.py
+ ```
+
+ ## Validation
+ There are also four validation scripts:
+ ```bash
+ test_bert_large_with_squad.py
+ test_bert_large_without_squad.py
+ test_distillbert_with_squad.py
+ test_distillbert_without_squad.py
+ ```
+
+ The results reported in the paper are:
+
+ | Model                   | 0-shot 5-way | 0-shot 10-way |
+ |-------------------------|--------------|---------------|
+ | (1) DistilBERT          | 70.1±0.5     | 55.9±0.6      |
+ | (2) BERT Large          | 80.8±0.4     | 69.6±0.5      |
+ | (3) DistilBERT + SQuAD  | 81.3±0.4     | 70.0±0.2      |
+ | (4) BERT Large + SQuAD  | 86.0±0.6     | 76.2±0.4      |
+
+ ## Cite as
+ ```bibtex
+ @inproceedings{cetoli-2020-exploring,
+ title = "Exploring the zero-shot limit of {F}ew{R}el",
+ author = "Cetoli, Alberto",
+ booktitle = "Proceedings of the 28th International Conference on Computational Linguistics",
+ month = dec,
+ year = "2020",
+ address = "Barcelona, Spain (Online)",
+ publisher = "International Committee on Computational Linguistics",
+ url = "https://www.aclweb.org/anthology/2020.coling-main.124",
+ doi = "10.18653/v1/2020.coling-main.124",
+ pages = "1447--1451",
+ abstract = "This paper proposes a general purpose relation extractor that uses Wikidata descriptions to represent the relation{'}s surface form. The results are tested on the FewRel 1.0 dataset, which provides an excellent framework for training and evaluating the proposed zero-shot learning system in English. This relation extractor architecture exploits the implicit knowledge of a language model through a question-answering approach.",
+ }
+ ```
+
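The ranking step described in the README's "Run the Extractor from the notebook" section can be pictured as scoring every candidate relation and sorting by score. The sketch below is not the repository's `RelationExtractor`: `rank_relations` and `toy_score` are hypothetical names, and the toy scorer merely stands in for the model's question-answering confidence.

```python
from typing import Callable, List, Tuple


def rank_relations(
    relations: List[str],
    score_fn: Callable[[str], float],
) -> List[Tuple[str, float]]:
    """Score each candidate relation and return (relation, score) pairs, best first."""
    scored = [(relation, score_fn(relation)) for relation in relations]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical stand-in for the model: scores are rounded from the README example.
TOY_SCORES = {
    "noble title": 0.969,
    "occupation of a person": 0.0013,
    "founding date": 0.0002,
}


def toy_score(relation: str) -> float:
    """Look up a fixed score; a real scorer would query the QA model instead."""
    return TOY_SCORES.get(relation, 0.0)


relations = ["noble title", "founding date", "occupation of a person"]
ranking = rank_relations(relations, toy_score)
print(ranking)
```

Passing the scorer in as a function keeps the sorting logic independent of how the scores are produced, so the same helper would work with any model that returns one number per candidate relation.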
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "bert-large-uncased-whole-word-masking-finetuned-squad",
+   "architectures": [
+     "BertForQuestionAnswering"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "layer_norm_eps": 1e-12,
+   "max_position_embeddings": 512,
+   "model_type": "bert",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 0,
+   "position_embedding_type": "absolute",
+   "transformers_version": "4.9.1",
+   "type_vocab_size": 2,
+   "use_cache": true,
+   "vocab_size": 30522
+ }
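To give a feel for what these dimensions imply, the sketch below derives the per-head dimension and a rough parameter count from a subset of the config values. The counting formula is an assumption: it follows the standard BERT layer layout with a two-output question-answering head and no pooler, and `estimate_parameters` is a name invented for this example.

```python
import json

# Values copied from the config.json added in this commit (subset only).
CONFIG_JSON = """
{
  "hidden_size": 1024,
  "intermediate_size": 4096,
  "max_position_embeddings": 512,
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "type_vocab_size": 2,
  "vocab_size": 30522
}
"""


def estimate_parameters(cfg: dict) -> int:
    """Rough count assuming the standard BERT layout, a 2-output QA head, no pooler."""
    h, ff = cfg["hidden_size"], cfg["intermediate_size"]
    layer_norm = 2 * h  # scale and bias
    embeddings = (
        cfg["vocab_size"] + cfg["max_position_embeddings"] + cfg["type_vocab_size"]
    ) * h + layer_norm
    attention = 4 * (h * h + h) + layer_norm  # Q, K, V and output projections
    feed_forward = (h * ff + ff) + (ff * h + h) + layer_norm
    qa_head = 2 * h + 2  # start and end logits
    return embeddings + cfg["num_hidden_layers"] * (attention + feed_forward) + qa_head


config = json.loads(CONFIG_JSON)
head_dim = config["hidden_size"] // config["num_attention_heads"]
n_params = estimate_parameters(config)
print(f"head dimension: {head_dim}")
print(f"estimated parameters: {n_params / 1e6:.0f}M")
```

At four bytes per float32 parameter, an estimate of this magnitude is in the same range as the 1341556197-byte `pytorch_model.bin` recorded in this commit, although the exact checkpoint contents may differ from the assumed layout.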
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c1dc151ec0572af0e410699a57084c0ca32f0ae81765fc4ea63fa75a7f68a6b5
+ size 1341556197
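The three lines above are a Git LFS pointer, not the model weights themselves. As a minimal sketch of how such a pointer can be read, the snippet below splits each `key value` line into a dictionary; `parse_lfs_pointer` is a hypothetical helper written for this example and performs no validation against the LFS specification.

```python
from typing import Dict

# Pointer text copied from the pytorch_model.bin entry in this commit.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:c1dc151ec0572af0e410699a57084c0ca32f0ae81765fc4ea63fa75a7f68a6b5
size 1341556197
"""


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer file into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.partition(" ")
            fields[key] = value
    return fields


pointer = parse_lfs_pointer(POINTER_TEXT)
hash_algorithm, _, digest = pointer["oid"].partition(":")
size_bytes = int(pointer["size"])
print(hash_algorithm, len(digest), f"{size_bytes / 2**30:.2f} GiB")
```

The `oid` identifies the real file by its content hash and `size` gives its length in bytes, so a client can fetch the weights from LFS storage and check that the download is complete.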
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+ {"unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]"}
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"do_lower_case": true, "do_basic_tokenize": true, "never_split": null, "unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "tokenize_chinese_chars": true, "strip_accents": null, "model_max_length": 512, "special_tokens_map_file": null, "tokenizer_file": "/home/alce/.cache/huggingface/transformers/9b7535fe1c0da28aa7cc66b7f34529d984f535c401be8352f6adeb25f7870def.7f2721073f19841be16f41b0a70b600ca6b880c8f3df6f3535cbc704371bdfa4", "name_or_path": "bert-large-uncased-whole-word-masking-finetuned-squad", "tokenizer_class": "BertTokenizer"}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff