Ramon Meffert committed · Commit 9889a50 · Parent(s): 1f08ed2

Update readme

- README.md +60 -76
- README.old.md +93 -0
README.md CHANGED @@ -1,93 +1,77 @@
# NLP FlashCards

## Dependencies

Make sure you have the following tools installed:

- [Poetry](https://python-poetry.org/) for Python package management;
- [Docker](https://www.docker.com/get-started/) for running ElasticSearch.

Then, run the following commands:

```sh
poetry install
docker pull docker.elastic.co/elasticsearch/elasticsearch:8.1.1
docker network create elastic
docker run --name es01 --net elastic -p 9200:9200 -p 9300:9300 -it docker.elastic.co/elasticsearch/elasticsearch:8.1.1
```

After the last command, a password for the `elastic` user should show up in the
terminal output (you might have to scroll up a bit). Copy this password, make a
copy of the `.env.example` file, rename it to `.env`, and replace the
`<password>` placeholder with the copied password.
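The resulting `.env` might look something like this (the variable name below is hypothetical; keep whatever key `.env.example` actually defines):

```
# Hypothetical key name — copy the real one from .env.example
ELASTIC_PASSWORD=password-copied-from-terminal
```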
Next, run the following command **from the root of the repository**:

```sh
docker cp es01:/usr/share/elasticsearch/config/certs/http_ca.crt .
```
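If you want to check that ElasticSearch is reachable (optional; this assumes the default port mapping and the certificate copied above, and will prompt for the `elastic` password):

```sh
curl --cacert http_ca.crt -u elastic https://localhost:9200
```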
## Running

To make sure we're using the dependencies managed by Poetry, run `poetry shell`
before executing any of the following commands. Alternatively, replace any call
like `python file.py` with `poetry run python file.py` (but we suggest the shell
option, since it is much more convenient).

### Training

N/A for now

### Using the QA system

⚠️ **Important** ⚠️ _If you want to run an ElasticSearch query, make sure the
Docker container is running! You can check this by running `docker container
ls`. If your container shows up (it's named `es01` if you followed these
instructions), it's running. If not, you can run `docker start es01` to start
it, or start it from Docker Desktop._

To query the QA system, run any query as follows:

```sh
python query.py "Why can dot product be used as a similarity metric?"
```

By default, the best answer, along with its location in the book, will be
returned. If you want to generate more answers (say, a top 5), you can supply
the `--top=5` option. The default retriever uses [FAISS](https://faiss.ai/), but
you can also use [ElasticSearch](https://www.elastic.co/elastic-stack/) via
the `--retriever=es` option.
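For example, combining both options (this assumes the `es01` container from the setup steps is running):

```sh
python query.py --top=5 --retriever=es "What is the perplexity of a language model?"
```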
### CLI overview

To get an overview of all available options, run `python query.py --help`. The
options are also printed below.

```sh
usage: query.py [-h] [--top int] [--retriever {faiss,es}] str

positional arguments:
  str                   The question to feed to the QA system

options:
  -h, --help            show this help message and exit
  --top int, -t int     The number of answers to retrieve
  --retriever {faiss,es}, -r {faiss,es}
                        The retrieval method to use
```
README.old.md ADDED @@ -0,0 +1,93 @@
# nlp-flashcard-project

## Todo 2

- [ ] Preprocess contexts
  - [ ] Filter out formulas and the like
  - [ ] Split on sentences...?
- [ ] Try more language models
- [ ] Elasticsearch
- [ ] CLI for answering questions

### Extras

- [ ] Huggingface Spaces demo
- [ ] Question generation for fine-tuning
- [ ] Fine-tune the language model

## Todo for the progress meeting

- [ ] Read in the data / prepare the repo
- [ ] Proof of concept with UnifiedQA
- [ ] Standard QA model with the dataset
- [ ] Collect/read papers
- [ ] Look at earlier work, get inspiration for a research direction

## Overview

Most QA systems consist of two components:

- A retriever, which uses the question to fetch the _k_ most relevant pieces of
  context, e.g. with `tf-idf`.
- A model that generates the answer. What exactly you use here depends on the
  type of question answering:
  - For **extractive QA** you use a reader;
  - For **generative QA** you use a generator.

Both are based on a language model.
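As a rough illustration of the retriever half, a minimal tf-idf retriever can be written in plain Python. This is a toy sketch with made-up example data, not the project's actual code:

```python
import math
import re
from collections import Counter

def tokenize(text: str) -> list[str]:
    """Lowercase and keep only alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def tfidf_retrieve(question: str, contexts: list[str], k: int = 2) -> list[str]:
    """Rank contexts by tf-idf cosine similarity to the question; return the top k."""
    docs = [tokenize(c) for c in contexts]
    n = len(docs)
    # Document frequency: in how many contexts does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}

    def vectorize(tokens: list[str]) -> dict[str, float]:
        # Sparse tf-idf vector as a dict; terms unseen in the corpus get weight 0
        tf = Counter(tokens)
        return {term: freq * idf.get(term, 0.0) for term, freq in tf.items()}

    def cosine(a: dict[str, float], b: dict[str, float]) -> float:
        dot = sum(w * b.get(term, 0.0) for term, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    qvec = vectorize(tokenize(question))
    ranked = sorted(contexts, key=lambda c: cosine(qvec, vectorize(tokenize(c))),
                    reverse=True)
    return ranked[:k]

contexts = [
    "Perplexity measures how well a language model predicts a test set.",
    "Docker containers package an application with its dependencies.",
    "ElasticSearch indexes documents for fast full-text retrieval.",
]
print(tfidf_retrieve("What is the perplexity of a language model?", contexts, k=1))
# → ['Perplexity measures how well a language model predicts a test set.']
```

A reader or generator would then run over the retrieved contexts to produce the actual answer.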
## Useful info

- Huggingface QA tutorial: <https://huggingface.co/docs/transformers/tasks/question_answering#finetune-with-tensorflow>
- Overview of open-domain question answering techniques: <https://lilianweng.github.io/posts/2020-10-29-odqa/>

## Base model

So far this is just a retriever that, given a question, fetches the top-k
relevant documents. It does reach high similarity scores for many questions,
but the documents it retrieves are usually not very relevant.

```bash
poetry shell
cd base_model
poetry run python main.py
```

### Example

"What is the perplexity of a language model?"

> Result 1 (score: 74.10):
> Figure 10.17 A sample alignment between sentences in English and French, with
> sentences extracted from Antoine de Saint-Exupery's Le Petit Prince and a
> hypothetical translation. Sentence alignment takes sentences e 1, ..., e n,
> and f 1, ..., f n and finds minimal sets of sentences that are translations
> of each other, including single sentence mappings like (e 1, f 1), (e 4-f 3),
> (e 5-f 4), (e 6-f 6) as well as 2-1 alignments (e 2/e 3, f 2), (e 7/e 8-f 7),
> and null alignments (f 5).
>
> Result 2 (score: 74.23):
> Character or word overlap-based metrics like chrF (or BLEU, etc.) are
> mainly used to compare two systems, with the goal of answering questions like:
> did the new algorithm we just invented improve our MT system? To know if the
> difference between the chrF scores of two MT systems is a significant
> difference, we use the paired bootstrap test, or the similar randomization
> test.
>
> Result 3 (score: 74.43):
> The model thus predicts the class negative for the test sentence.
>
> Result 4 (score: 74.95):
> Translating from languages with extensive pro-drop, like Chinese or Japanese,
> to non-pro-drop languages like English can be difficult since the model must
> somehow identify each zero and recover who or what is being talked about in
> order to insert the proper pronoun.
>
> Result 5 (score: 76.22):
> Similarly, a recent challenge set, the WinoMT dataset (Stanovsky et al., 2019)
> shows that MT systems perform worse when they are asked to translate sentences
> that describe people with non-stereotypical gender roles, like "The doctor
> asked the nurse to help her in the operation".

## Setting up Elasticsearch
|