thatdramebaazguy committed on
Commit
e074167
1 Parent(s): 09894e3

update model card

This is roberta-base with domain-adaptive pretraining on movie corpora, then fine-tuned for the NER task on the MIT Movie dataset, and finally given a new head for the SQuAD task. The result is a QA model that can answer questions in the movie domain, with additional signal coming from a different task (NER, i.e. task transfer).
https://huggingface.co/thatdramebaazguy/movie-roberta-base was used as the MovieRoberta.

---
datasets:
- imdb  # movie corpus for domain-adaptive pretraining
- cornell_movie_dialogue
- MIT Movie  # NER dataset

language:
- en

thumbnail:

tags:
- roberta
- roberta-base
- question-answering
- qa
- movies

license: cc-by-4.0

---
# roberta-base for QA

```python
from transformers import pipeline

model_name = "thatdramebaazguy/movie-roberta-MITmovie-squad"
qa = pipeline(task="question-answering", model=model_name, tokenizer=model_name, revision="v1.0")
```

## Overview
**Language model:** roberta-base
**Language:** English
**Downstream-task:** NER --> QA
**Training data:** imdb, polarity movie data, cornell_movie_dialogue, 25mlens movie names, MIT Movie, SQuADv1
**Eval data:** MoviesQA (from https://github.com/ibm-aur-nlp/domain-specific-QA)
**Infrastructure:** 4x Tesla V100
**Code:** See [example](https://github.com/adityaarunsinghal/Domain-Adaptation/blob/master/scripts/shell_scripts/movieR_NER_squad.sh)

## Hyperparameters
```
Num examples = 88567
Num Epochs = 3
Instantaneous batch size per device = 32
Total train batch size (w. parallel, distributed & accumulation) = 128
Gradient Accumulation steps = 1
Total optimization steps = 119182
eval_loss = 1.6153
eval_samples = 20573
perplexity = 5.0296
learning_rate = 5e-05
n_gpu = 4
```
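The listed batch-size figures are consistent with each other; a minimal sketch of the arithmetic (per-device batch size × number of GPUs × gradient-accumulation steps gives the total train batch size):

```python
per_device_batch = 32   # "Instantaneous batch size per device"
n_gpu = 4               # "n_gpu"
grad_accum_steps = 1    # "Gradient Accumulation steps"

# Effective train batch size across all devices and accumulation steps
total_batch = per_device_batch * n_gpu * grad_accum_steps
print(total_batch)  # 128, matching "Total train batch size"
```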
## Performance

### Eval on MoviesQA
eval_samples = 10790
exact_match = 83.0274
f1 = 90.1615
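The exact-match and F1 numbers above are the standard SQuAD-style metrics. A minimal sketch of how they are computed per example (the normalization and token-overlap logic follow the usual SQuAD v1 evaluation recipe; the example strings below are illustrative, not from the eval set):

```python
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace (SQuAD-style)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def f1(prediction, gold):
    """Token-level F1 between normalized prediction and gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The Godfather", "godfather"))       # 1.0 after normalization
print(f1("Francis Ford Coppola", "Coppola"))           # 0.5 (precision 1/3, recall 1)
```

The reported scores are these per-example values averaged over all eval samples (taking the max over gold answers when a question has several).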

GitHub Repo:
- [Domain-Adaptation Project](https://github.com/adityaarunsinghal/Domain-Adaptation/)

---