Update README.md
README.md
@@ -98,6 +98,8 @@ Masked Language Modeling objective with 15% masked token ratio.
 ### Preprocessing
 
 Tokenize `data["train"]["fen"]` with max-length padding to 200 tokens with default `distilbert-base-cased` tokenizer.
+Inefficient: most of the tokenizer's vocabulary is never observed in FEN, wasting embedding parameters.
+The model's positional-embedding size and the preprocessing sequence length lead to lots of padding and wasted parameters, since FENs should be shorter than 90 characters.
 Experiments with reduced max-length in tokenization show performance gains.
 
 ### Speeds, Sizes, Times
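For context, a minimal sketch of the preprocessing step above, plus a rough check of the vocabulary-waste point the new lines call out. The dataset identifier is hypothetical; the README only references `data["train"]["fen"]`:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Default DistilBERT tokenizer named in the README.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased")

# Hypothetical dataset id; the README only shows data["train"]["fen"].
data = load_dataset("user/chess-fen")

def tokenize(batch):
    # Max-length padding to 200 tokens, as the README currently describes.
    # FENs are shorter than 90 characters, so most positions are padding;
    # a smaller max_length (e.g. 96) is the reduction the README says helps.
    return tokenizer(batch["fen"], padding="max_length", max_length=200,
                     truncation=True)

tokenized = data.map(tokenize, batched=True)

# Rough check of the vocabulary-waste claim: count how many of the
# tokenizer's ~29k entries ever appear in the tokenized training FENs.
seen = set()
for ids in tokenized["train"]["input_ids"]:
    seen.update(ids)
print(f"{len(seen)} / {tokenizer.vocab_size} vocab entries observed")
```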