omarmomen committed on
Commit
235895e
1 Parent(s): 59d81d5

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -17,4 +17,6 @@ The paper titled "Increasing The Performance of Cognitively Inspired Data-Effici
 
 This model variant places the parser network ahead of all attention blocks, and increases the number of convolution layers from 4 to 6.
 
-The model is pretrained on the BabyLM 10M dataset using a custom pretrained RobertaTokenizer (https://huggingface.co/omarmomen/babylm_tokenizer_32k).
+The model is pretrained on the BabyLM 10M dataset using a custom pretrained RobertaTokenizer (https://huggingface.co/omarmomen/babylm_tokenizer_32k).
+
+https://arxiv.org/abs/2310.20589
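
For context, the tokenizer referenced in this change lives at https://huggingface.co/omarmomen/babylm_tokenizer_32k. Below is a minimal sketch of loading it with the Hugging Face `transformers` library; this snippet is not part of the commit, and it assumes the repo ships standard tokenizer files loadable via `AutoTokenizer` (the README describes it as a custom pretrained RobertaTokenizer).

```python
# Minimal sketch (illustration only, not part of this commit): load the
# custom BabyLM tokenizer referenced in the README.
from transformers import AutoTokenizer

# Assumes the repo exposes standard tokenizer files.
tokenizer = AutoTokenizer.from_pretrained("omarmomen/babylm_tokenizer_32k")

# Quick sanity check on a toy sentence.
print(tokenizer.tokenize("The parser network sits ahead of the attention blocks."))
```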