---
datasets:
- nilq/babylm-10M
language:
- en
---

- GPT-2 model submitted by team CLAUSE Bielefeld to the BabyLM Challenge 2023
- implements a very naive curriculum learning approach inspired by usage-based linguistics: training examples are ordered according to complexity measures from research on child-directed speech (please consult the paper for more information); a minimal sketch of this ordering idea is shown below

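The exact complexity measures are the ones described in the paper. The sketch below only illustrates the general curriculum idea under a simplifying assumption: it uses utterance length in words as a hypothetical stand-in for the actual complexity measures and sorts training examples from simple to complex before training.

```python
# Minimal, illustrative sketch of curriculum ordering.
# NOTE: the word-count proxy below is a hypothetical stand-in for the
# child-directed-speech complexity measures described in the paper.

def complexity(example: str) -> float:
    """Hypothetical complexity proxy: number of whitespace-separated tokens."""
    return float(len(example.split()))

def order_by_complexity(examples: list[str]) -> list[str]:
    """Return training examples sorted from simplest to most complex."""
    return sorted(examples, key=complexity)

if __name__ == "__main__":
    utterances = [
        "where did the little dog go",
        "look",
        "do you want the red ball or the blue ball",
    ]
    for utterance in order_by_complexity(utterances):
        print(utterance)
```
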
Citation:
```bibtex
@inproceedings{bunzeck-zarriess-2023-gpt,
    title = "{GPT}-wee: How Small Can a Small Language Model Really Get?",
    author = "Bunzeck, Bastian  and
      Zarrie{\ss}, Sina",
    editor = "Warstadt, Alex  and
      Mueller, Aaron  and
      Choshen, Leshem  and
      Wilcox, Ethan  and
      Zhuang, Chengxu  and
      Ciro, Juan  and
      Mosquera, Rafael  and
      Paranjabe, Bhargavi  and
      Williams, Adina  and
      Linzen, Tal  and
      Cotterell, Ryan",
    booktitle = "Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.conll-babylm.2",
    doi = "10.18653/v1/2023.conll-babylm.2",
    pages = "35--46",
}
```