datasets:
- roneneldan/TinyStories
metrics:
- babylm
Base model: RoBERTa
Configs:
  Vocab size: 10,000
  Hidden size: 512
  Max position embeddings: 512
  Number of layers: 2
  Number of heads: 4
  Window size: 256
  Intermediate size: 1024
Results:
- Task: glue
  Score: 57.69
  Confidence Interval: [56.75, 58.73]
- Task: blimp
  Score: 59.25
  Confidence Interval: [58.78, 59.65]
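
As a rough sanity check on the configuration listed above, the sketch below derives the per-head dimension and an approximate parameter count. It is only an estimate under stated assumptions: a textbook Transformer encoder layer, no token-type embeddings, no embedding LayerNorm, and no task head. The key names follow Hugging Face `RobertaConfig` attribute names, except `window_size`, which is a hypothetical name used here for the card's "Window size" entry; the helper functions `head_dim` and `approx_param_count` are likewise illustrative and not part of any library.

```python
# Configuration values copied from the card. Key names mirror RobertaConfig
# attributes, except `window_size`, which is a hypothetical name (assumption).
config = {
    "vocab_size": 10_000,
    "hidden_size": 512,
    "max_position_embeddings": 512,
    "num_hidden_layers": 2,
    "num_attention_heads": 4,
    "window_size": 256,
    "intermediate_size": 1024,
}


def head_dim(cfg):
    """Per-head dimension; the hidden size must split evenly across heads."""
    if cfg["hidden_size"] % cfg["num_attention_heads"] != 0:
        raise ValueError("hidden_size must be divisible by num_attention_heads")
    return cfg["hidden_size"] // cfg["num_attention_heads"]


def approx_param_count(cfg):
    """Rough parameter estimate for the embeddings plus the encoder layers.

    Assumes each layer has Q, K, V and output projections (with biases),
    a two-layer feed-forward block (with biases), and two LayerNorms.
    """
    h, i = cfg["hidden_size"], cfg["intermediate_size"]
    # Token embeddings plus learned position embeddings.
    embeddings = cfg["vocab_size"] * h + cfg["max_position_embeddings"] * h
    # Four h-by-h projections, each with a bias vector.
    attention = 4 * (h * h + h)
    # Up-projection and down-projection, each with a bias vector.
    feed_forward = (h * i + i) + (i * h + h)
    # Two LayerNorms per layer, each with a weight and a bias vector.
    layer_norms = 2 * (2 * h)
    per_layer = attention + feed_forward + layer_norms
    return embeddings + cfg["num_hidden_layers"] * per_layer


if __name__ == "__main__":
    print("head dim:", head_dim(config))
    print("approx. parameters:", approx_param_count(config))
```

Under these assumptions, the token embedding table accounts for more than half of the estimate, which is typical for a two-layer model with a 10,000-entry vocabulary.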