---
license: gpl-3.0
language:
- en
library_name: transformers
---
This model uses the LTG-BERT architecture.

The model was trained on a combination of the BabyLM Dataset, the TinyStories Dataset, and generated data, in accordance with the rules of the Strict track and its 100M-word budget.

The model was trained with a sequence length of 128 tokens.

Hyperparameters and evaluation scores will follow in a subsequent update.
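Since the card targets the `transformers` library, a minimal usage sketch is shown below. The repository id is a placeholder, and `trust_remote_code=True` is assumed to be required because LTG-BERT is a custom architecture whose modeling code ships with the checkpoint.

```python
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Placeholder: replace with this repository's id on the Hub.
model_id = "your-username/this-model"

# LTG-BERT is a custom architecture, so the checkpoint's own modeling code
# is loaded with trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id, trust_remote_code=True)

# Example: predict a masked token.
text = f"The children played in the {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)

# Locate the mask position and decode the highest-scoring token.
mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero()[0, 1]
predicted_id = outputs.logits[0, mask_index].argmax(-1)
print(tokenizer.decode([predicted_id.item()]))
```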