Model Card for omarmomen/structroberta_sx_final

This model is part of the experiments in the paper "Increasing The Performance of Cognitively Inspired Data-Efficient Language Models via Implicit Structure Building" (https://aclanthology.org/2023.conll-babylm.29/), published at the BabyLM workshop at CoNLL 2023.

omarmomen/structroberta_sx_final is a modification of the RoBERTa model that incorporates a syntactic inductive bias through an unsupervised parsing mechanism.

This model variant places the parser network ahead of all attention blocks and increases the number of convolution layers from 4 to 6.

The model is pretrained on the BabyLM 10M dataset with a custom pretrained RobertaTokenizer (https://huggingface.co/omarmomen/babylm_tokenizer_32k).
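The following is a minimal loading sketch, not an official usage snippet. It assumes the `transformers` library is installed, that the checkpoint can be loaded through `AutoModelForMaskedLM`, and that the repository ships custom modeling code (hence `trust_remote_code=True`); none of these details are confirmed by this card. The helper name `load_model_and_tokenizer` is hypothetical.

```python
MODEL_ID = "omarmomen/structroberta_sx_final"
TOKENIZER_ID = "omarmomen/babylm_tokenizer_32k"


def load_model_and_tokenizer(model_id=MODEL_ID, tokenizer_id=TOKENIZER_ID):
    """Download the tokenizer and model from the Hugging Face Hub."""
    # Imported lazily so the constants above are usable without transformers.
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
    # Assumption: the architecture is custom, so remote code must be trusted.
    # Drop trust_remote_code if the checkpoint loads without it.
    model = AutoModelForMaskedLM.from_pretrained(model_id, trust_remote_code=True)
    return model, tokenizer


if __name__ == "__main__":
    model, tokenizer = load_model_and_tokenizer()
    inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
    outputs = model(**inputs)
    # The output container depends on the custom model code, so only its type is shown.
    print(type(outputs))
```

Because `trust_remote_code=True` executes code from the model repository, it is worth reviewing that code before running the script.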

https://arxiv.org/abs/2310.20589


Paper: Increasing The Performance of Cognitively Inspired Data-Efficient Language Models via Implicit Structure Building (arXiv 2310.20589, published Oct 31, 2023)