---
license: mit
datasets:
- omarmomen/babylm_10M
language:
- en
metrics:
- perplexity
library_name: transformers
---
# Model Card for omarmomen/structroberta_sx_final

This model is part of the experiments in the paper "Increasing The Performance of Cognitively Inspired Data-Efficient Language Models via Implicit Structure Building", published at the BabyLM workshop at CoNLL 2023 (https://aclanthology.org/2023.conll-babylm.29/).

<strong>omarmomen/structroberta_sx_final</strong> is a modification of the RoBERTa model that incorporates syntactic inductive bias using an unsupervised parsing mechanism.

This variant places the parser network ahead of all attention blocks and increases the number of convolution layers from 4 to 6.

The model is pretrained on the BabyLM 10M dataset using a custom pretrained RobertaTokenizer (https://huggingface.co/omarmomen/babylm_tokenizer_32k).
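
A minimal usage sketch follows. It assumes the repository ships the custom StructRoBERTa architecture as remote code (hence `trust_remote_code=True`) and exposes a masked-language-modeling head; treat it as a starting point rather than a definitive recipe.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# The custom tokenizer used for pretraining (see link above).
tokenizer = AutoTokenizer.from_pretrained("omarmomen/babylm_tokenizer_32k")

# StructRoBERTa is not a stock transformers class, so loading is assumed
# to require trust_remote_code=True.
model = AutoModelForMaskedLM.from_pretrained(
    "omarmomen/structroberta_sx_final",
    trust_remote_code=True,
)
model.eval()

# Quick sanity check: predict a masked token.
text = f"The children played in the {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the mask position and decode the top prediction.
mask_positions = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_ids = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.decode(predicted_ids))
```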