Sihangli/3D-MoLM

Sihangli committed on Feb 26

Commit 25f7c24 (1 parent: 2ccaa86)

Upload 6 files

Files changed (6)

scibert_scivocab_uncased/.gitattributes ADDED

+*.bin.* filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tar.gz filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
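The patterns above decide which files Git LFS stores as pointers instead of regular blobs. A minimal sketch of that decision, assuming basename glob matching is a close enough approximation of gitattributes semantics for these simple patterns (the helper name `is_lfs_tracked` is hypothetical, not part of Git or Git LFS):

```python
from fnmatch import fnmatchcase

# Patterns copied from the .gitattributes hunk above (duplicates collapsed).
LFS_PATTERNS = [
    "*.bin.*", "*.lfs.*", "*.bin", "*.h5", "*.tflite",
    "*.tar.gz", "*.ot", "*.onnx", "*.msgpack",
]

def is_lfs_tracked(path: str) -> bool:
    """Return True if the file's basename matches any LFS pattern.

    Approximation: patterns without a slash are matched against the
    basename only, which is how Git treats them at any directory depth.
    """
    name = path.rsplit("/", 1)[-1]
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)

for f in ["config.json", "vocab.txt", "pytorch_model.bin", "flax_model.msgpack"]:
    print(f, is_lfs_tracked(f))
```

Under this approximation, only `pytorch_model.bin` and `flax_model.msgpack` from this commit match, which agrees with those two files appearing as LFS pointers in the diff.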

scibert_scivocab_uncased/README.md ADDED

+---
+language: en
+---
+# SciBERT
+This is the pretrained model presented in [SciBERT: A Pretrained Language Model for Scientific Text](https://www.aclweb.org/anthology/D19-1371/), which is a BERT model trained on scientific text.
+The training corpus consists of papers from [Semantic Scholar](https://www.semanticscholar.org): 1.14M papers, 3.1B tokens. We train on the full text of the papers, not just the abstracts.
+SciBERT has its own wordpiece vocabulary (scivocab) that's built to best match the training corpus. We trained cased and uncased versions.
+Available models include:
+* `scibert_scivocab_cased`
+* `scibert_scivocab_uncased`
+The original repo can be found [here](https://github.com/allenai/scibert).
+If using these models, please cite the following paper:
+```
+@inproceedings{beltagy-etal-2019-scibert,
+    title = "SciBERT: A Pretrained Language Model for Scientific Text",
+    author = "Beltagy, Iz  and Lo, Kyle  and Cohan, Arman",
+    booktitle = "EMNLP",
+    year = "2019",
+    publisher = "Association for Computational Linguistics",
+    url = "https://www.aclweb.org/anthology/D19-1371"
+}
+```
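The README does not include a usage snippet. As a hedged sketch, the uploaded directory could be loaded with the Hugging Face `transformers` auto classes, assuming the repository has been cloned locally (with LFS files pulled) and that `transformers` and `torch` are installed. The function names `load_scibert` and `embed` are illustrative, not part of this repository.

```python
from pathlib import Path

def load_scibert(model_dir: str = "scibert_scivocab_uncased"):
    """Load tokenizer and encoder from a local copy of the uploaded directory."""
    path = Path(model_dir)
    if not (path / "config.json").is_file():
        raise FileNotFoundError(f"no config.json found under {path}")
    # Imported lazily so the path check works without transformers installed.
    from transformers import AutoModel, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(str(path))
    model = AutoModel.from_pretrained(str(path))
    model.eval()  # disable dropout for inference
    return tokenizer, model

def embed(text: str, tokenizer, model):
    """Return the final-layer [CLS] vector for one piece of text."""
    import torch
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state[:, 0]
```

A call such as `tokenizer, model = load_scibert()` followed by `embed("The enzyme catalyzes hydrolysis.", tokenizer, model)` would return one vector per input, with width equal to the model's `hidden_size`.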

scibert_scivocab_uncased/config.json ADDED

+{
+  "attention_probs_dropout_prob": 0.1,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 768,
+  "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "layer_norm_eps": 1e-12,
+  "max_position_embeddings": 512,
+  "model_type": "bert",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 12,
+  "pad_token_id": 0,
+  "type_vocab_size": 2,
+  "vocab_size": 31090
+}
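The configuration above is enough to estimate the encoder's size. The sketch below counts parameters for a standard BERT encoder with a pooler layer, assuming the usual layout (word, position, and token-type embeddings; per-layer self-attention and feed-forward blocks with biases and LayerNorm). The function name `count_bert_params` is hypothetical.

```python
def count_bert_params(cfg: dict) -> int:
    """Count parameters of a standard BERT encoder plus pooler."""
    h = cfg["hidden_size"]
    i = cfg["intermediate_size"]
    layer_norm = 2 * h  # scale and bias

    embeddings = (
        cfg["vocab_size"] * h
        + cfg["max_position_embeddings"] * h
        + cfg["type_vocab_size"] * h
        + layer_norm
    )
    attention = 4 * (h * h + h)            # query, key, value, output projections
    feed_forward = (h * i + i) + (i * h + h)
    per_layer = attention + layer_norm + feed_forward + layer_norm
    pooler = h * h + h

    return embeddings + cfg["num_hidden_layers"] * per_layer + pooler

config = {
    "hidden_size": 768,
    "intermediate_size": 3072,
    "max_position_embeddings": 512,
    "num_hidden_layers": 12,
    "type_vocab_size": 2,
    "vocab_size": 31090,
}
n = count_bert_params(config)
print(n, n * 4)  # 109918464 parameters, 439673856 bytes at 4 bytes each
```

If the weights are stored as 32-bit floats (an assumption, since the config does not state a dtype), the resulting 439,673,856 bytes is within a few kilobytes of the `flax_model.msgpack` size recorded in this commit.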

scibert_scivocab_uncased/flax_model.msgpack ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:53d32c1d93bebe3fbc0a20e081d8575defc8d481989f97fb82c0f95f3b38f2c1
+size 439681005

scibert_scivocab_uncased/pytorch_model.bin ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:e492944d88ac97dee6baa547671d3c526a3d067676883efb058311f4e5882e1a
+size 442221694
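The two weight files are committed as Git LFS pointers: a spec version line, a SHA-256 object id, and the byte size of the real file. A minimal sketch of parsing such a pointer and checking a downloaded file against it follows; the helper names `parse_lfs_pointer` and `verify_download` are hypothetical and not part of any Git LFS tooling.

```python
import hashlib

def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of a Git LFS pointer (leading diff '+' tolerated)."""
    fields = {}
    for raw in text.splitlines():
        line = raw.lstrip("+").strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

def verify_download(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Check that a local file matches the pointer's size and hash."""
    hasher = hashlib.new(pointer["algo"])
    total = 0
    with open(path, "rb") as f:
        # Read in chunks so large weight files are never held fully in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]

pointer = parse_lfs_pointer(
    "+version https://git-lfs.github.com/spec/v1\n"
    "+oid sha256:e492944d88ac97dee6baa547671d3c526a3d067676883efb058311f4e5882e1a\n"
    "+size 442221694\n"
)
```

After fetching the real file, `verify_download("scibert_scivocab_uncased/pytorch_model.bin", pointer)` should return `True` only when both the size and the digest match the pointer.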

scibert_scivocab_uncased/vocab.txt ADDED

The diff for this file is too large to render.