gaunernst committed on
Commit
0d85b3c
1 Parent(s): b103adc

Update README.md

Files changed (1)
  1. README.md +64 -0
README.md CHANGED
@@ -1,3 +1,67 @@
1
  ---
2
  license: apache-2.0
3
  ---
1
  ---
2
  license: apache-2.0
3
+ datasets:
4
+ - bookcorpus
5
+ - wikipedia
6
+ language:
7
+ - en
8
  ---
9
+
10
+ # BERT L2-H768 (uncased)
11
+
12
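+ One of the miniature BERT models from [Well-Read Students Learn Better: On the Importance of Pre-training Compact Models](https://arxiv.org/abs/1908.08962) that the Hugging Face team did not convert. This checkpoint (2 layers, hidden size 768) was converted with the original [conversion script](https://github.com/huggingface/transformers/blob/main/src/transformers/models/bert/convert_bert_original_tf_checkpoint_to_pytorch.py).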
13
+
14
+ See the original Google repo: [google-research/bert](https://github.com/google-research/bert)
15
+
16
+ Note: it is not clear whether these checkpoints have undergone knowledge distillation.
17
+
18
+ ## Model variants
19
+
20
+ | |H=128|H=256|H=512|H=768|
21
+ |---|:---:|:---:|:---:|:---:|
22
+ | **L=2** |[2/128 (BERT-Tiny)][2_128]|[2/256][2_256]|[2/512][2_512]|[**2/768**][2_768]|
23
+ | **L=4** |[4/128][4_128]|[4/256 (BERT-Mini)][4_256]|[4/512 (BERT-Small)][4_512]|[4/768][4_768]|
24
+ | **L=6** |[6/128][6_128]|[6/256][6_256]|[6/512][6_512]|[6/768][6_768]|
25
+ | **L=8** |[8/128][8_128]|[8/256][8_256]|[8/512 (BERT-Medium)][8_512]|[8/768][8_768]|
26
+ | **L=10** |[10/128][10_128]|[10/256][10_256]|[10/512][10_512]|[10/768][10_768]|
27
+ | **L=12** |[12/128][12_128]|[12/256][12_256]|[12/512][12_512]|[12/768 (BERT-Base, original)][12_768]|
28
+
29
+ [2_128]: https://huggingface.co/gaunernst/bert-tiny-uncased
30
+ [2_256]: https://huggingface.co/gaunernst/bert-L2-H256-uncased
31
+ [2_512]: https://huggingface.co/gaunernst/bert-L2-H512-uncased
32
+ [2_768]: https://huggingface.co/gaunernst/bert-L2-H768-uncased
33
+ [4_128]: https://huggingface.co/gaunernst/bert-L4-H128-uncased
34
+ [4_256]: https://huggingface.co/gaunernst/bert-mini-uncased
35
+ [4_512]: https://huggingface.co/gaunernst/bert-small-uncased
36
+ [4_768]: https://huggingface.co/gaunernst/bert-L4-H768-uncased
37
+ [6_128]: https://huggingface.co/gaunernst/bert-L6-H128-uncased
38
+ [6_256]: https://huggingface.co/gaunernst/bert-L6-H256-uncased
39
+ [6_512]: https://huggingface.co/gaunernst/bert-L6-H512-uncased
40
+ [6_768]: https://huggingface.co/gaunernst/bert-L6-H768-uncased
41
+ [8_128]: https://huggingface.co/gaunernst/bert-L8-H128-uncased
42
+ [8_256]: https://huggingface.co/gaunernst/bert-L8-H256-uncased
43
+ [8_512]: https://huggingface.co/gaunernst/bert-medium-uncased
44
+ [8_768]: https://huggingface.co/gaunernst/bert-L8-H768-uncased
45
+ [10_128]: https://huggingface.co/gaunernst/bert-L10-H128-uncased
46
+ [10_256]: https://huggingface.co/gaunernst/bert-L10-H256-uncased
47
+ [10_512]: https://huggingface.co/gaunernst/bert-L10-H512-uncased
48
+ [10_768]: https://huggingface.co/gaunernst/bert-L10-H768-uncased
49
+ [12_128]: https://huggingface.co/gaunernst/bert-L12-H128-uncased
50
+ [12_256]: https://huggingface.co/gaunernst/bert-L12-H256-uncased
51
+ [12_512]: https://huggingface.co/gaunernst/bert-L12-H512-uncased
52
+ [12_768]: https://huggingface.co/bert-base-uncased
53
+
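The naming scheme in the table and link list above can be sketched as a small lookup helper. This is only an illustration: `repo_id`, `NAMED_VARIANTS`, `LAYERS`, and `HIDDEN_SIZES` are hypothetical names, not part of any library, and the mapping simply mirrors the links listed in this README.

```python
# Hypothetical helper mirroring the link list above; not part of any library.
LAYERS = (2, 4, 6, 8, 10, 12)
HIDDEN_SIZES = (128, 256, 512, 768)

# Variants that the link list publishes under a nickname instead of L/H.
NAMED_VARIANTS = {
    (2, 128): "gaunernst/bert-tiny-uncased",
    (4, 256): "gaunernst/bert-mini-uncased",
    (4, 512): "gaunernst/bert-small-uncased",
    (8, 512): "gaunernst/bert-medium-uncased",
    (12, 768): "bert-base-uncased",  # original BERT-Base, hosted separately
}


def repo_id(num_layers: int, hidden_size: int) -> str:
    """Return the Hub repo name for a given (L, H) cell of the table."""
    if num_layers not in LAYERS or hidden_size not in HIDDEN_SIZES:
        raise ValueError(f"no variant with L={num_layers}, H={hidden_size}")
    default = f"gaunernst/bert-L{num_layers}-H{hidden_size}-uncased"
    return NAMED_VARIANTS.get((num_layers, hidden_size), default)
```

For example, `repo_id(2, 768)` gives the name of this repo, and `repo_id(4, 256)` resolves to the BERT-Mini nickname.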
54
+ ## Usage
55
+
56
+ For usage, see other BERT model cards, e.g. https://huggingface.co/bert-base-uncased
57
+
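As a rough illustration of the usage pattern described in those model cards, the sketch below loads this checkpoint with the `transformers` Auto classes and returns the final-layer hidden states. It assumes `torch` and `transformers` are installed and that this repo ships tokenizer files alongside the weights; `embed_sentence` and `MODEL_ID` are hypothetical names chosen for the example.

```python
MODEL_ID = "gaunernst/bert-L2-H768-uncased"  # repo name from the variants table


def embed_sentence(text: str, model_id: str = MODEL_ID):
    """Return the final-layer hidden states for `text` as a torch.Tensor."""
    # Imported lazily so that defining this helper does not itself require
    # torch/transformers to be installed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    # Assumption: the repo provides tokenizer files next to the weights.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # disable dropout for inference

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Shape: (batch=1, sequence_length, hidden_size)
    return outputs.last_hidden_state
```

Calling `embed_sentence("Hello world")` downloads the checkpoint on first use. Since this variant has H=768, the last dimension of the returned tensor should be 768.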
58
+ ## Citation
59
+
60
+ ```bibtex
61
+ @article{turc2019,
62
+ title={Well-Read Students Learn Better: On the Importance of Pre-training Compact Models},
63
+ author={Turc, Iulia and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
64
+ journal={arXiv preprint arXiv:1908.08962v2},
65
+ year={2019}
66
+ }
67
+ ```