Dani committed · ef9fd15
Parent(s): 255b995

fixed to use the MaskedLM version of model

Files changed:
- README.md +4 -8
- pytorch_model.bin +2 -2
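The commit message says the checkpoint was switched to the MaskedLM variant of the model. As a hedged illustration only (not the author's actual fix), this kind of change usually comes down to re-saving the reduced weights through `AutoModelForMaskedLM` instead of the bare encoder, so the exported `pytorch_model.bin` keeps the masked-language-modeling head that fill-mask inference needs; the local paths below are placeholders.

```python
# Hypothetical sketch: export the reduced checkpoint together with its MLM head.
# "./reduced-model" and "./reduced-model-mlm" are placeholder paths, not from this commit.
from transformers import AutoModelForMaskedLM, AutoTokenizer

model = AutoModelForMaskedLM.from_pretrained("./reduced-model")  # loads encoder + MLM head
tokenizer = AutoTokenizer.from_pretrained("./reduced-model")

model.save_pretrained("./reduced-model-mlm")      # writes pytorch_model.bin with the head included
tokenizer.save_pretrained("./reduced-model-mlm")
```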
README.md CHANGED
@@ -4,14 +4,15 @@ license: apache-2.0
 datasets:
 - wikipedia
 widget:
-- text: "El
+- text: "El español es un idioma muy [MASK] en el mundo."
 ---

 # DistilBERT base multilingual model Spanish subset (cased)

-This model is the Spanish extract of `distilbert-base-multilingual-cased
+This model is the Spanish extract of `distilbert-base-multilingual-cased` (https://huggingface.co/distilbert-base-multilingual-cased), a distilled version of the [BERT base multilingual model](bert-base-multilingual-cased). This model is cased: it does make a difference between english and English.

-
+It uses the extraction method proposed by Geotrend, which is described in https://github.com/Geotrend-research/smaller-transformers.
+Specifically, we ran the following script:

 ```sh
 python reduce_model.py \
@@ -24,8 +25,3 @@ python reduce_model.py \
 The resulting model has the same architecture as DistilmBERT: 6 layers, 768 dimensions and 12 heads, with a total of **65M parameters** (compared to 134M parameters for DistilmBERT).

 The goal of this model is to further reduce the size of the `distilbert-base-multilingual` multilingual model by selecting only the most frequent tokens for Spanish, reducing the size of the embedding layer. For more details, see the paper from the Geotrend team: Load What You Need: Smaller Versions of Multilingual BERT.
-
-
-
-
-
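The new widget sentence doubles as a quick smoke test for the masked-LM head. Below is a minimal sketch using the `transformers` fill-mask pipeline; the repository ID is a placeholder, since the commit page does not show the published model name, and the parameter count check mirrors the 65M figure quoted in the README.

```python
# Minimal sketch: run the README's widget example through the fill-mask pipeline.
# "user/distilbert-base-es-cased" is a placeholder repo ID, not confirmed by this commit.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="user/distilbert-base-es-cased")
predictions = fill_mask("El español es un idioma muy [MASK] en el mundo.")

for p in predictions:
    print(f"{p['token_str']:>15}  {p['score']:.3f}")

# Rough sanity check against the ~65M parameters stated above.
n_params = sum(p.numel() for p in fill_mask.model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")
```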
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:0a7e9034002f6027c9c3e2644bf743b008fc7081072839124abd6673e6740c5c
+size 255139145
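The updated LFS pointer pins the new weights to an exact digest and byte count. A small sketch, assuming a locally downloaded `pytorch_model.bin`, for checking the file against the values shown in this diff:

```python
# Verify a local pytorch_model.bin against the LFS pointer committed here.
import hashlib
import os

EXPECTED_SHA256 = "0a7e9034002f6027c9c3e2644bf743b008fc7081072839124abd6673e6740c5c"
EXPECTED_SIZE = 255139145  # bytes, from the pointer's "size" line

path = "pytorch_model.bin"  # assumed local path to the downloaded file

assert os.path.getsize(path) == EXPECTED_SIZE, "size does not match the LFS pointer"

sha = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        sha.update(chunk)

assert sha.hexdigest() == EXPECTED_SHA256, "sha256 does not match the LFS pointer"
print("pytorch_model.bin matches the committed LFS pointer")
```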