Teja Gollapudi
committed
Commit e06b8a8 • 1 Parent(s): 260e8fe
Update README.md
README.md CHANGED
@@ -22,7 +22,7 @@ license: "apache-2.0"
 </ul>

 #### Motivation
-Based on [MiniLMv2 distillation](https://arxiv.org/pdf/2012.15828.pdf), we have distilled vBERT-2021-large into a smaller minilmv2
+Based on [MiniLMv2 distillation](https://arxiv.org/pdf/2012.15828.pdf), we have distilled vBERT-2021-large into a smaller MiniLMv2 model for faster inference times without a significant loss of performance.

 #### Intended Use
 The model functions as a VMware-specific Language Model.
@@ -53,7 +53,7 @@ output = model(encoded_input)
 ```

 ### Training
-
+The model is distilled from [vBERT-2021-large](https://huggingface.co/VMware/vbert-2021-large). [nreimers/MiniLMv2-L6-H768-distilled-from-BERT-Large](https://huggingface.co/nreimers/MiniLMv2-L6-H768-distilled-from-BERT-Large/tree/main) was used to initialize the weights.
 #### - Datasets
 Publicly available VMware text data such as VMware Docs, Blogs, etc. were used for distilling the teacher vBERT-2021-large model into the vinilm-2021-from-large model. Sourced in May 2021. (~320,000 documents)
 #### - Preprocessing
@@ -67,7 +67,7 @@ Publicly available VMware text data such as VMware Docs, Blogs, etc. were used

 #### - Model performance measures
 We benchmarked vBERT on various VMware-specific NLP downstream tasks (IR, classification, etc.).
-The model scored higher than the 'bert-base-uncased' model on all benchmarks.
+The model scored higher than the 'bert-base-uncased' model on all benchmarks.

 ### Limitations and bias
 Since the model is distilled from a vBERT model based on the BERT model, it may have the same biases embedded within the original BERT model.
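The new Training paragraph names both the teacher model and the checkpoint used to initialize the student. As a minimal sketch of that pairing (not part of this commit, and assuming both repositories load with the standard `transformers` Auto classes), the setup looks roughly like this:

```python
# Sketch only: load the teacher and the student-initialization checkpoint
# named in the Training section. This is not the training code from this
# commit; it just illustrates the two models involved.
from transformers import AutoModel

# Teacher: VMware-domain BERT-large (repo id taken from the link above).
teacher = AutoModel.from_pretrained("VMware/vbert-2021-large")

# Student initialization: 6-layer, 768-hidden MiniLMv2 checkpoint
# (repo id taken from the link above).
student = AutoModel.from_pretrained(
    "nreimers/MiniLMv2-L6-H768-distilled-from-BERT-Large"
)

# The student is then distilled on VMware text with the MiniLMv2 objective
# (matching self-attention relations of a chosen teacher layer), as described
# in the linked paper; that training loop is not shown here.
print(teacher.config.num_hidden_layers, student.config.num_hidden_layers)
```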