license: apache-2.0
mutual information Contrastive Sentence Embedding (miCSE):
Language model from the arXiv pre-print "miCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings".
The miCSE language model is trained for sentence similarity computation. During contrastive learning, training imposes alignment between the attention patterns of different views (embeddings of augmentations) of the same sentence. Learning sentence embeddings with miCSE thus entails enforcing syntactic consistency across the augmented views of every sentence, which makes contrastive self-supervised learning more sample-efficient. Sentence representations correspond to the embedding of the [CLS] token.
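from transformers import AutoModel, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("sap-ai-research/<----Enter Model Name---->")
model = AutoModel.from_pretrained("sap-ai-research/<----Enter Model Name---->")  # base encoder, whose output includes the [CLS] hidden state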
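
As a rough illustration of how the [CLS] embedding can be used for sentence similarity, the sketch below takes the first-position vector of each sentence and compares two sentences by cosine similarity. The token vectors are made-up toy values, not model output, and the helper names `cls_embedding` and `cosine_similarity` are hypothetical. With a loaded model, the per-token vectors would presumably come from the encoder's last hidden state.

```python
import math


def cls_embedding(token_embeddings):
    """Return the vector at position 0, where the [CLS] token sits."""
    return token_embeddings[0]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy stand-ins for per-token hidden states (3 tokens x 4 dimensions).
# These numbers are invented for illustration only.
sent_a = [[1.0, 0.0, 2.0, 0.0], [0.3, 0.1, 0.0, 0.2], [0.0, 0.5, 0.5, 0.0]]
sent_b = [[2.0, 0.0, 4.0, 0.0], [0.1, 0.4, 0.2, 0.0], [0.6, 0.0, 0.1, 0.3]]
sent_c = [[0.0, 3.0, 0.0, 1.0], [0.2, 0.2, 0.2, 0.2], [0.0, 0.0, 0.9, 0.1]]

sim_ab = cosine_similarity(cls_embedding(sent_a), cls_embedding(sent_b))
sim_ac = cosine_similarity(cls_embedding(sent_a), cls_embedding(sent_c))

print(f"similarity(a, b) = {sim_ab:.2f}")
print(f"similarity(a, c) = {sim_ac:.2f}")
```

Only the first token's vector enters the comparison, so the remaining token vectors have no effect on the score.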
Model results on SentEval Benchmark:
| STS12 | STS13 | STS14 | STS15 | STS16 | STSBenchmark | SICKRelatedness | S.Avg. |
|-------|-------|-------|-------|-------|--------------|-----------------|--------|
| 71.71 | 83.09 | 75.46 | 83.13 | 80.22 | 79.70 | 73.62 | 78.13 |
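
The S.Avg. column appears to be the unweighted mean of the seven task scores. That reading is an assumption, but the arithmetic is consistent with it, as this short check shows (the dictionary simply restates the table row):

```python
# Scores copied from the results table; S.Avg. is assumed to be their plain mean.
scores = {
    "STS12": 71.71,
    "STS13": 83.09,
    "STS14": 75.46,
    "STS15": 83.13,
    "STS16": 80.22,
    "STSBenchmark": 79.70,
    "SICKRelatedness": 73.62,
}

average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 78.13
```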
If you use this code in your research or want to refer to our work, please cite:
title={miCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings},
author={Tassilo Klein and Moin Nabi},