---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:234000
- loss:MSELoss
base_model: google-bert/bert-base-multilingual-uncased
widget:
- source_sentence: who sings in spite of ourselves with john prine
sentences:
- es
- når ble michael jordan draftet til nba
- quien canta en spite of ourselves con john prine
- source_sentence: who wrote when you look me in the eyes
sentences:
- متى بدأت الفتاة الكشفية في بيع ملفات تعريف الارتباط
- A écrit when you look me in the eyes
- fr
- source_sentence: when was fathers day made a national holiday
sentences:
- wann wurde der Vatertag zum nationalen Feiertag
- de
- ' អ្នកណាច្រៀង i want to sing you a love song'
- source_sentence: what is the density of the continental crust
sentences:
- cuál es la densidad de la corteza continental
- wie zingt i want to sing you a love song
- es
- source_sentence: who wrote the song i shot the sheriff
sentences:
- Quel est l'âge légal pour consommer du vin au Canada?
- i shot the sheriff şarkısını kim besteledi
- tr
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- negative_mse
model-index:
- name: SentenceTransformer based on google-bert/bert-base-multilingual-uncased
results:
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to ar
type: MSE-val-en-to-ar
metrics:
- type: negative_mse
value: -20.37721574306488
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to da
type: MSE-val-en-to-da
metrics:
- type: negative_mse
value: -17.167489230632782
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to de
type: MSE-val-en-to-de
metrics:
- type: negative_mse
value: -17.10948944091797
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to en
type: MSE-val-en-to-en
metrics:
- type: negative_mse
value: -15.333698689937592
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to es
type: MSE-val-en-to-es
metrics:
- type: negative_mse
value: -16.898061335086823
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to fi
type: MSE-val-en-to-fi
metrics:
- type: negative_mse
value: -18.428558111190796
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to fr
type: MSE-val-en-to-fr
metrics:
- type: negative_mse
value: -17.04207956790924
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to he
type: MSE-val-en-to-he
metrics:
- type: negative_mse
value: -19.942057132720947
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to hu
type: MSE-val-en-to-hu
metrics:
- type: negative_mse
value: -18.757066130638123
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to it
type: MSE-val-en-to-it
metrics:
- type: negative_mse
value: -17.18708872795105
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to ja
type: MSE-val-en-to-ja
metrics:
- type: negative_mse
value: -19.915536046028137
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to ko
type: MSE-val-en-to-ko
metrics:
- type: negative_mse
value: -21.39919400215149
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to km
type: MSE-val-en-to-km
metrics:
- type: negative_mse
value: -28.658682107925415
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to ms
type: MSE-val-en-to-ms
metrics:
- type: negative_mse
value: -17.25209951400757
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to nl
type: MSE-val-en-to-nl
metrics:
- type: negative_mse
value: -16.605134308338165
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to no
type: MSE-val-en-to-no
metrics:
- type: negative_mse
value: -17.149969935417175
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to pl
type: MSE-val-en-to-pl
metrics:
- type: negative_mse
value: -17.846450209617615
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to pt
type: MSE-val-en-to-pt
metrics:
- type: negative_mse
value: -17.19353199005127
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to ru
type: MSE-val-en-to-ru
metrics:
- type: negative_mse
value: -18.13419610261917
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to sv
type: MSE-val-en-to-sv
metrics:
- type: negative_mse
value: -17.13200956583023
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to th
type: MSE-val-en-to-th
metrics:
- type: negative_mse
value: -26.43084228038788
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to tr
type: MSE-val-en-to-tr
metrics:
- type: negative_mse
value: -18.183308839797974
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to vi
type: MSE-val-en-to-vi
metrics:
- type: negative_mse
value: -18.749597668647766
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to zh cn
type: MSE-val-en-to-zh_cn
metrics:
- type: negative_mse
value: -18.811793625354767
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to zh hk
type: MSE-val-en-to-zh_hk
metrics:
- type: negative_mse
value: -18.54081153869629
name: Negative Mse
- task:
type: knowledge-distillation
name: Knowledge Distillation
dataset:
name: MSE val en to zh tw
type: MSE-val-en-to-zh_tw
metrics:
- type: negative_mse
value: -19.14038509130478
name: Negative Mse
---
# SentenceTransformer based on google-bert/bert-base-multilingual-uncased
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [google-bert/bert-base-multilingual-uncased](https://huggingface.co/google-bert/bert-base-multilingual-uncased)
- **Maximum Sequence Length:** 128 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
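The `Pooling` module (1) averages the Transformer's token embeddings over non-padding positions (attention-mask-aware mean pooling). A minimal NumPy sketch of that step, using toy 2-dimensional vectors in place of the model's 768-dimensional ones:

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token embeddings over positions where attention_mask == 1."""
    mask = attention_mask[:, None].astype(float)      # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)    # sum over real tokens only
    return summed / mask.sum()                        # divide by real-token count

tokens = np.array([[1.0, 3.0],
                   [3.0, 5.0],
                   [9.0, 9.0]])      # last row is a padding position
mask = np.array([1, 1, 0])
print(mean_pool(tokens, mask))       # [2. 4.]
```

The padding row is masked out, so only the first two token vectors contribute to the sentence embedding.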
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("luanafelbarros/bert-base-multilingual-uncased-matryoshka-mkqa")
# Run inference
sentences = [
'who wrote the song i shot the sheriff',
'i shot the sheriff şarkısını kim besteledi',
'tr',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
## Evaluation
### Metrics
#### Knowledge Distillation
* Datasets: `MSE-val-en-to-ar`, `MSE-val-en-to-da`, `MSE-val-en-to-de`, `MSE-val-en-to-en`, `MSE-val-en-to-es`, `MSE-val-en-to-fi`, `MSE-val-en-to-fr`, `MSE-val-en-to-he`, `MSE-val-en-to-hu`, `MSE-val-en-to-it`, `MSE-val-en-to-ja`, `MSE-val-en-to-ko`, `MSE-val-en-to-km`, `MSE-val-en-to-ms`, `MSE-val-en-to-nl`, `MSE-val-en-to-no`, `MSE-val-en-to-pl`, `MSE-val-en-to-pt`, `MSE-val-en-to-ru`, `MSE-val-en-to-sv`, `MSE-val-en-to-th`, `MSE-val-en-to-tr`, `MSE-val-en-to-vi`, `MSE-val-en-to-zh_cn`, `MSE-val-en-to-zh_hk` and `MSE-val-en-to-zh_tw`
* Evaluated with [MSEEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.MSEEvaluator)
| Metric | MSE-val-en-to-ar | MSE-val-en-to-da | MSE-val-en-to-de | MSE-val-en-to-en | MSE-val-en-to-es | MSE-val-en-to-fi | MSE-val-en-to-fr | MSE-val-en-to-he | MSE-val-en-to-hu | MSE-val-en-to-it | MSE-val-en-to-ja | MSE-val-en-to-ko | MSE-val-en-to-km | MSE-val-en-to-ms | MSE-val-en-to-nl | MSE-val-en-to-no | MSE-val-en-to-pl | MSE-val-en-to-pt | MSE-val-en-to-ru | MSE-val-en-to-sv | MSE-val-en-to-th | MSE-val-en-to-tr | MSE-val-en-to-vi | MSE-val-en-to-zh_cn | MSE-val-en-to-zh_hk | MSE-val-en-to-zh_tw |
|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:-----------------|:--------------------|:--------------------|:--------------------|
| **negative_mse** | **-20.3772** | **-17.1675** | **-17.1095** | **-15.3337** | **-16.8981** | **-18.4286** | **-17.0421** | **-19.9421** | **-18.7571** | **-17.1871** | **-19.9155** | **-21.3992** | **-28.6587** | **-17.2521** | **-16.6051** | **-17.15** | **-17.8465** | **-17.1935** | **-18.1342** | **-17.132** | **-26.4308** | **-18.1833** | **-18.7496** | **-18.8118** | **-18.5408** | **-19.1404** |
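The `negative_mse` values above follow `MSEEvaluator`'s convention: the mean squared error between the teacher's embeddings of the English sentences and the student's embeddings of the translations, scaled by 100 and negated, so that values closer to zero are better. A minimal NumPy sketch of that computation with toy 2-dimensional vectors:

```python
import numpy as np

def negative_mse(teacher_emb: np.ndarray, student_emb: np.ndarray) -> float:
    """-100 * mean squared error, matching the scale MSEEvaluator reports."""
    return float(-100.0 * np.mean((teacher_emb - student_emb) ** 2))

# Toy stand-ins for the 768-dim teacher/student embedding matrices.
teacher = np.array([[0.5, 0.5]])
student = np.array([[0.5, 0.0]])
print(negative_mse(teacher, student))  # -12.5
```

On this scale, the model's best score (-15.33 for en→en) corresponds to a raw MSE of about 0.15 per embedding dimension.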
## Training Details
### Training Dataset
#### Unnamed Dataset
* Size: 234,000 training samples
* Columns: `english`, `non-english`, `target`, and `label`
* Approximate statistics based on the first 1000 samples:
| | english | non-english | target | label |
|:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------|:-------------------------------------|
| type | string | string | string | list |
| type | string | string | string | list |
* Samples:

| english | non-english | target | label |
|:------------------------------------|:-------------------------------------------|:-------|:--------------------------------------------------------------------------------------------------------|
| who plays hope on days of our lives | من الذي يلعب الأمل في أيام حياتنا | ar | [0.2171212136745453, 0.5138550996780396, 0.5517176389694214, -1.0655105113983154, 1.5853567123413086, ...] |
| who plays hope on days of our lives | hvem spiller hope i Horton-sagaen | da | [0.2171212136745453, 0.5138550996780396, 0.5517176389694214, -1.0655105113983154, 1.5853567123413086, ...] |
| who plays hope on days of our lives | Wer spielt die Hope in Zeit der Sehnsucht? | de | [0.2171212136745453, 0.5138550996780396, 0.5517176389694214, -1.0655105113983154, 1.5853567123413086, ...] |
* Loss: [MSELoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mseloss)
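The `label` column holds the teacher model's precomputed embedding of the English sentence, and MSELoss trains the student so that its own embeddings of the English and non-English texts land close to that vector. A hedged sketch of wiring this up (the API shape follows recent sentence-transformers releases; no trainer or dataset is shown):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MSELoss

# Student starts from the same multilingual base model named in this card.
student = SentenceTransformer("google-bert/bert-base-multilingual-uncased")

# MSELoss regresses the student's sentence embeddings onto the teacher
# vectors supplied as labels by the training dataset.
train_loss = MSELoss(model=student)
```

Because every translation row of a sentence carries the identical teacher vector (see the samples above), the student is pushed to embed all languages into the teacher's English embedding space.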
### Evaluation Dataset
#### Unnamed Dataset
* Size: 13,000 evaluation samples
* Columns: `english`, `non-english`, `target`, and `label`
* Approximate statistics based on the first 1000 samples:
| | english | non-english | target | label |
|:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:--------------------------------------------------------------------------------|:-------------------------------------|
| type | string | string | string | list |
| type | string | string | string | list |
* Samples:

| english | non-english | target | label |
|:-----------------------------------------------|:---------------------------------------------------|:-------|:----------------------------------------------------------------------------------------------------------------|
| who played prudence on nanny and the professor | من لعب دور "prudence" فى "nanny and the professor" | ar | [-0.2837616801261902, -0.4943353235721588, 0.020107418298721313, 0.7796109318733215, -0.47365888953208923, ...] |
| who played prudence on nanny and the professor | hvem spiller prudence på nanny and the professor | da | [-0.2837616801261902, -0.4943353235721588, 0.020107418298721313, 0.7796109318733215, -0.47365888953208923, ...] |
| who played prudence on nanny and the professor | Wer spielte Prudence in Nanny and the Professor | de | [-0.2837616801261902, -0.4943353235721588, 0.020107418298721313, 0.7796109318733215, -0.47365888953208923, ...] |
* Loss: [MSELoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#mseloss)
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 64
- `per_device_eval_batch_size`: 64
- `learning_rate`: 1e-05
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `fp16`: True
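The non-default settings above could be assembled roughly as follows (a sketch assuming the `SentenceTransformerTrainingArguments` API from recent sentence-transformers releases; the output directory is illustrative, not the one actually used):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

# Hypothetical output path; the remaining values mirror the list above.
args = SentenceTransformerTrainingArguments(
    output_dir="output/mbert-distilled",  # illustrative
    eval_strategy="steps",
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    learning_rate=1e-5,
    num_train_epochs=1,
    warmup_ratio=0.1,
    fp16=True,
)
```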
#### All Hyperparameters