ZurichNLP/unsup-simcse-xlm-roberta-base


	---
	language:
	- multilingual
	- af
	- am
	- ar
	- as
	- az
	- be
	- bg
	- bn
	- br
	- bs
	- ca
	- cs
	- cy
	- da
	- de
	- el
	- en
	- eo
	- es
	- et
	- eu
	- fa
	- fi
	- fr
	- fy
	- ga
	- gd
	- gl
	- gu
	- ha
	- he
	- hi
	- hr
	- hu
	- hy
	- id
	- is
	- it
	- ja
	- jv
	- ka
	- kk
	- km
	- kn
	- ko
	- ku
	- ky
	- la
	- lo
	- lt
	- lv
	- mg
	- mk
	- ml
	- mn
	- mr
	- ms
	- my
	- ne
	- nl
	- 'no'
	- om
	- or
	- pa
	- pl
	- ps
	- pt
	- ro
	- ru
	- sa
	- sd
	- si
	- sk
	- sl
	- so
	- sq
	- sr
	- su
	- sv
	- sw
	- ta
	- te
	- th
	- tl
	- tr
	- ug
	- uk
	- ur
	- uz
	- vi
	- xh
	- yi
	- zh
	license: mit
	pipeline_tag: feature-extraction
	---

	[xlm-roberta-base](https://huggingface.co/xlm-roberta-base) fine-tuned for sentence embeddings with [SimCSE](http://dx.doi.org/10.18653/v1/2021.emnlp-main.552) (Gao et al., EMNLP 2021).

	For a similar English-only model released by Gao et al., see [princeton-nlp/unsup-simcse-roberta-base](https://huggingface.co/princeton-nlp/unsup-simcse-roberta-base).

	The model was fine-tuned with the [reference implementation of unsupervised SimCSE](https://github.com/princeton-nlp/SimCSE) on the 1M English Wikipedia sentences released by the SimCSE authors.
	As the sentence representation, we used the average of the last hidden states (`pooler_type=avg`). This is the same mean pooling that Sentence-BERT uses, so the model is compatible with it.
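A minimal sketch of extracting sentence embeddings with the same average pooling, assuming the `transformers` and `torch` packages are installed. The helper names `mean_pool` and `embed` are hypothetical and not part of the released code. The model ID is taken from the repository name, and `max_length=128` mirrors the `--max_seq_length` used for fine-tuning.

```python
import torch


def mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average the token vectors of each sentence, ignoring padding positions."""
    # (batch, seq_len) -> (batch, seq_len, 1) so the mask broadcasts over the hidden dimension
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    # Guard against division by zero for an all-padding row
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return summed / counts


def embed(sentences, model_name="ZurichNLP/unsup-simcse-xlm-roberta-base"):
    """Hypothetical helper: encode a list of sentences into mean-pooled embeddings."""
    # Imported lazily so that mean_pool can be used without transformers installed
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()
    batch = tokenizer(
        sentences, padding=True, truncation=True, max_length=128, return_tensors="pt"
    )
    with torch.no_grad():
        outputs = model(**batch)
    return mean_pool(outputs.last_hidden_state, batch["attention_mask"])
```

Calling `embed(["Hello world.", "Hallo Welt."])` downloads the model on first use and returns one vector per sentence. The vectors can then be compared with `torch.nn.functional.cosine_similarity`.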

	Fine-tuning command:
	```bash
	python train.py \
	--model_name_or_path xlm-roberta-base \
	--train_file data/wiki1m_for_simcse.txt \
	--output_dir unsup-simcse-xlm-roberta-base \
	--num_train_epochs 1 \
	--per_device_train_batch_size 32 \
	--gradient_accumulation_steps 16 \
	--learning_rate 1e-5 \
	--max_seq_length 128 \
	--pooler_type avg \
	--overwrite_output_dir \
	--temp 0.05 \
	--do_train \
	--fp16 \
	--seed 28852
	```
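To make the batch settings in the command easier to read, the following sketch computes the effective batch size and an approximate number of optimizer steps. It assumes a single device (the number of GPUs is not stated above) and exactly 1,000,000 training sentences. The function names are hypothetical, and the step count is an estimate rather than a logged value.

```python
import math


def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Number of sentences contributing to each optimizer update."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


def approx_optimizer_steps(num_examples, per_device_batch_size,
                           gradient_accumulation_steps, num_devices=1, epochs=1):
    """Rough estimate of optimizer updates; a trailing partial accumulation is dropped."""
    batches_per_epoch = math.ceil(num_examples / (per_device_batch_size * num_devices))
    return (batches_per_epoch // gradient_accumulation_steps) * epochs


# Values from the fine-tuning command; num_devices=1 is an assumption
print(effective_batch_size(32, 16))
print(approx_optimizer_steps(1_000_000, 32, 16))
```

Under these assumptions, each update sees 32 × 16 = 512 sentences. With more devices the effective batch size scales up proportionally.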

	## Citation
	```bibtex
	@article{vamvas-sennrich-2023-rsd,
	title={Towards Unsupervised Recognition of Semantic Differences in Related Documents},
	author={Jannis Vamvas and Rico Sennrich},
	year={2023},
	eprint={2305.13303},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```