---
language:
- vi
pipeline_tag: sentence-similarity
---

# NghiemAbe/sami-sbert-CT

🐱 Github Repo

This is a [sentence-transformers](https://www.SBERT.net) model. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for tasks such as clustering or semantic search.

I started from the pretrained model [bkai-foundation-models/vietnamese-bi-encoder](https://huggingface.co/bkai-foundation-models/vietnamese-bi-encoder) and trained it on the SAMI dataset.

## Usage (Sentence-Transformers)

Using this model is easy once you have [sentence-transformers](https://www.SBERT.net) installed:

```
pip install -U sentence-transformers
```

Then you can use the model like this:

```python
from sentence_transformers import SentenceTransformer

# INPUT TEXT MUST BE ALREADY WORD-SEGMENTED!
sentences = [
    "Cô ấy là một người vui_tính .",  # "She is a cheerful person."
    "Cô ấy cười nói suốt cả ngày .",  # "She laughs and talks all day long."
]

model = SentenceTransformer('NghiemAbe/sami-sbert-CT')
embeddings = model.encode(sentences)  # one 768-dimensional vector per sentence
print(embeddings)
```

## Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: RobertaModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False})
)
```
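
The Pooling configuration above enables only `pooling_mode_mean_tokens`, so each sentence vector is the average of its token vectors, and such vectors are typically compared with cosine similarity for semantic search. The sketch below illustrates both steps in plain Python on made-up 2-dimensional toy vectors. The helper names `mean_pool` and `cosine_similarity`, the toy values, and the attention masks are assumptions for illustration only; this is not the library's internal implementation, and the real model produces 768-dimensional vectors.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors of one sentence, skipping padded positions.

    Hypothetical helper: token_embeddings is a list of equal-length vectors,
    attention_mask is a list of 1 (real token) or 0 (padding).
    """
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    count = max(count, 1)  # avoid division by zero for an all-padding input
    return [s / count for s in summed]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy token vectors (assumed values). The last position of the first
# sentence is padding and must not influence the result.
sentence_a_tokens = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]
sentence_a_mask = [1, 1, 0]
sentence_b_tokens = [[0.0, 2.0], [2.0, 0.0], [4.0, 4.0]]
sentence_b_mask = [1, 1, 1]

embedding_a = mean_pool(sentence_a_tokens, sentence_a_mask)
embedding_b = mean_pool(sentence_b_tokens, sentence_b_mask)
score = cosine_similarity(embedding_a, embedding_b)

print(embedding_a, embedding_b, round(score, 4))
```

With real embeddings, the same comparison would be applied to the rows returned by `model.encode(...)`; a score closer to 1 indicates more similar sentences.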