---
base_model: sentence-transformers/all-MiniLM-L6-v2
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:723
- loss:ContrastiveTensionLossInBatchNegatives
widget:
- source_sentence: During a rejected takeoff the aircraft departs runway.
  sentences:
  - The ship is approaching shallow water.
  - During a rejected takeoff the aircraft departs runway.
  - A/C must maintain minimum safe altitude limits.
- source_sentence: ACS must provide attitude maneuver commands when ASTRO-H is rotating.
  sentences:
  - Laboratories that handle energetic materials must have laboratory environmental control equipment.
  - Loss of functioning democratic society (e.g. loss of freedom, human right .etc.).
  - ACS must provide attitude maneuver commands when ASTRO-H is rotating.
- source_sentence: Overpressurization of plant equipment.
  sentences:
  - Low fuel level after missed approaches.
  - Overpressurization of plant equipment.
  - Aircraft comes too close to service equipment components during operations on the ground.
- source_sentence: All the safety/mission critical military aerospace designs and products shall be Certified to allow use or operation.
  sentences:
  - Brake light command must illuminate early within X-seconds before stopping vehicle.
  - All the safety/mission critical military aerospace designs and products shall be Certified to allow use or operation.
  - ASTRO-H unable to collect scientific data.
- source_sentence: Laboratory equipment out of calibration standards.
  sentences:
  - Certification Authority personal, including Organization Designation Authorization (ODA), unqualified to the product in analyse or at the certification process.
  - Staff suffers injury (radiological or physical).
  - Laboratory equipment out of calibration standards.
---

# SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
- **Maximum Sequence Length:** 256 tokens
- **Output Dimensionality:** 384 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Laboratory equipment out of calibration standards.',
    'Laboratory equipment out of calibration standards.',
    'Staff suffers injury (radiological or physical).',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 723 training samples
* Columns: sentence1, sentence2, and label
* Approximate statistics based on the first 723 samples:

  |         | sentence1 | sentence2 | label |
  |:--------|:----------|:----------|:------|
  | type    | string    | string    | int   |
  | details |           |           |       |

* Samples:

  | sentence1 | sentence2 | label |
  |:----------|:----------|:------|
  | Non-patient injured or killed due to radiation. | Non-patient injured or killed due to radiation. | 0 |
  | Loss of human life / damage to health and wellbeing (e.g. long term concerns with COVID). | Loss of human life / damage to health and wellbeing (e.g. long term concerns with COVID). | 0 |
  | The aircraft have insufficient power available. | The aircraft have insufficient power available. | 0 |

* Loss: [ContrastiveTensionLossInBatchNegatives](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastivetensionlossinbatchnegatives)

### Framework Versions

- Python: 3.10.12
- Sentence Transformers: 3.2.1
- Transformers: 4.45.2
- PyTorch: 2.5.0+cu121
- Accelerate: 1.0.1
- Datasets: 3.0.2
- Tokenizers: 0.20.1

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### ContrastiveTensionLossInBatchNegatives

```bibtex
@inproceedings{carlsson2021semantic,
    title={Semantic Re-tuning with Contrastive Tension},
    author={Fredrik Carlsson and Amaru Cuba Gyllensten and Evangelia Gogoulou and Erik Ylip{\"a}{\"a} Hellqvist and Magnus Sahlgren},
    booktitle={International Conference on Learning Representations},
    year={2021},
    url={https://openreview.net/forum?id=Ov_sMNau-PF}
}
```
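
## Illustrative Sketch: Pooling and Similarity

The architecture listed above ends with mean-token pooling followed by a `Normalize()` step, and the card names cosine similarity as the similarity function. The following is a minimal plain-Python sketch of what those three operations compute, assuming toy 3-dimensional token vectors in place of the real 384-dimensional ones. The helper names (`mean_pool`, `l2_normalize`, `cosine_similarity`) and the toy inputs are hypothetical and are not part of the Sentence Transformers API; the library performs the equivalent steps on tensors internally.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in total]


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Cosine similarity, computed as the dot product of unit-length vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


# Hypothetical toy inputs: the third token of sentence A is padding.
tokens_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [9.0, 9.0, 9.0]]
mask_a = [1, 1, 0]
tokens_b = [[2.0, 2.0, 0.0]]
mask_b = [1]

emb_a = l2_normalize(mean_pool(tokens_a, mask_a))
emb_b = l2_normalize(mean_pool(tokens_b, mask_b))
score = cosine_similarity(emb_a, emb_b)
print(score)
```

Because the embeddings are already unit length after the normalization step, cosine similarity reduces to a plain dot product, which is why both sentences here (pointing in the same direction after pooling) score approximately 1.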