---
base_model: all-MiniLM-L6-v2
library_name: sentence-transformers
license: apache-2.0
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- ontology
- on2vec
- graph-neural-networks
- base-all-MiniLM-L6-v2
- general
- general-ontology
- fusion-cross_attention
- gnn-gcn
- small-ontology
---

# chiro_all-MiniLM-L6-v2_cross_attention_gcn_h512_o64_cosine_e128_early

This is a sentence-transformers model created with [on2vec](https://github.com/david4096/on2vec), which augments text embeddings with ontological knowledge using Graph Neural Networks.

## Model Details

- **Base Text Model**: all-MiniLM-L6-v2
- **Text Embedding Dimension**: 384
- **Ontology**: chiro.owl
- **Domain**: general
- **Ontology Concepts**: 26
- **Concept Alignment**: 26/26 (100.0%)
- **Fusion Method**: cross_attention
- **GNN Architecture**: GCN
- **Structural Embedding Dimension**: 26
- **Output Embedding Dimension**: 64
- **Hidden Dimensions**: 512
- **Dropout**: 0.0
- **Training Date**: 2025-09-19
- **on2vec Version**: 0.1.0
- **Source Ontology Size**: 0.2 MB
- **Model Size**: 91.2 MB
- **Library**: on2vec + sentence-transformers

## Technical Architecture

This model uses a multi-stage architecture:

1. **Text Encoding**: Input text is encoded using the base sentence-transformer model
2. **Ontological Embedding**: Pre-trained GNN embeddings capture structural relationships
3. **Fusion Layer**: Cross-attention fusion of the text and ontological embeddings

**Embedding Flow:**

- Text: 384 dimensions → 512 hidden → 64 output
- Structure: 26 concepts → GNN → 64 output
- Fusion: cross_attention → Final embedding

## How It Works

This model combines:

1. **Text Embeddings**: Generated using the base sentence-transformer model
2. **Ontological Embeddings**: Created by training Graph Neural Networks on OWL ontology structure
3. **Fusion Layer**: Combines both embedding types using the specified fusion method

The ontological knowledge helps the model better understand domain-specific relationships and concepts.

## Usage

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.util import cos_sim

# Load the model
model = SentenceTransformer('chiro_all-MiniLM-L6-v2_cross_attention_gcn_h512_o64_cosine_e128_early')

# Generate embeddings
sentences = ['Example sentence 1', 'Example sentence 2']
embeddings = model.encode(sentences)

# Compute cosine similarity between the two sentences
similarity = cos_sim(embeddings[0], embeddings[1])
```

## Training Process

This model was created using the on2vec pipeline (see the sketches after this list for what the main stages might look like):

1. **Ontology Processing**: The OWL ontology was converted to a graph structure
2. **GNN Training**: Graph Neural Networks were trained to learn ontological relationships
3. **Text Integration**: Base model text embeddings were combined with ontological embeddings
4. **Fusion Training**: The fusion layer was trained to optimally combine both embedding types
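on2vec's internals are not reproduced in this card, so the sketches that follow are assumptions rather than the library's actual code. For stage 1, a minimal approach is to index the named classes of the OWL file and keep one graph edge per `rdfs:subClassOf` assertion; the helper `owl_to_edge_index` below is hypothetical, built on rdflib and PyTorch:

```python
import torch
from rdflib import Graph
from rdflib.namespace import RDF, RDFS, OWL

def owl_to_edge_index(owl_path):
    """Hypothetical stage 1: parse an OWL file and turn its subclass
    hierarchy into a PyTorch Geometric-style edge_index tensor."""
    g = Graph()
    g.parse(owl_path)  # RDF/XML format inferred from the .owl extension

    # Index every named class in the ontology
    classes = sorted(set(g.subjects(RDF.type, OWL.Class)))
    index = {cls: i for i, cls in enumerate(classes)}

    # One edge per rdfs:subClassOf assertion between named classes
    edges = [(index[s], index[o])
             for s, o in g.subject_objects(RDFS.subClassOf)
             if s in index and o in index]

    return torch.tensor(edges, dtype=torch.long).t().contiguous(), classes

edge_index, classes = owl_to_edge_index("chiro.owl")  # 26 concepts for chiro.owl
```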
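For stage 2, a two-layer GCN over that graph would match the card's dimensions (26-dim structural input, 512 hidden, 64 output, dropout 0.0). The `cosine_e128_early` suffix in the model name hints at a cosine objective with early stopping within 128 epochs, but that is an inference from the name; `OntologyGCN` is an illustrative PyTorch Geometric sketch, not on2vec's class:

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class OntologyGCN(torch.nn.Module):
    """Hypothetical two-layer GCN matching the card's dimensions:
    26 concept features -> 512 hidden -> 64-dim structural embeddings."""

    def __init__(self, in_dim=26, hidden_dim=512, out_dim=64, dropout=0.0):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, out_dim)
        self.dropout = dropout

    def forward(self, x, edge_index):
        # x: [num_concepts, 26] node features (e.g. one-hot concept identity,
        # which would explain a 26-dim structural input for 26 concepts)
        h = F.relu(self.conv1(x, edge_index))
        h = F.dropout(h, p=self.dropout, training=self.training)
        return self.conv2(h, edge_index)  # [num_concepts, 64]
```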
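For stage 4, one standard way to realize cross-attention fusion is to let the projected text embedding attend over the concept embeddings with `torch.nn.MultiheadAttention`. Again, `CrossAttentionFusion` and its residual combination are assumptions for illustration, not the trained model's exact layer:

```python
import torch

class CrossAttentionFusion(torch.nn.Module):
    """Hypothetical cross-attention fusion: the projected text embedding
    queries the ontology embeddings, and the attended result is combined
    with the text query to give the final 64-dim embedding."""

    def __init__(self, text_dim=384, hidden_dim=512, onto_dim=64,
                 out_dim=64, num_heads=4):
        super().__init__()
        self.text_proj = torch.nn.Sequential(  # card's text path: 384 -> 512 -> 64
            torch.nn.Linear(text_dim, hidden_dim),
            torch.nn.ReLU(),
            torch.nn.Linear(hidden_dim, out_dim),
        )
        self.onto_proj = torch.nn.Linear(onto_dim, out_dim)
        self.attn = torch.nn.MultiheadAttention(out_dim, num_heads,
                                                batch_first=True)

    def forward(self, text_emb, onto_embs):
        # text_emb: [batch, 384]; onto_embs: [batch, num_concepts, 64]
        q = self.text_proj(text_emb).unsqueeze(1)  # [batch, 1, 64] query
        kv = self.onto_proj(onto_embs)             # keys/values from the ontology
        attended, _ = self.attn(q, kv, kv)         # [batch, 1, 64]
        return (q + attended).squeeze(1)           # residual combine -> [batch, 64]
```

In the released model this fusion runs inside the sentence-transformers pipeline, so `model.encode(...)` from the Usage section is all that is needed at inference time.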
## Intended Use

This model is particularly effective for:

- General domain text processing
- Tasks requiring understanding of domain-specific relationships
- Semantic similarity in specialized domains
- Classification tasks with domain knowledge requirements

## Limitations

- Performance may vary on domains different from the training ontology
- Ontological knowledge is limited to concepts present in the source OWL file
- May have higher computational requirements than vanilla text models

## Citation

If you use this model, please cite the on2vec framework:

```bibtex
@software{on2vec,
  title={on2vec: Ontology Embeddings with Graph Neural Networks},
  author={David Steinberg},
  url={https://github.com/david4096/on2vec},
  year={2024}
}
```

---

Created with [on2vec](https://github.com/david4096/on2vec) 🧬→🤖