avemio-digital committed · commit 3cc12c9 · verified · 1 parent: 72207f1

Update README.md

Files changed (1): README.md (+161, -10)

README.md CHANGED
@@ -16,12 +16,92 @@ datasets:
 - avemio-digital/GRAG-Embedding-Triples-Hessian-AI
 ---

- # Model Trained Using AutoTrain

- - Problem type: Sentence Transformers

- ## Validation Metrics
- No validation metrics available
 
 ## Usage
 
@@ -37,18 +117,89 @@ Then you can load this model and run inference.
 ```python
 from sentence_transformers import SentenceTransformer
 
- # Download from the Hugging Face Hub
- model = SentenceTransformer("sentence_transformers_model_id")
 # Run inference
 sentences = [
- 'search_query: autotrain',
- 'search_query: auto train',
- 'search_query: i love autotrain',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
 
 # Get the similarity scores for the embeddings
 similarities = model.similarity(embeddings, embeddings)
 print(similarities.shape)
- ```
 - avemio-digital/GRAG-Embedding-Triples-Hessian-AI
 ---
 
+ # GRAG-BGE-M3-TRIPLES-HESSIAN-AI
+
+ This is a [sentence-transformers](https://www.SBERT.net) model trained on the [GRAG-Embedding-Triples-Hessian-AI dataset](https://huggingface.co/datasets/avemio/GRAG-Embedding-Triples-Hessian-AI) of roughly 300k triple samples. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+ It was merged with the base model [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) to maintain performance on other languages.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
+ - **Maximum Sequence Length:** 8192 tokens
+ - **Output Dimensionality:** 1024 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
+   (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+   (2): Normalize()
+ )
+ ```
+
+ ## Evaluation MTEB-Tasks
+
+ ### Classification
+ - AmazonCounterfactualClassification
+ - AmazonReviewsClassification
+ - MassiveIntentClassification
+ - MassiveScenarioClassification
+ - MTOPDomainClassification
+ - MTOPIntentClassification
+
+ ### Pair Classification
+ - FalseFriendsGermanEnglish
+ - PawsXPairClassification
+
+ ### Retrieval
+ - GermanQuAD-Retrieval
+ - GermanDPR
+
+ ### STS (Semantic Textual Similarity)
+ - GermanSTSBenchmark
+
+ #### Comparison of the base model ([BGE-M3](https://huggingface.co/BAAI/bge-m3)), the fine-tuned model ([GRAG-BGE](https://huggingface.co/avemio/GRAG-BGE-M3-TRIPLES-HESSIAN-AI)), and the fine-tuned model merged back with the base model ([Merged-BGE](https://huggingface.co/avemio/GRAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI/))
+
+ | TASK | [BGE-M3](https://huggingface.co/BAAI/bge-m3) | GRAG-BGE | [Merged-BGE](https://huggingface.co/avemio/GRAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI/) | GRAG vs. BGE | Merged vs. BGE |
+ |-------------------------------------|-------|----------|------------|--------------|----------------|
+ | AmazonCounterfactualClassification | 0.6908 | 0.5449 | **0.7111** | -14.59% | 2.03% |
+ | AmazonReviewsClassification | **0.4634** | 0.2745 | 0.4571 | -18.89% | -0.63% |
+ | FalseFriendsGermanEnglish | **0.5343** | 0.4777 | 0.5338 | -5.67% | -0.05% |
+ | GermanQuAD-Retrieval | **0.9444** | 0.8714 | 0.9311 | -7.30% | -1.33% |
+ | GermanSTSBenchmark | 0.8079 | 0.7921 | **0.8218** | -1.58% | 1.39% |
+ | MassiveIntentClassification | **0.6575** | 0.4884 | 0.6522 | -16.90% | -0.52% |
+ | MassiveScenarioClassification | 0.7355 | 0.5837 | **0.7381** | -15.19% | 0.25% |
+ | GermanDPR | **0.8265** | 0.7210 | 0.8159 | -10.54% | -1.06% |
+ | MTOPDomainClassification | 0.9121 | 0.7450 | **0.9139** | -16.71% | 0.17% |
+ | MTOPIntentClassification | **0.6808** | 0.4516 | 0.6684 | -22.92% | -1.25% |
+ | PawsXPairClassification | 0.5678 | 0.5077 | **0.5710** | -6.01% | 0.33% |
+
+ #### Comparison of the base model ([BGE-M3](https://huggingface.co/BAAI/bge-m3)), the merged model ([Merged-BGE](https://huggingface.co/avemio/GRAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI/)), and the merged model further merged with [Snowflake/snowflake-arctic-embed-l-v2.0](https://huggingface.co/Snowflake/snowflake-arctic-embed-l-v2.0)
+
+ | TASK | [BGE-M3](https://huggingface.co/BAAI/bge-m3) | [Merged-BGE](https://huggingface.co/avemio/GRAG-BGE-M3-TRIPLES-MERGED-HESSIAN-AI/) | [Merged-Snowflake](https://huggingface.co/avemio/GRAG-BGE-M3-MERGED-x-SNOWFLAKE-ARCTIC-HESSIAN-AI/) | Merged-BGE vs. BGE | Merged-Snowflake vs. BGE | Merged-Snowflake vs. Merged-BGE |
+ |-------------------------------------|-------|------------|------------------|--------------------|--------------------------|---------------------------------|
+ | AmazonCounterfactualClassification | 0.6908 | 0.7111 | **0.7152** | 2.94% | 3.53% | 0.58% |
+ | AmazonReviewsClassification | **0.4634** | 0.4571 | 0.4577 | -1.36% | -1.23% | 0.13% |
+ | FalseFriendsGermanEnglish | 0.5343 | 0.5338 | **0.5378** | -0.09% | 0.66% | 0.75% |
+ | GermanQuAD-Retrieval | 0.9444 | 0.9311 | **0.9456** | -1.41% | 0.13% | 1.56% |
+ | GermanSTSBenchmark | 0.8079 | 0.8218 | **0.8558** | 1.72% | 5.93% | 4.14% |
+ | MassiveIntentClassification | 0.6575 | 0.6522 | **0.6826** | -0.81% | 3.82% | 4.66% |
+ | MassiveScenarioClassification | 0.7355 | 0.7381 | **0.7494** | 0.35% | 1.89% | 1.53% |
+ | GermanDPR | 0.8265 | 0.8159 | **0.8330** | -1.28% | 0.79% | 2.10% |
+ | MTOPDomainClassification | 0.9121 | 0.9139 | **0.9259** | 0.20% | 1.52% | 1.31% |
+ | MTOPIntentClassification | 0.6808 | 0.6684 | **0.7143** | -1.82% | 4.91% | 6.87% |
+ | PawsXPairClassification | 0.5678 | 0.5710 | **0.5803** | 0.56% | 2.18% | 1.63% |
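Note that the two comparison tables appear to use different delta conventions: the "vs." columns in the first table match absolute score differences in percentage points (e.g. 0.7111 − 0.6908 ≈ 2.03%), while those in the second table match relative changes against the reference score (e.g. (0.7111 − 0.6908) / 0.6908 ≈ 2.94%). A minimal sketch reproducing both, using the AmazonCounterfactualClassification scores from the tables:

```python
# Scores from the tables: BGE-M3 (base) and Merged-BGE on
# AmazonCounterfactualClassification.
base, merged = 0.6908, 0.7111

# First table's convention: absolute difference in percentage points.
pp_delta = (merged - base) * 100
print(f"{pp_delta:.2f}")   # 2.03

# Second table's convention: relative change versus the base score.
rel_delta = (merged - base) / base * 100
print(f"{rel_delta:.2f}")  # 2.94
```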
 
 ## Usage
 
 ```python
 from sentence_transformers import SentenceTransformer
 
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("avemio/GRAG-BGE-M3-TRIPLES-HESSIAN-AI")
 # Run inference
 sentences = [
+ 'The weather is lovely today.',
+ "It's so sunny outside!",
+ 'He drove to the stadium.',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
+ # [3, 1024]
 
 # Get the similarity scores for the embeddings
 similarities = model.similarity(embeddings, embeddings)
 print(similarities.shape)
+ # [3, 3]
+ ```
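Because the architecture above ends with a `Normalize()` module, the returned embeddings are unit-length, so the cosine similarity computed by `model.similarity` reduces to a plain dot product. A self-contained sketch of that equivalence, using random NumPy stand-in vectors instead of real model output:

```python
import numpy as np

# Stand-in for model.encode(...): three 1024-dim vectors, L2-normalized
# the way the model's final Normalize() module would do.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(3, 1024))
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

# For unit vectors, cosine similarity is just the dot product.
similarities = embeddings @ embeddings.T
print(similarities.shape)                        # (3, 3)
print(np.allclose(np.diag(similarities), 1.0))   # True: each vector vs. itself
```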
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - Sentence Transformers: 3.2.1
+ - Transformers: 4.44.2
+ - PyTorch: 2.4.1+cu121
+ - Accelerate: 0.34.2
+ - Datasets: 3.0.1
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->