brilan committed on
Commit 10a6908
1 Parent(s): 3ee6623

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
{
    "word_embedding_dimension": 768,
    "pooling_mode_cls_token": false,
    "pooling_mode_mean_tokens": true,
    "pooling_mode_max_tokens": false,
    "pooling_mode_mean_sqrt_len_tokens": false,
    "pooling_mode_weightedmean_tokens": false,
    "pooling_mode_lasttoken": false,
    "include_prompt": true
}
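Only `pooling_mode_mean_tokens` is enabled here, so sentence embeddings are the attention-mask-weighted mean of the token embeddings. A minimal illustrative sketch of that computation (not the library's exact implementation):

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # Average token embeddings, ignoring padding positions.
    mask = attention_mask.unsqueeze(-1).float()      # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)    # (batch, 768)
    return summed / mask.sum(dim=1).clamp(min=1e-9)
```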
README.md ADDED
---
base_model: sentence-transformers/all-mpnet-base-v2
library_name: sentence-transformers
metrics:
- cosine_accuracy
- dot_accuracy
- manhattan_accuracy
- euclidean_accuracy
- max_accuracy
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:6462
- loss:MultipleNegativesRankingLoss
widget:
- source_sentence: gain successful RDP authentication
  sentences:
  - Creates or Schedules a task.
  - Execute processes on other systems complete with full interactivity for console
    applications without having to manually install client software.
  - allows users to execute commands remotely on target systems using various methods
    including WMI, SMB, SSH, RDP, and PowerShell
- source_sentence: collect and stage the informaiton in AD
  sentences:
  - Displays the directory structure of a path or of the disk in a drive graphically.
  - Get user name and group information along with the respective security identifiers
    (SID) claims privileges logon identifier (logon ID) for the current user on the
    local system.
  - retrieve stored passwords from various software and operating systems
- source_sentence: Download files or binary for further usage
  sentences:
  - allows users to extract sensitive credential information from the Local Security
    Authority (LSA) on Windows systems.
  - Transfer data from or to a server using URLs.
  - Displays and modifies entries in the Address Resolution Protocol (ARP) cache.
- source_sentence: collect and stage the informaiton in AD
  sentences:
  - Adds displays or modifies global groups in domains.
  - Gets the local security groups.
  - Displays the directory structure of a path or of the disk in a drive graphically.
- source_sentence: Modify Registry of Current User Profile
  sentences:
  - Stops one or more running services.
  - Allows users to manage local and domain user accounts.
  - Saves a copy of specified subkeys, entries, and values of the registry in a specified
    file.
model-index:
- name: SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
  results:
  - task:
      type: triplet
      name: Triplet
    dataset:
      name: dev
      type: dev
    metrics:
    - type: cosine_accuracy
      value: 1.0
      name: Cosine Accuracy
    - type: dot_accuracy
      value: 0.0
      name: Dot Accuracy
    - type: manhattan_accuracy
      value: 1.0
      name: Manhattan Accuracy
    - type: euclidean_accuracy
      value: 1.0
      name: Euclidean Accuracy
    - type: max_accuracy
      value: 1.0
      name: Max Accuracy
  - task:
      type: triplet
      name: Triplet
    dataset:
      name: test
      type: test
    metrics:
    - type: cosine_accuracy
      value: 1.0
      name: Cosine Accuracy
    - type: dot_accuracy
      value: 0.0
      name: Dot Accuracy
    - type: manhattan_accuracy
      value: 1.0
      name: Manhattan Accuracy
    - type: euclidean_accuracy
      value: 1.0
      name: Euclidean Accuracy
    - type: max_accuracy
      value: 1.0
      name: Max Accuracy
---

# SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) <!-- at revision f1b1b820e405bb8644f5e8d9a3b98f9c9e0a3c58 -->
- **Maximum Sequence Length:** 384 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
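Because the final `Normalize()` module L2-normalizes every embedding, dot product and cosine similarity produce identical rankings on this model's outputs. A quick sanity check (a minimal sketch; the sentence is just an example):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("brilan/procedure-tool-matching_3_epochs")
emb = model.encode(["Modify Registry of Current User Profile"])
print(np.linalg.norm(emb, axis=1))  # ~1.0, thanks to the Normalize() module
```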

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("brilan/procedure-tool-matching_3_epochs")
# Run inference
sentences = [
    'Modify Registry of Current User Profile',
    'Saves a copy of specified subkeys, entries, and values of the registry in a specified file.',
    'Stops one or more running services.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->
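The collapsed section above is left empty by the card template. As a minimal sketch, raw `transformers` usage would follow the mean-pooling-plus-normalize recipe implied by the architecture listed earlier (the input sentence is illustrative):

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("brilan/procedure-tool-matching_3_epochs")
model = AutoModel.from_pretrained("brilan/procedure-tool-matching_3_epochs")

encoded = tokenizer(["Modify Registry of Current User Profile"],
                    padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state

# Mean pooling over non-padding tokens, then L2 normalization,
# mirroring the Pooling and Normalize modules of this model.
mask = encoded["attention_mask"].unsqueeze(-1).float()
embeddings = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
embeddings = F.normalize(embeddings, p=2, dim=1)
print(embeddings.shape)  # torch.Size([1, 768])
```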

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

## Evaluation

### Metrics

#### Triplet
* Dataset: `dev`
* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)

| Metric              | Value   |
|:--------------------|:--------|
| **cosine_accuracy** | **1.0** |
| dot_accuracy        | 0.0     |
| manhattan_accuracy  | 1.0     |
| euclidean_accuracy  | 1.0     |
| max_accuracy        | 1.0     |

#### Triplet
* Dataset: `test`
* Evaluated with [<code>TripletEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.TripletEvaluator)

| Metric              | Value   |
|:--------------------|:--------|
| **cosine_accuracy** | **1.0** |
| dot_accuracy        | 0.0     |
| manhattan_accuracy  | 1.0     |
| euclidean_accuracy  | 1.0     |
| max_accuracy        | 1.0     |
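To run this kind of evaluation on your own triplets, the library's `TripletEvaluator` can be used directly; a minimal sketch (the three strings below are illustrative placeholders, not the actual dev/test split):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import TripletEvaluator

model = SentenceTransformer("brilan/procedure-tool-matching_3_epochs")

evaluator = TripletEvaluator(
    anchors=["gain access to the server via SSH"],
    positives=["allow users to connect to RDP servers"],
    negatives=["Stops one or more running services."],
    name="dev",
)
print(evaluator(model))  # e.g. {'dev_cosine_accuracy': 1.0, ...}
```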

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 6,462 training samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details | <ul><li>min: 5 tokens</li><li>mean: 9.62 tokens</li><li>max: 17 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 18.14 tokens</li><li>max: 47 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 17.66 tokens</li><li>max: 57 tokens</li></ul> |
* Samples:
  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | <code>used compromised domain accounts to gain access to the target environment</code> | <code>allows users to execute commands remotely on target systems using various methods including WMI, SMB, SSH, RDP, and PowerShell</code> | <code>Displays information about user sessions on a Remote Desktop Session Host server.</code> |
  | <code>use default credentials to connect to IPC$ shares on remote machines</code> | <code>Execute commands on remote targets via Remote Desktop Protocol (RDP) without requiring a graphical user interface (GUI).</code> | <code>It provides functionality to view create modify and delete user accounts directly from the command prompt.</code> |
  | <code>gain access to the server via SSH</code> | <code>allow users to connect to RDP servers</code> | <code>allows administrators to manage and configure audit policies for the system and provides the ability to view, set, and modify the audit policies that control what events are logged by the Windows security auditing subsystem.</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
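In code, these parameters correspond to constructing the loss as follows; a minimal sketch:

```python
from sentence_transformers import SentenceTransformer, losses, util

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
# scale=20.0 and cosine similarity match the parameters listed above
loss = losses.MultipleNegativesRankingLoss(
    model, scale=20.0, similarity_fct=util.cos_sim
)
```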

### Evaluation Dataset

#### Unnamed Dataset

* Size: 2,770 evaluation samples
* Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
* Approximate statistics based on the first 1000 samples:
  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details | <ul><li>min: 5 tokens</li><li>mean: 9.48 tokens</li><li>max: 17 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 18.31 tokens</li><li>max: 47 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 18.21 tokens</li><li>max: 57 tokens</li></ul> |
* Samples:
  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | <code>Disable Windows Services related to security products</code> | <code>stop running service</code> | <code>Creates lists and deletes stored user names and passwords or credentials.</code> |
  | <code>Get user information</code> | <code>Gets the local security groups.</code> | <code>Copy files from source to dest between local and remote machine skipping identical files.</code> |
  | <code>used pass the hash for lateral movement</code> | <code>Execute processes on other systems complete with full interactivity for console applications without having to manually install client software.</code> | <code>Extracts passwords keys,pin,codes,tickets from the memory of lsass</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `warmup_ratio`: 0.1
- `fp16`: True
- `batch_sampler`: no_duplicates

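As a minimal sketch (assuming the standard `SentenceTransformerTrainingArguments` API from Sentence Transformers 3.x; `output_dir` is a placeholder), these non-default settings map onto training arguments like so:

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    warmup_ratio=0.1,
    fp16=True,
    eval_strategy="steps",
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # the `no_duplicates` sampler above
)
```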
#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `eval_use_gather_object`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>

### Training Logs
| Epoch  | Step | Training Loss | loss   | dev_cosine_accuracy | test_cosine_accuracy |
|:------:|:----:|:-------------:|:------:|:-------------------:|:--------------------:|
| 0      | 0    | -             | -      | 0.8596              | -                    |
| 0.2475 | 100  | 2.0428        | 1.3753 | 0.9989              | -                    |
| 0.4950 | 200  | 1.5299        | 1.2361 | 1.0                 | -                    |
| 0.7426 | 300  | 1.4871        | 1.1853 | 1.0                 | -                    |
| 0.9901 | 400  | 1.4612        | 1.1707 | 1.0                 | -                    |
| 1.2376 | 500  | 0.0287        | 1.2190 | 1.0                 | -                    |
| 1.1584 | 600  | 0.9192        | 1.1738 | 1.0                 | -                    |
| 1.4059 | 700  | 1.4131        | 1.1708 | 1.0                 | -                    |
| 1.6535 | 800  | 1.4254        | 1.1428 | 1.0                 | -                    |
| 1.9010 | 900  | 1.3977        | 1.1373 | 1.0                 | -                    |
| 2.1485 | 1000 | 0.5379        | 1.1419 | 1.0                 | -                    |
| 2.0693 | 1100 | 0.386         | 1.1306 | 1.0                 | -                    |
| 2.3168 | 1200 | 1.3708        | 1.1260 | 1.0                 | -                    |
| 2.3465 | 1212 | -             | -      | -                   | 1.0                  |

### Framework Versions
- Python: 3.11.5
- Sentence Transformers: 3.1.0
- Transformers: 4.44.2
- PyTorch: 2.4.1+cu121
- Accelerate: 1.0.0
- Datasets: 3.0.1
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
{
    "_name_or_path": "sentence-transformers/all-mpnet-base-v2",
    "architectures": [
        "MPNetModel"
    ],
    "attention_probs_dropout_prob": 0.1,
    "bos_token_id": 0,
    "eos_token_id": 2,
    "hidden_act": "gelu",
    "hidden_dropout_prob": 0.1,
    "hidden_size": 768,
    "initializer_range": 0.02,
    "intermediate_size": 3072,
    "layer_norm_eps": 1e-05,
    "max_position_embeddings": 514,
    "model_type": "mpnet",
    "num_attention_heads": 12,
    "num_hidden_layers": 12,
    "pad_token_id": 1,
    "relative_attention_num_buckets": 32,
    "torch_dtype": "float32",
    "transformers_version": "4.44.2",
    "vocab_size": 30527
}
config_sentence_transformers.json ADDED
{
    "__version__": {
        "sentence_transformers": "3.1.0",
        "transformers": "4.44.2",
        "pytorch": "2.4.1+cu121"
    },
    "prompts": {},
    "default_prompt_name": null,
    "similarity_fn_name": null
}
model.safetensors ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:371118df31c9b02e38a927b68d37f95379d46ce916947a41304b7b8f7d757a9b
size 437967672
modules.json ADDED
[
    {
        "idx": 0,
        "name": "0",
        "path": "",
        "type": "sentence_transformers.models.Transformer"
    },
    {
        "idx": 1,
        "name": "1",
        "path": "1_Pooling",
        "type": "sentence_transformers.models.Pooling"
    },
    {
        "idx": 2,
        "name": "2",
        "path": "2_Normalize",
        "type": "sentence_transformers.models.Normalize"
    }
]
sentence_bert_config.json ADDED
{
    "max_seq_length": 384,
    "do_lower_case": false
}
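These values surface directly on the loaded model; a small illustrative check:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("brilan/procedure-tool-matching_3_epochs")
print(model.max_seq_length)  # 384, from sentence_bert_config.json
```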
special_tokens_map.json ADDED
{
    "bos_token": {
        "content": "<s>",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
        "single_word": false
    },
    "cls_token": {
        "content": "<s>",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
        "single_word": false
    },
    "eos_token": {
        "content": "</s>",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
        "single_word": false
    },
    "mask_token": {
        "content": "<mask>",
        "lstrip": true,
        "normalized": false,
        "rstrip": false,
        "single_word": false
    },
    "pad_token": {
        "content": "<pad>",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
        "single_word": false
    },
    "sep_token": {
        "content": "</s>",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
        "single_word": false
    },
    "unk_token": {
        "content": "[UNK]",
        "lstrip": false,
        "normalized": false,
        "rstrip": false,
        "single_word": false
    }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
{
    "added_tokens_decoder": {
        "0": {
            "content": "<s>",
            "lstrip": false,
            "normalized": false,
            "rstrip": false,
            "single_word": false,
            "special": true
        },
        "1": {
            "content": "<pad>",
            "lstrip": false,
            "normalized": false,
            "rstrip": false,
            "single_word": false,
            "special": true
        },
        "2": {
            "content": "</s>",
            "lstrip": false,
            "normalized": false,
            "rstrip": false,
            "single_word": false,
            "special": true
        },
        "3": {
            "content": "<unk>",
            "lstrip": false,
            "normalized": true,
            "rstrip": false,
            "single_word": false,
            "special": true
        },
        "104": {
            "content": "[UNK]",
            "lstrip": false,
            "normalized": false,
            "rstrip": false,
            "single_word": false,
            "special": true
        },
        "30526": {
            "content": "<mask>",
            "lstrip": true,
            "normalized": false,
            "rstrip": false,
            "single_word": false,
            "special": true
        }
    },
    "bos_token": "<s>",
    "clean_up_tokenization_spaces": true,
    "cls_token": "<s>",
    "do_lower_case": true,
    "eos_token": "</s>",
    "mask_token": "<mask>",
    "max_length": 128,
    "model_max_length": 384,
    "pad_to_multiple_of": null,
    "pad_token": "<pad>",
    "pad_token_type_id": 0,
    "padding_side": "right",
    "sep_token": "</s>",
    "stride": 0,
    "strip_accents": null,
    "tokenize_chinese_chars": true,
    "tokenizer_class": "MPNetTokenizer",
    "truncation_side": "right",
    "truncation_strategy": "longest_first",
    "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff