avinot committed on
Commit
a23af80
1 Parent(s): 2d303d6

Add new SentenceTransformer model.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
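
The pooling config above enables only `pooling_mode_mean_tokens`, so a sentence embedding is presumably the average of the token embeddings with padding positions left out. The following is a minimal pure-Python sketch of that idea. The function name `mean_pool` and the toy 2-dimensional vectors are illustrative assumptions, not code from the sentence-transformers library, which works on 768-dimensional tensors.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average token vectors, ignoring positions whose mask value is 0 (padding)."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask value is required per token")
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("at least one token must be unmasked")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Two real tokens and one padding token, with toy 2-dimensional embeddings.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```

The padding vector `[9.0, 9.0]` does not affect the result, which is the point of masking before averaging.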
README.md ADDED
@@ -0,0 +1,493 @@
1
+ ---
2
+ base_model: distilbert/distilroberta-base
3
+ datasets: []
4
+ language: []
5
+ library_name: sentence-transformers
6
+ pipeline_tag: sentence-similarity
7
+ tags:
8
+ - sentence-transformers
9
+ - sentence-similarity
10
+ - feature-extraction
11
+ - generated_from_trainer
12
+ - dataset_size:3394
13
+ - loss:MultipleNegativesRankingLoss
14
+ widget:
15
+ - source_sentence: 'As senna, Watch your positioning at all times so she cannot land
16
+ crucial spells on you.
17
+
18
+
19
+ Keep the minion wave closer to your side of the map to force her to over-extend
20
+ for farm.
21
+
22
+
23
+ Senna is quite weak early, you might be able to abuse that to gain an early health
24
+ advantage. But if you can’t, don’t worry.
25
+
26
+
27
+ Focus on farming early on if you can’t get kills. Try to poke her down often:
28
+ whenever she uses her Q, try and use yours.
29
+
30
+ '
31
+ sentences:
32
+ - As riven, Riven will begin to fall as the game continues, so she will have to
33
+ accumulate an early lead on. She should rely on peaks to win the game. She will
34
+ have a lot of articles during this phase of the game, which will let her dry a
35
+ lot of damage if she manages to catch an enemy. Level 16 is a massive power peak,
36
+ which means that she can completely decimate the enemies that are joined together.
37
+ She will have to find flanks on the enemy frequently.
38
+ - As senna, Watch your position at all times so it can't put crucial spells on you.
39
+ Keep the minion wave closer to your side of the map to force it to overtake for
40
+ the farm. Senna is weak enough early, you might be able to abuse it to gain an
41
+ early health benefit. But if you can, don't worry. Focus on agriculture early
42
+ if you can't get killed.
43
+ - Against zac, Zac really has long cooling times on his engagement abilities. As
44
+ a result, he can really use them as a tool of engagement or escape only once in
45
+ a fight. If you know that he is without these abilities- try to fight him. Placing
46
+ the vision around his entrances into the jungle and into the river will reduce
47
+ his ability to get successful ganks. Note that Zac can gank from afar with his
48
+ E however. The easiest way for Zac to kill the Crab Scuttle is by using his E.
49
+ If you see him killing the Crab Scuttle, it will usually mean that his E is on
50
+ cooldown and he will miss an escape tool.
51
+ - source_sentence: 'As teemo, Use your t to prevent the enemy ADC from doing anything
52
+ in this game. Always focus on the champion that is close to your carry for maximal
53
+ effectiveness.
54
+
55
+
56
+ Try to lure enemies into fighting near your Ultimate t infested areas. It will
57
+ allow you to auto-win a major fight due to the amount of damage your Ultimate
58
+ t traps do.
59
+
60
+
61
+ Ensure that your Ultimate t traps are in the choke points during neutral objective
62
+ fights. This will make it really hard for the enemy team to do anything in the
63
+ fight.'
64
+ sentences:
65
+ - As taric, Taric will have his Q maxed at level 9. This means that he will heal
66
+ a lot during a fight as long as he has enough mana to constantly use the ability.
67
+ At level 11, Taric will have two points in his Ultimate R. This means that he
68
+ will be able to use the ability frequently, which will make him powerful enough
69
+ in team combat. At level thirteen, Taric will have two of his maximum capabilities
70
+ out. This means that he will excel in team combat and will be successfully peeled
71
+ for his port.
72
+ - As annie, Look for aggressive parts at level 2 that you are stronger than it at
73
+ level 2. Look for aggressive parts when its dizziness is down. Fight it after
74
+ it uses its dizziness that you will win a battle with it. After you have obtained
75
+ your first element, just keep pushing it in and get turn plates. Look for roaming
76
+ opportunities when possible.
77
+ - As teemo, Use your t to prevent the ADC enemy from doing anything in this game.
78
+ Always focus on the champion who is close to your door for maximum efficiency.
79
+ Try to attract enemies into the fight near your infested areas Ultimate t. It
80
+ will allow you to self-win a major fight due to the amount of damage your Ultimate
81
+ t traps do. Make sure your Ultimate t traps are in the choking points during neutral
82
+ objective combats. This will make it really difficult for the enemy team to do
83
+ anything in the fight.
84
+ - source_sentence: 'As senna, Just like in the mid-game, you should stick with your
85
+ Support throughout the later parts of the game. Do not go around the map alone
86
+ as you will die easily.
87
+
88
+
89
+ Do not play super aggressive in team fights. Just kite and auto-attack the nearest
90
+ enemy champion. If you walk too far forward, the enemy will focus you and take
91
+ you down.
92
+
93
+
94
+ Continue to kite in team fights and consistently adapt your positioning. Avoid
95
+ standing still in fights as you’ll be an easy target.'
96
+ sentences:
97
+ - As senna, Just like in the middle of the game, you have to stay with your support
98
+ throughout the later parts of the game. Do not go around the map alone because
99
+ you will die easily. Do not play super aggressive in team fights. Just kite and
100
+ automatic attack the nearest enemy champion. If you walk too far forward, the
101
+ enemy will concentrate and down you. Continue kiteing in team fights and systematically
102
+ adapt your positioning.
103
+ - As udyr, You are really good at securing the goals in early play. Look to secure
104
+ the goals whenever they are. Find a healthy balance between the glove of your
105
+ allies, the agriculture of your jungle and safety goals. Make sure you don't fall
106
+ behind in XP trying to gank constantly. Look several times at the gank tracks
107
+ that are wider or Flashless. Your champion is very good at several times the glove
108
+ tracks over and over again.
109
+ - As bard, Once you reach this stage, you should focus more on roaming and the impact
110
+ of the map with your Ultimate R & E. When you keep in depth, make sure to keep
111
+ your E ready to escape any sticky situation. Before leaving your track, drop some
112
+ W shrines here and there. This will help your ADC stay healthy, especially when
113
+ they are against a poke track. Always try to lure enemy travellers to push and
114
+ use the wall behind them to E and then stun them with your Q. The association
115
+ with your Jungler/other travelers will give maximum efficiency and enemies generally
116
+ do not plan such coins.
117
+ - source_sentence: 'Against trundle, When Trundle activates his n, back away immediately.
118
+ He will become much tankier and your tanks will be weak. Re-engage when his Subjugate
119
+ Subjugate n has cooled off.
120
+
121
+
122
+ He has great pick potential in the mid game with his e. Avoid standing too far
123
+ forward as he may push you towards his team and then all-in you.
124
+
125
+
126
+ Poke is your best friend against Trundle. Getting him really low makes it near
127
+ impossible for him to all-in or fight your team.'
128
+ sentences:
129
+ - As kalista, If she has a support that can complete her escarment power, she will
130
+ easily take the level path one himself. Kalista's passive does so so that she
131
+ can easily avoid the CC abilities especially if it turns out that it is a skill
132
+ shot. It also allows her to jump in and out of nonwarded brushes to drop the vision
133
+ on the enemies. Once she gets her Ultimate x, her down pursuit potential becomes
134
+ unmatched.
135
+ - Against trundle, When Trundle activates his n, backs off immediately. He will
136
+ become much more tankier and your tanks will be weak. Re-engagement when his Subjugate
137
+ Subjugate n has cooled. He has great potential of choice in the middle of the
138
+ match with his e. Avoid standing too far ahead that he can push you towards his
139
+ team and then all of you. Poke is your best friend against Trundle.
140
+ - As viego, Viego is decent enough during the team fights because of his W and Q.
141
+ His life flight will help him during the full fights, and his Ultimate R will
142
+ act as a decent finish movement. Viego will win a massive power peak once he has
143
+ two points in his Ultimate R. He can run squishy targets, execute them and use
144
+ his Passive for much longer than any other champion in his situation. He will
145
+ have his basic elements through this phase of the game. He should be able to flatten
146
+ a lot of damage and catch the out-of-guard enemies with his W. His E should help
147
+ him shoot several ganks as well.
148
+ - source_sentence: 'As mordekaiser, Another point in his Ultimate R will allow him
149
+ to 1 v 1 target quite frequently. He should be able to secure picks even now.
150
+
151
+
152
+ He is pretty decent during late-game fights, as his AoE abilities will hurt a
153
+ lot. His survivability is quite appreciable as well.
154
+
155
+
156
+ His tankiness will be massive during this phase of the game. He should be focused
157
+ on absorbing a lot of damage for the enemy team while simultaneously picking off
158
+ enemies when possible.'
159
+ sentences:
160
+ - As karma, Unlike most champions, Karma unlocks her Ultimate has at level 1. This
161
+ means that at level 6, she doesn't have as much pressure compared to many other
162
+ champions especially if they re-level 6 power peak is very strong. When Karmas
163
+ d is down, she is vulnerable to attack and will need to play safely while he is
164
+ on the cooldown. His Q can easily be avoided if the enemy builds a huge wave or
165
+ positions properly.
166
+ - As mordekaiser, Another point in his Ultimate R will allow him to 1 v 1 target
167
+ quite frequently. He should be able to secure the choices even now. He is quite
168
+ decent during the fights at the end of the game, because his AoE abilities will
169
+ do a lot of harm. His survival is just as appreciable. His tankness will be massive
170
+ during this phase of the game. He should be focused on absorbing a lot of damage
171
+ for the enemy team while simultaneously dropping enemies when possible.
172
+ - As renekton, If you can get a murder or two, you can snowball your lead quite
173
+ quickly. Every time your Ultimate t is up, you can look for aggressive games to
174
+ try to kill the enemy. Your Ultimate t is an excellent trading tool that makes
175
+ you much stronger. Keep the minion wave even or slightly closer to your side of
176
+ the map early. This will allow you to run the enemy down while protecting yourself
177
+ from the ganks. If you keep pushing when you are not forward, you will be unable
178
+ to run the enemy down and you will be an easy target for the enemy Jungler.
179
+ ---
180
+
181
+ # SentenceTransformer based on distilbert/distilroberta-base
182
+
183
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [distilbert/distilroberta-base](https://huggingface.co/distilbert/distilroberta-base). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
184
+
185
+ ## Model Details
186
+
187
+ ### Model Description
188
+ - **Model Type:** Sentence Transformer
189
+ - **Base model:** [distilbert/distilroberta-base](https://huggingface.co/distilbert/distilroberta-base) <!-- at revision fb53ab8802853c8e4fbdbcd0529f21fc6f459b2b -->
190
+ - **Maximum Sequence Length:** 512 tokens
191
+ - **Output Dimensionality:** 768 dimensions
192
+ - **Similarity Function:** Cosine Similarity
193
+ <!-- - **Training Dataset:** Unknown -->
194
+ <!-- - **Language:** Unknown -->
195
+ <!-- - **License:** Unknown -->
196
+
197
+ ### Model Sources
198
+
199
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
200
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
201
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
202
+
203
+ ### Full Model Architecture
204
+
205
+ ```
206
+ SentenceTransformer(
207
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel
208
+ (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
209
+ )
210
+ ```
211
+
212
+ ## Usage
213
+
214
+ ### Direct Usage (Sentence Transformers)
215
+
216
+ First install the Sentence Transformers library:
217
+
218
+ ```bash
219
+ pip install -U sentence-transformers
220
+ ```
221
+
222
+ Then you can load this model and run inference.
223
+ ```python
224
+ from sentence_transformers import SentenceTransformer
225
+
226
+ # Download from the 🤗 Hub
227
+ model = SentenceTransformer("avinot/distilroberta-base-LoL-Champions")
228
+ # Run inference
229
+ sentences = [
230
+ 'As mordekaiser, Another point in his Ultimate R will allow him to 1 v 1 target quite frequently. He should be able to secure picks even now.\n\nHe is pretty decent during late-game fights, as his AoE abilities will hurt a lot. His survivability is quite appreciable as well.\n\nHis tankiness will be massive during this phase of the game. He should be focused on absorbing a lot of damage for the enemy team while simultaneously picking off enemies when possible.',
231
+ 'As mordekaiser, Another point in his Ultimate R will allow him to 1 v 1 target quite frequently. He should be able to secure the choices even now. He is quite decent during the fights at the end of the game, because his AoE abilities will do a lot of harm. His survival is just as appreciable. His tankness will be massive during this phase of the game. He should be focused on absorbing a lot of damage for the enemy team while simultaneously dropping enemies when possible.',
232
+ 'As renekton, If you can get a murder or two, you can snowball your lead quite quickly. Every time your Ultimate t is up, you can look for aggressive games to try to kill the enemy. Your Ultimate t is an excellent trading tool that makes you much stronger. Keep the minion wave even or slightly closer to your side of the map early. This will allow you to run the enemy down while protecting yourself from the ganks. If you keep pushing when you are not forward, you will be unable to run the enemy down and you will be an easy target for the enemy Jungler.',
233
+ ]
234
+ embeddings = model.encode(sentences)
235
+ print(embeddings.shape)
236
+ # (3, 768)
237
+
238
+ # Get the similarity scores for the embeddings
239
+ similarities = model.similarity(embeddings, embeddings)
240
+ print(similarities.shape)
241
+ # torch.Size([3, 3])
242
+ ```
243
+
244
+ <!--
245
+ ### Direct Usage (Transformers)
246
+
247
+ <details><summary>Click to see the direct usage in Transformers</summary>
248
+
249
+ </details>
250
+ -->
251
+
252
+ <!--
253
+ ### Downstream Usage (Sentence Transformers)
254
+
255
+ You can finetune this model on your own dataset.
256
+
257
+ <details><summary>Click to expand</summary>
258
+
259
+ </details>
260
+ -->
261
+
262
+ <!--
263
+ ### Out-of-Scope Use
264
+
265
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
266
+ -->
267
+
268
+ <!--
269
+ ## Bias, Risks and Limitations
270
+
271
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
272
+ -->
273
+
274
+ <!--
275
+ ### Recommendations
276
+
277
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
278
+ -->
279
+
280
+ ## Training Details
281
+
282
+ ### Training Dataset
283
+
284
+ #### Unnamed Dataset
285
+
286
+
287
+ * Size: 3,394 training samples
288
+ * Columns: <code>sentence_0</code> and <code>sentence_1</code>
289
+ * Approximate statistics based on the first 1000 samples:
290
+ | | sentence_0 | sentence_1 |
291
+ |:--------|:------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
292
+ | type | string | string |
293
+ | details | <ul><li>min: 6 tokens</li><li>mean: 125.49 tokens</li><li>max: 272 tokens</li></ul> | <ul><li>min: 27 tokens</li><li>mean: 109.21 tokens</li><li>max: 231 tokens</li></ul> |
294
+ * Samples:
295
+ | sentence_0 | sentence_1 |
296
+ |||
297
+ | <code>As sivir, Just like in the mid-game, you should stick with your Support throughout the later parts of the game. Do not go around the map alone as you will die easily.<br><br>Do not play super aggressive in team fights. Just kite and auto-attack the nearest enemy champion. If you walk too far forward, the enemy will focus you and take you down.<br><br>Continue to kite in team fights and consistently adapt your positioning. Avoid standing still in fights as you’ll be an easy target.</code> | <code>As sivir, Just like in the middle of the game, you have to stay with your support throughout the later parts of the game. Do not go around the map alone because you will die easily. Do not play super aggressive in team fights. Just kite and automatic attack the nearest enemy champion. If you walk too far forward, the enemy will concentrate and down you. Continue kiteing in team fights and systematically adapt your positioning.</code> |
298
+ | <code>As nunu, After going in with your Ultimate R, be prepared to fall back and peel for your allies in late-game team fights.<br><br>Play around your Ultimate R in the later parts of the game. Avoid fighting unless your Ultimate R is up. Fighting without it will make the late-game team fights much harder. Delay fights and be prepared to disengage if it’s still on cooldown.<br><br>To make getting on the enemy backline easier, group with your team but stay off to the side. If you flank from an unwarded bush, the enemy will find it harder to react to your all-in. Avoid splitting or being away from your team in the late game as the enemy will force a fight while you’re gone.</code> | <code>As nunu, After entering with your Ultimate R, be ready to fold and peel for your allies in the team fights at the end of the game. Play around your Ultimate R in the later parts of the game. Avoid fighting unless your Ultimate R is standing. Fighting without it will make the team fight a lot harder at the end of the game. Delaying the fights and being ready to disengage if it is still about to cool off. To make the enemy's attack easier, group with your team but stay away from the side. If you flank a bush not awarded, the enemy will find it harder to react to your all-in. Avoid separating or being away from your team in the end of the game as the enemy will force a fight while you are gone.</code> |
299
+ | <code>As darius, Darius is one of the strongest early game champions in the game. You can use this advantage to gain an early lead.<br><br>Extended trades work in the favour of Darius thanks to his Passive. He will win almost every auto-attack battle. This makes him a great duelist in low ELO as many players like to constantly fight.<br><br>Is really good in team fights thanks to his Ultimate e which can be reset if he gets the killing blow with it.</code> | <code>As darius, Darius is one of the first strongest game champions in the game. You can use this advantage to gain an early lead. Extended trades work in favor of Darius thanks to his passive. He will win almost all automatic attack battles. This makes him a great duelist in the bottom ELO as many players like to fight constantly. Is really good in team fights thanks to his Ultimate e that can be reset if he gets the shot of death with her.</code> |
300
+ * Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
301
+ ```json
302
+ {
303
+ "scale": 20.0,
304
+ "similarity_fct": "cos_sim"
305
+ }
306
+ ```
307
+
308
+ ### Training Hyperparameters
309
+ #### Non-Default Hyperparameters
310
+
311
+ - `per_device_train_batch_size`: 16
312
+ - `per_device_eval_batch_size`: 16
313
+ - `num_train_epochs`: 10
314
+ - `multi_dataset_batch_sampler`: round_robin
315
+
316
+ #### All Hyperparameters
317
+ <details><summary>Click to expand</summary>
318
+
319
+ - `overwrite_output_dir`: False
320
+ - `do_predict`: False
321
+ - `eval_strategy`: no
322
+ - `prediction_loss_only`: True
323
+ - `per_device_train_batch_size`: 16
324
+ - `per_device_eval_batch_size`: 16
325
+ - `per_gpu_train_batch_size`: None
326
+ - `per_gpu_eval_batch_size`: None
327
+ - `gradient_accumulation_steps`: 1
328
+ - `eval_accumulation_steps`: None
329
+ - `learning_rate`: 5e-05
330
+ - `weight_decay`: 0.0
331
+ - `adam_beta1`: 0.9
332
+ - `adam_beta2`: 0.999
333
+ - `adam_epsilon`: 1e-08
334
+ - `max_grad_norm`: 1
335
+ - `num_train_epochs`: 10
336
+ - `max_steps`: -1
337
+ - `lr_scheduler_type`: linear
338
+ - `lr_scheduler_kwargs`: {}
339
+ - `warmup_ratio`: 0.0
340
+ - `warmup_steps`: 0
341
+ - `log_level`: passive
342
+ - `log_level_replica`: warning
343
+ - `log_on_each_node`: True
344
+ - `logging_nan_inf_filter`: True
345
+ - `save_safetensors`: True
346
+ - `save_on_each_node`: False
347
+ - `save_only_model`: False
348
+ - `restore_callback_states_from_checkpoint`: False
349
+ - `no_cuda`: False
350
+ - `use_cpu`: False
351
+ - `use_mps_device`: False
352
+ - `seed`: 42
353
+ - `data_seed`: None
354
+ - `jit_mode_eval`: False
355
+ - `use_ipex`: False
356
+ - `bf16`: False
357
+ - `fp16`: False
358
+ - `fp16_opt_level`: O1
359
+ - `half_precision_backend`: auto
360
+ - `bf16_full_eval`: False
361
+ - `fp16_full_eval`: False
362
+ - `tf32`: None
363
+ - `local_rank`: 0
364
+ - `ddp_backend`: None
365
+ - `tpu_num_cores`: None
366
+ - `tpu_metrics_debug`: False
367
+ - `debug`: []
368
+ - `dataloader_drop_last`: False
369
+ - `dataloader_num_workers`: 0
370
+ - `dataloader_prefetch_factor`: None
371
+ - `past_index`: -1
372
+ - `disable_tqdm`: False
373
+ - `remove_unused_columns`: True
374
+ - `label_names`: None
375
+ - `load_best_model_at_end`: False
376
+ - `ignore_data_skip`: False
377
+ - `fsdp`: []
378
+ - `fsdp_min_num_params`: 0
379
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
380
+ - `fsdp_transformer_layer_cls_to_wrap`: None
381
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
382
+ - `deepspeed`: None
383
+ - `label_smoothing_factor`: 0.0
384
+ - `optim`: adamw_torch
385
+ - `optim_args`: None
386
+ - `adafactor`: False
387
+ - `group_by_length`: False
388
+ - `length_column_name`: length
389
+ - `ddp_find_unused_parameters`: None
390
+ - `ddp_bucket_cap_mb`: None
391
+ - `ddp_broadcast_buffers`: False
392
+ - `dataloader_pin_memory`: True
393
+ - `dataloader_persistent_workers`: False
394
+ - `skip_memory_metrics`: True
395
+ - `use_legacy_prediction_loop`: False
396
+ - `push_to_hub`: False
397
+ - `resume_from_checkpoint`: None
398
+ - `hub_model_id`: None
399
+ - `hub_strategy`: every_save
400
+ - `hub_private_repo`: False
401
+ - `hub_always_push`: False
402
+ - `gradient_checkpointing`: False
403
+ - `gradient_checkpointing_kwargs`: None
404
+ - `include_inputs_for_metrics`: False
405
+ - `eval_do_concat_batches`: True
406
+ - `fp16_backend`: auto
407
+ - `push_to_hub_model_id`: None
408
+ - `push_to_hub_organization`: None
409
+ - `mp_parameters`:
410
+ - `auto_find_batch_size`: False
411
+ - `full_determinism`: False
412
+ - `torchdynamo`: None
413
+ - `ray_scope`: last
414
+ - `ddp_timeout`: 1800
415
+ - `torch_compile`: False
416
+ - `torch_compile_backend`: None
417
+ - `torch_compile_mode`: None
418
+ - `dispatch_batches`: None
419
+ - `split_batches`: None
420
+ - `include_tokens_per_second`: False
421
+ - `include_num_input_tokens_seen`: False
422
+ - `neftune_noise_alpha`: None
423
+ - `optim_target_modules`: None
424
+ - `batch_eval_metrics`: False
425
+ - `batch_sampler`: batch_sampler
426
+ - `multi_dataset_batch_sampler`: round_robin
427
+
428
+ </details>
429
+
+ ### Training Logs
+ | Epoch  | Step | Training Loss |
+ |:------:|:----:|:-------------:|
+ | 2.3474 | 500  | 1.9312        |
+ | 4.6948 | 1000 | 0.0145        |
+ | 7.0423 | 1500 | 0.0023        |
+ | 9.3897 | 2000 | 0.0003        |
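
The fractional epochs in the training log are consistent with the dataset size and batch size stated in this README. With 3,394 samples and 16 samples per batch, one epoch takes 213 optimizer steps, assuming a single device, no dropped last batch (`dataloader_drop_last` is False) and no gradient accumulation (`gradient_accumulation_steps` is 1). The short sketch below reproduces the Epoch column from the Step column under those assumptions.

```python
import math

DATASET_SIZE = 3394  # "Size: 3,394 training samples"
BATCH_SIZE = 16      # per_device_train_batch_size, assuming one device

# The last, partially filled batch still counts as a step.
steps_per_epoch = math.ceil(DATASET_SIZE / BATCH_SIZE)
print(steps_per_epoch)  # 213

logged_steps = (500, 1000, 1500, 2000)
epochs = [round(step / steps_per_epoch, 4) for step in logged_steps]
print(epochs)  # [2.3474, 4.6948, 7.0423, 9.3897]
```

Under the same assumptions, the full run of 10 epochs would amount to 2,130 steps, which fits the last logged step of 2,000.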
437
+
438
+
439
+ ### Framework Versions
440
+ - Python: 3.10.12
441
+ - Sentence Transformers: 3.0.1
442
+ - Transformers: 4.41.2
443
+ - PyTorch: 2.3.0+cu121
444
+ - Accelerate: 0.31.0
445
+ - Datasets: 2.20.0
446
+ - Tokenizers: 0.19.1
447
+
448
+ ## Citation
449
+
450
+ ### BibTeX
451
+
452
+ #### Sentence Transformers
453
+ ```bibtex
454
+ @inproceedings{reimers-2019-sentence-bert,
455
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
456
+ author = "Reimers, Nils and Gurevych, Iryna",
457
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
458
+ month = "11",
459
+ year = "2019",
460
+ publisher = "Association for Computational Linguistics",
461
+ url = "https://arxiv.org/abs/1908.10084",
462
+ }
463
+ ```
464
+
465
+ #### MultipleNegativesRankingLoss
466
+ ```bibtex
467
+ @misc{henderson2017efficient,
468
+ title={Efficient Natural Language Response Suggestion for Smart Reply},
469
+ author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
470
+ year={2017},
471
+ eprint={1705.00652},
472
+ archivePrefix={arXiv},
473
+ primaryClass={cs.CL}
474
+ }
475
+ ```
476
+
477
+ <!--
478
+ ## Glossary
479
+
480
+ *Clearly define terms in order to be accessible across audiences.*
481
+ -->
482
+
483
+ <!--
484
+ ## Model Card Authors
485
+
486
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
487
+ -->
488
+
489
+ <!--
490
+ ## Model Card Contact
491
+
492
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
493
+ -->
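
The README names `MultipleNegativesRankingLoss` with `scale` 20.0 and `cos_sim` as the similarity function. The idea behind this loss is that, within a batch, each `sentence_0` should score highest against its own `sentence_1`, while the other `sentence_1` entries in the batch serve as negatives. Below is a pure-Python sketch of that computation with toy 2-dimensional vectors. The helper names `cos_sim` and `mnr_loss` are illustrative assumptions. The library's own implementation operates on batched tensors and may differ in detail.

```python
import math
from typing import List, Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors: List[Sequence[float]], positives: List[Sequence[float]],
             scale: float = 20.0) -> float:
    """Mean cross-entropy where row i's correct class is column i."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        top = max(logits)  # subtract the maximum for a stable log-sum-exp
        log_sum_exp = top + math.log(sum(math.exp(v - top) for v in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]   # each anchor matches its own positive
swapped = [[0.0, 1.0], [1.0, 0.0]]   # each anchor matches the wrong positive
print(mnr_loss(anchors, aligned))    # close to 0
print(mnr_loss(anchors, swapped))    # close to 20
```

A larger batch supplies more in-batch negatives per anchor, which is one reason the batch size matters for this kind of loss.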
config.json ADDED
@@ -0,0 +1,27 @@
+ {
+   "_name_or_path": "distilroberta-base",
+   "architectures": [
+     "RobertaModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "classifier_dropout": null,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "roberta",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 6,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "torch_dtype": "float32",
+   "transformers_version": "4.41.2",
+   "type_vocab_size": 1,
+   "use_cache": true,
+   "vocab_size": 50265
+ }
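
`max_position_embeddings` is 514 here although the maximum sequence length is 512. The gap is explained by RoBERTa-style position ids, which start counting after the padding index (`pad_token_id` is 1), so 512 real tokens occupy positions 2 through 513. The sketch below illustrates that numbering in plain Python. The function name `roberta_position_ids` is an assumption for illustration, and the logic follows the Hugging Face RoBERTa behaviour as generally described rather than being copied from it.

```python
from typing import List

PAD_TOKEN_ID = 1  # "pad_token_id" in config.json


def roberta_position_ids(input_ids: List[int], pad_token_id: int = PAD_TOKEN_ID) -> List[int]:
    """Number non-padding tokens from pad_token_id + 1; padding keeps pad_token_id."""
    position_ids = []
    seen = 0
    for token_id in input_ids:
        if token_id == pad_token_id:
            position_ids.append(pad_token_id)
        else:
            seen += 1
            position_ids.append(seen + pad_token_id)
    return position_ids


# <s> tok tok </s> followed by two padding tokens
print(roberta_position_ids([0, 100, 200, 2, 1, 1]))  # [2, 3, 4, 5, 1, 1]

# A full 512-token input reaches position 513, hence 514 embedding rows.
print(max(roberta_position_ids([5] * 512)))  # 513
```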
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.0.1",
+     "transformers": "4.41.2",
+     "pytorch": "2.3.0+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": null
+ }
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2154966493ab7678f596368166ba9b2c24d5620994841fd78852235a4fea9894
+ size 328485128
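
The three lines above are a Git LFS pointer, not the weights themselves. Each line is a key followed by a space and a value. The sketch below parses the pointer and derives a rough parameter count from the stated size, assuming every parameter is stored as 4-byte float32 (as `torch_dtype` in `config.json` suggests) and ignoring the small safetensors header. The helper name `parse_lfs_pointer` is hypothetical.

```python
from typing import Dict


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:2154966493ab7678f596368166ba9b2c24d5620994841fd78852235a4fea9894
size 328485128
"""

info = parse_lfs_pointer(pointer)
size_bytes = int(info["size"])
print(round(size_bytes / 2**20, 1))    # 313.3 (MiB)
print(round(size_bytes / 4 / 1e6, 1))  # 82.1 (million float32 values, upper bound)
```

The estimate of roughly 82 million parameters is in line with the commonly cited size of the `distilroberta-base` base model.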
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
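
`modules.json` lists the pipeline stages in order: a Transformer loaded from the repository root, followed by a Pooling module loaded from `1_Pooling`. The following sketch reads such a file with the standard library only and prints the load order. It is an illustration of the file's structure, not the loader that sentence-transformers actually uses, and the placeholder text `<repository root>` is an assumption for display.

```python
import json

MODULES_JSON = """
[
  {"idx": 0, "name": "0", "path": "", "type": "sentence_transformers.models.Transformer"},
  {"idx": 1, "name": "1", "path": "1_Pooling", "type": "sentence_transformers.models.Pooling"}
]
"""

# Sort by "idx" so the stages run in the declared order.
modules = sorted(json.loads(MODULES_JSON), key=lambda entry: entry["idx"])

class_names = []
for entry in modules:
    # Split "package.module.ClassName" into the import path and the class name.
    module_path, _, class_name = entry["type"].rpartition(".")
    subfolder = entry["path"] or "<repository root>"
    class_names.append(class_name)
    print(f"{entry['idx']}: {class_name} from {module_path}, config in {subfolder}")
```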
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "bos_token": "<s>",
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "unk_token": "<unk>"
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "add_prefix_space": false,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "50264": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "errors": "replace",
+   "mask_token": "<mask>",
+   "model_max_length": 512,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "tokenizer_class": "RobertaTokenizer",
+   "trim_offsets": true,
+   "unk_token": "<unk>"
+ }
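
The tokenizer config maps `<s>` to id 0 and `</s>` to id 2, and caps inputs at `model_max_length` 512. A RoBERTa-style tokenizer wraps a single sequence as `<s> ... </s>`, so at most 510 content tokens fit once both special tokens are counted. The sketch below shows that wrapping and truncation on plain lists of ids. The function name `wrap_single` is hypothetical and the content ids are arbitrary placeholders, not real vocabulary lookups.

```python
from typing import List

BOS_ID = 0              # "<s>" in added_tokens_decoder
EOS_ID = 2              # "</s>" in added_tokens_decoder
MODEL_MAX_LENGTH = 512  # "model_max_length"


def wrap_single(content_ids: List[int], max_length: int = MODEL_MAX_LENGTH) -> List[int]:
    """Truncate the content so that <s> and </s> still fit, then add both."""
    budget = max_length - 2  # reserve room for the two special tokens
    return [BOS_ID] + content_ids[:budget] + [EOS_ID]


print(wrap_single([10, 11]))                    # [0, 10, 11, 2]
print(len(wrap_single(list(range(100, 700)))))  # 512
```

Inputs longer than the limit are cut here from the right, which is an assumed truncation strategy for this illustration.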
vocab.json ADDED
The diff for this file is too large to render. See raw diff