bhujith10 committed
Commit aaedd2d · verified · 1 Parent(s): 54c979d

Push model using huggingface_hub.

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
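Note: this pooling config enables mean pooling only (`pooling_mode_mean_tokens: true`); every other mode is off. A minimal sketch of what that computes, assuming a `token_embeddings` tensor of shape `[batch, seq_len, 768]` from the transformer body and its `attention_mask` (names are illustrative, not from this repo):

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    # Average the embeddings of non-padding tokens into one 768-d vector per input.
    mask = attention_mask.unsqueeze(-1).float()       # [batch, seq_len, 1]
    summed = (token_embeddings * mask).sum(dim=1)     # sum over real tokens only
    counts = mask.sum(dim=1).clamp(min=1e-9)          # number of real tokens per row
    return summed / counts                            # [batch, 768]
```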
README.md ADDED
@@ -0,0 +1,438 @@
+ ---
+ base_model: microsoft/deberta-v3-base
+ datasets:
+ - bhujith10/multi_class_classification_dataset
+ library_name: setfit
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ widget:
+ - text: 'Title: Detecting Adversarial Samples Using Density Ratio Estimates,
+
+   Abstract: Machine learning models, especially based on deep architectures are
+   used in
+
+   everyday applications ranging from self driving cars to medical diagnostics. It
+
+   has been shown that such models are dangerously susceptible to adversarial
+
+   samples, indistinguishable from real samples to human eye, adversarial samples
+
+   lead to incorrect classifications with high confidence. Impact of adversarial
+
+   samples is far-reaching and their efficient detection remains an open problem.
+
+   We propose to use direct density ratio estimation as an efficient model
+
+   agnostic measure to detect adversarial samples. Our proposed method works
+
+   equally well with single and multi-channel samples, and with different
+
+   adversarial sample generation methods. We also propose a method to use density
+
+   ratio estimates for generating adversarial samples with an added constraint of
+
+   preserving density ratio.'
+ - text: 'Title: Dynamics of exciton magnetic polarons in CdMnSe/CdMgSe quantum wells:
+   the effect of self-localization,
+
+   Abstract: We study the exciton magnetic polaron (EMP) formation in (Cd,Mn)Se/(Cd,Mg)Se
+
+   diluted-magnetic-semiconductor quantum wells using time-resolved
+
+   photoluminescence (PL). The magnetic field and temperature dependencies of this
+
+   dynamics allow us to separate the non-magnetic and magnetic contributions to
+
+   the exciton localization. We deduce the EMP energy of 14 meV, which is in
+
+   agreement with time-integrated measurements based on selective excitation and
+
+   the magnetic field dependence of the PL circular polarization degree. The
+
+   polaron formation time of 500 ps is significantly longer than the corresponding
+
+   values reported earlier. We propose that this behavior is related to strong
+
+   self-localization of the EMP, accompanied with a squeezing of the heavy-hole
+
+   envelope wavefunction. This conclusion is also supported by the decrease of the
+
+   exciton lifetime from 600 ps to 200 - 400 ps with increasing magnetic field and
+
+   temperature.'
+ - text: 'Title: Exponential Sums and Riesz energies,
+
+   Abstract: We bound an exponential sum that appears in the study of irregularities
+   of
+
+   distribution (the low-frequency Fourier energy of the sum of several Dirac
+
+   measures) by geometric quantities: a special case is that for all $\left\{ x_1,
+
+   \dots, x_N\right\} \subset \mathbb{T}^2$, $X \geq 1$ and a universal $c>0$ $$
+
+   \sum_{i,j=1}^{N}{ \frac{X^2}{1 + X^4 \|x_i -x_j\|^4}} \lesssim \sum_{k \in
+
+   \mathbb{Z}^2 \atop \|k\| \leq X}{ \left| \sum_{n=1}^{N}{ e^{2 \pi i
+
+   \left\langle k, x_n \right\rangle}}\right|^2} \lesssim \sum_{i,j=1}^{N}{ X^2
+
+   e^{-c X^2\|x_i -x_j\|^2}}.$$ Since this exponential sum is intimately tied to
+
+   rather subtle distribution properties of the points, we obtain nonlocal
+
+   structural statements for near-minimizers of the Riesz-type energy. In the
+
+   regime $X \gtrsim N^{1/2}$ both upper and lower bound match for
+
+   maximally-separated point sets satisfying $\|x_i -x_j\| \gtrsim N^{-1/2}$.'
+ - text: 'Title: Influence of Spin Orbit Coupling in the Iron-Based Superconductors,
+
+   Abstract: We report on the influence of spin-orbit coupling (SOC) in the Fe-based
+
+   superconductors (FeSCs) via application of circularly-polarized spin and
+
+   angle-resolved photoemission spectroscopy. We combine this technique in
+
+   representative members of both the Fe-pnictides and Fe-chalcogenides with ab
+
+   initio density functional theory and tight-binding calculations to establish an
+
+   ubiquitous modification of the electronic structure in these materials imbued
+
+   by SOC. The influence of SOC is found to be concentrated on the hole pockets
+
+   where the superconducting gap is generally found to be largest. This result
+
+   contests descriptions of superconductivity in these materials in terms of pure
+
+   spin-singlet eigenstates, raising questions regarding the possible pairing
+
+   mechanisms and role of SOC therein.'
+ - text: 'Title: Zero-point spin-fluctuations of single adatoms,
+
+   Abstract: Stabilizing the magnetic signal of single adatoms is a crucial step
+   towards
+
+   their successful usage in widespread technological applications such as
+
+   high-density magnetic data storage devices. The quantum mechanical nature of
+
+   these tiny objects, however, introduces intrinsic zero-point spin-fluctuations
+
+   that tend to destabilize the local magnetic moment of interest by dwindling the
+
+   magnetic anisotropy potential barrier even at absolute zero temperature. Here,
+
+   we elucidate the origins and quantify the effect of the fundamental ingredients
+
+   determining the magnitude of the fluctuations, namely the ($i$) local magnetic
+
+   moment, ($ii$) spin-orbit coupling and ($iii$) electron-hole Stoner
+
+   excitations. Based on a systematic first-principles study of 3d and 4d adatoms,
+
+   we demonstrate that the transverse contribution of the fluctuations is
+
+   comparable in size to the magnetic moment itself, leading to a remarkable
+
+   $\gtrsim$50$\%$ reduction of the magnetic anisotropy energy. Our analysis gives
+
+   rise to a comprehensible diagram relating the fluctuation magnitude to
+
+   characteristic features of adatoms, providing practical guidelines for
+
+   designing magnetically stable nanomagnets with minimal quantum fluctuations.'
+ inference: false
+ ---
+
+ # SetFit with microsoft/deberta-v3-base
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model trained on the [bhujith10/multi_class_classification_dataset](https://huggingface.co/datasets/bhujith10/multi_class_classification_dataset) dataset that can be used for Text Classification. This SetFit model uses [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) as the Sentence Transformer embedding model. A [SetFitHead](https://huggingface.co/docs/setfit/reference/main#setfit.SetFitHead) instance is used for classification.
+
+ The model has been trained using an efficient few-shot learning technique that involves:
+
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base)
+ - **Classification head:** a [SetFitHead](https://huggingface.co/docs/setfit/reference/main#setfit.SetFitHead) instance
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Classes:** 6 classes
+ - **Training Dataset:** [bhujith10/multi_class_classification_dataset](https://huggingface.co/datasets/bhujith10/multi_class_classification_dataset)
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
+
+ ```python
+ from setfit import SetFitModel
+
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("bhujith10/deberta-v3-base-setfit_finetuned")
+ # Run inference
+ preds = model("""Title: Influence of Spin Orbit Coupling in the Iron-Based Superconductors,
+ Abstract: We report on the influence of spin-orbit coupling (SOC) in the Fe-based
+ superconductors (FeSCs) via application of circularly-polarized spin and
+ angle-resolved photoemission spectroscopy. We combine this technique in
+ representative members of both the Fe-pnictides and Fe-chalcogenides with ab
+ initio density functional theory and tight-binding calculations to establish an
+ ubiquitous modification of the electronic structure in these materials imbued
+ by SOC. The influence of SOC is found to be concentrated on the hole pockets
+ where the superconducting gap is generally found to be largest. This result
+ contests descriptions of superconductivity in these materials in terms of pure
+ spin-singlet eigenstates, raising questions regarding the possible pairing
+ mechanisms and role of SOC therein.""")
+ ```
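Note: `SetFitModel` also accepts a list of texts and returns one prediction per input; the label set comes from `config_setfit.json` further down in this commit. A short sketch under that assumption (the texts and printed outputs are placeholders, not guaranteed results):

```python
# Batch inference: one of the six dataset labels per input text.
texts = [
    "Title: Exponential Sums and Riesz energies, Abstract: ...",
    "Title: Zero-point spin-fluctuations of single adatoms, Abstract: ...",
]
preds = model(texts)                 # e.g. ["Mathematics", "Physics"] (illustrative only)
probs = model.predict_proba(texts)   # per-class probabilities from the SetFitHead
```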
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:-------|:----|
+ | Word count   | 23  | 148.1  | 303 |
+
+ ### Training Hyperparameters
+ - batch_size: (4, 4)
+ - num_epochs: (1, 1)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - body_learning_rate: (2e-05, 1e-05)
+ - head_learning_rate: 0.01
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - l2_weight: 0.01
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: True
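Note: these values mirror the fields of `setfit.TrainingArguments`, where tuples give the (embedding fine-tuning, head training) phases. A minimal sketch of reproducing the run; the dataset split names and the differentiable-head parameters are assumptions, not taken from this commit:

```python
from datasets import load_dataset
from setfit import SetFitModel, Trainer, TrainingArguments

dataset = load_dataset("bhujith10/multi_class_classification_dataset")

# use_differentiable_head=True selects a SetFitHead like the one in this model (assumed setup).
model = SetFitModel.from_pretrained(
    "microsoft/deberta-v3-base",
    use_differentiable_head=True,
    head_params={"out_features": 6},
)

args = TrainingArguments(
    batch_size=(4, 4),                  # (body phase, head phase)
    num_epochs=(1, 1),
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    sampling_strategy="oversampling",
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    load_best_model_at_end=True,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],     # split name assumed
    eval_dataset=dataset["test"],       # split name assumed
)
trainer.train()
```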
+
+ ### Training Results
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:------:|:----:|:-------------:|:---------------:|
+ | 0.0002 | 1 | 0.4731 | - |
+ | 0.0078 | 50 | 0.4561 | - |
+ | 0.0155 | 100 | 0.4156 | - |
+ | 0.0233 | 150 | 0.2469 | - |
+ | 0.0311 | 200 | 0.2396 | - |
+ | 0.0388 | 250 | 0.2376 | - |
+ | 0.0466 | 300 | 0.2519 | - |
+ | 0.0543 | 350 | 0.1987 | - |
+ | 0.0621 | 400 | 0.1908 | - |
+ | 0.0699 | 450 | 0.161 | - |
+ | 0.0776 | 500 | 0.1532 | - |
+ | 0.0854 | 550 | 0.17 | - |
+ | 0.0932 | 600 | 0.139 | - |
+ | 0.1009 | 650 | 0.1406 | - |
+ | 0.1087 | 700 | 0.1239 | - |
+ | 0.1165 | 750 | 0.1332 | - |
+ | 0.1242 | 800 | 0.1566 | - |
+ | 0.1320 | 850 | 0.0932 | - |
+ | 0.1398 | 900 | 0.1101 | - |
+ | 0.1475 | 950 | 0.1153 | - |
+ | 0.1553 | 1000 | 0.0979 | - |
+ | 0.1630 | 1050 | 0.0741 | - |
+ | 0.1708 | 1100 | 0.0603 | - |
+ | 0.1786 | 1150 | 0.1027 | - |
+ | 0.1863 | 1200 | 0.0948 | - |
+ | 0.1941 | 1250 | 0.0968 | - |
+ | 0.2019 | 1300 | 0.085 | - |
+ | 0.2096 | 1350 | 0.0883 | - |
+ | 0.2174 | 1400 | 0.0792 | - |
+ | 0.2252 | 1450 | 0.1054 | - |
+ | 0.2329 | 1500 | 0.0556 | - |
+ | 0.2407 | 1550 | 0.0777 | - |
+ | 0.2484 | 1600 | 0.0922 | - |
+ | 0.2562 | 1650 | 0.076 | - |
+ | 0.2640 | 1700 | 0.0693 | - |
+ | 0.2717 | 1750 | 0.0857 | - |
+ | 0.2795 | 1800 | 0.0907 | - |
+ | 0.2873 | 1850 | 0.0621 | - |
+ | 0.2950 | 1900 | 0.0792 | - |
+ | 0.3028 | 1950 | 0.0608 | - |
+ | 0.3106 | 2000 | 0.052 | - |
+ | 0.3183 | 2050 | 0.056 | - |
+ | 0.3261 | 2100 | 0.0501 | - |
+ | 0.3339 | 2150 | 0.0559 | - |
+ | 0.3416 | 2200 | 0.0526 | - |
+ | 0.3494 | 2250 | 0.0546 | - |
+ | 0.3571 | 2300 | 0.0398 | - |
+ | 0.3649 | 2350 | 0.0527 | - |
+ | 0.3727 | 2400 | 0.0522 | - |
+ | 0.3804 | 2450 | 0.0468 | - |
+ | 0.3882 | 2500 | 0.0465 | - |
+ | 0.3960 | 2550 | 0.0393 | - |
+ | 0.4037 | 2600 | 0.0583 | - |
+ | 0.4115 | 2650 | 0.0278 | - |
+ | 0.4193 | 2700 | 0.0502 | - |
+ | 0.4270 | 2750 | 0.0413 | - |
+ | 0.4348 | 2800 | 0.0538 | - |
+ | 0.4425 | 2850 | 0.0361 | - |
+ | 0.4503 | 2900 | 0.0648 | - |
+ | 0.4581 | 2950 | 0.0459 | - |
+ | 0.4658 | 3000 | 0.0521 | - |
+ | 0.4736 | 3050 | 0.0288 | - |
+ | 0.4814 | 3100 | 0.0323 | - |
+ | 0.4891 | 3150 | 0.0335 | - |
+ | 0.4969 | 3200 | 0.0472 | - |
+ | 0.5047 | 3250 | 0.0553 | - |
+ | 0.5124 | 3300 | 0.0426 | - |
+ | 0.5202 | 3350 | 0.0276 | - |
+ | 0.5280 | 3400 | 0.0395 | - |
+ | 0.5357 | 3450 | 0.042 | - |
+ | 0.5435 | 3500 | 0.0343 | - |
+ | 0.5512 | 3550 | 0.0314 | - |
+ | 0.5590 | 3600 | 0.0266 | - |
+ | 0.5668 | 3650 | 0.0314 | - |
+ | 0.5745 | 3700 | 0.0379 | - |
+ | 0.5823 | 3750 | 0.0485 | - |
+ | 0.5901 | 3800 | 0.0311 | - |
+ | 0.5978 | 3850 | 0.0415 | - |
+ | 0.6056 | 3900 | 0.0266 | - |
+ | 0.6134 | 3950 | 0.0384 | - |
+ | 0.6211 | 4000 | 0.0348 | - |
+ | 0.6289 | 4050 | 0.0298 | - |
+ | 0.6366 | 4100 | 0.032 | - |
+ | 0.6444 | 4150 | 0.031 | - |
+ | 0.6522 | 4200 | 0.0367 | - |
+ | 0.6599 | 4250 | 0.0289 | - |
+ | 0.6677 | 4300 | 0.0333 | - |
+ | 0.6755 | 4350 | 0.0281 | - |
+ | 0.6832 | 4400 | 0.0307 | - |
+ | 0.6910 | 4450 | 0.0312 | - |
+ | 0.6988 | 4500 | 0.0488 | - |
+ | 0.7065 | 4550 | 0.03 | - |
+ | 0.7143 | 4600 | 0.0309 | - |
+ | 0.7220 | 4650 | 0.031 | - |
+ | 0.7298 | 4700 | 0.0268 | - |
+ | 0.7376 | 4750 | 0.0324 | - |
+ | 0.7453 | 4800 | 0.041 | - |
+ | 0.7531 | 4850 | 0.0349 | - |
+ | 0.7609 | 4900 | 0.0349 | - |
+ | 0.7686 | 4950 | 0.0291 | - |
+ | 0.7764 | 5000 | 0.025 | - |
+ | 0.7842 | 5050 | 0.0249 | - |
+ | 0.7919 | 5100 | 0.0272 | - |
+ | 0.7997 | 5150 | 0.0302 | - |
+ | 0.8075 | 5200 | 0.0414 | - |
+ | 0.8152 | 5250 | 0.0295 | - |
+ | 0.8230 | 5300 | 0.033 | - |
+ | 0.8307 | 5350 | 0.0203 | - |
+ | 0.8385 | 5400 | 0.0275 | - |
+ | 0.8463 | 5450 | 0.0354 | - |
+ | 0.8540 | 5500 | 0.0254 | - |
+ | 0.8618 | 5550 | 0.0313 | - |
+ | 0.8696 | 5600 | 0.0296 | - |
+ | 0.8773 | 5650 | 0.0248 | - |
+ | 0.8851 | 5700 | 0.036 | - |
+ | 0.8929 | 5750 | 0.025 | - |
+ | 0.9006 | 5800 | 0.0234 | - |
+ | 0.9084 | 5850 | 0.0221 | - |
+ | 0.9161 | 5900 | 0.0314 | - |
+ | 0.9239 | 5950 | 0.0273 | - |
+ | 0.9317 | 6000 | 0.0299 | - |
+ | 0.9394 | 6050 | 0.0262 | - |
+ | 0.9472 | 6100 | 0.0285 | - |
+ | 0.9550 | 6150 | 0.021 | - |
+ | 0.9627 | 6200 | 0.0215 | - |
+ | 0.9705 | 6250 | 0.0312 | - |
+ | 0.9783 | 6300 | 0.0259 | - |
+ | 0.9860 | 6350 | 0.0234 | - |
+ | 0.9938 | 6400 | 0.0222 | - |
+ | 1.0 | 6440 | - | 0.1609 |
+
+ ### Framework Versions
+ - Python: 3.10.14
+ - SetFit: 1.1.0
+ - Sentence Transformers: 3.3.1
+ - Transformers: 4.45.2
+ - PyTorch: 2.4.0
+ - Datasets: 3.0.1
+ - Tokenizers: 0.20.0
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+     doi = {10.48550/ARXIV.2209.11055},
+     url = {https://arxiv.org/abs/2209.11055},
+     author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+     keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+     title = {Efficient Few-Shot Learning Without Prompts},
+     publisher = {arXiv},
+     year = {2022},
+     copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
added_tokens.json ADDED
@@ -0,0 +1,3 @@
+ {
+   "[MASK]": 128000
+ }
config.json ADDED
@@ -0,0 +1,35 @@
+ {
+   "_name_or_path": "microsoft/deberta-v3-base",
+   "architectures": [
+     "DebertaV2Model"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-07,
+   "max_position_embeddings": 512,
+   "max_relative_positions": -1,
+   "model_type": "deberta-v2",
+   "norm_rel_ebd": "layer_norm",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 0,
+   "pooler_dropout": 0,
+   "pooler_hidden_act": "gelu",
+   "pooler_hidden_size": 768,
+   "pos_att_type": [
+     "p2c",
+     "c2p"
+   ],
+   "position_biased_input": false,
+   "position_buckets": 256,
+   "relative_attention": true,
+   "share_att_key": true,
+   "torch_dtype": "float32",
+   "transformers_version": "4.45.2",
+   "type_vocab_size": 0,
+   "vocab_size": 128100
+ }
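Note: this is the stock `microsoft/deberta-v3-base` configuration (12 layers, hidden size 768, 128k SentencePiece vocabulary) with `architectures` set to the bare `DebertaV2Model`, since SetFit supplies its own classification head. If only contextual embeddings are needed, the backbone in this repo can be loaded directly with `transformers`, e.g.:

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bhujith10/deberta-v3-base-setfit_finetuned")
backbone = AutoModel.from_pretrained("bhujith10/deberta-v3-base-setfit_finetuned")

inputs = tokenizer("Title: ..., Abstract: ...", return_tensors="pt")
hidden = backbone(**inputs).last_hidden_state  # [1, seq_len, 768], per hidden_size above
```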
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.3.1",
+     "transformers": "4.45.2",
+     "pytorch": "2.4.0"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
config_setfit.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "normalize_embeddings": false,
+   "labels": [
+     "Computer Science",
+     "Physics",
+     "Mathematics",
+     "Statistics",
+     "Quantitative Biology",
+     "Quantitative Finance"
+   ]
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:84fb4ee9b24426df122bb28fbf599b031f8281c756452a3cd7ee40d77a1d353e
+ size 735348840
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fa6fd35b630030ecad96212eec905b8de3b24a220adeb2770f440c771cc5d8e8
+ size 20006
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
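Note: `modules.json` declares the two-stage sentence-transformers pipeline: module 0 is the DeBERTa transformer at the repo root, module 1 is the mean-pooling module in `1_Pooling/`. Because the repo follows the Sentence Transformers layout, the embedding body can also be loaded on its own, e.g.:

```python
from sentence_transformers import SentenceTransformer

# Loads module 0 (Transformer) then module 1 (Pooling), as declared in modules.json.
body = SentenceTransformer("bhujith10/deberta-v3-base-setfit_finetuned")
emb = body.encode(["Title: ..., Abstract: ..."])
print(emb.shape)  # (1, 768): one mean-pooled vector per input
```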
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "bos_token": "[CLS]",
+   "cls_token": "[CLS]",
+   "eos_token": "[SEP]",
+   "mask_token": "[MASK]",
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
spm.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c679fbf93643d19aab7ee10c0b99e460bdbc02fedf34b92b05af343b4af586fd
+ size 2464616
tokenizer.json ADDED
The diff for this file is too large to render.
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "[PAD]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "[CLS]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "[SEP]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "128000": {
+       "content": "[MASK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "[CLS]",
+   "clean_up_tokenization_spaces": false,
+   "cls_token": "[CLS]",
+   "do_lower_case": false,
+   "eos_token": "[SEP]",
+   "mask_token": "[MASK]",
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "[PAD]",
+   "sep_token": "[SEP]",
+   "sp_model_kwargs": {},
+   "split_by_punct": false,
+   "tokenizer_class": "DebertaV2Tokenizer",
+   "unk_token": "[UNK]",
+   "vocab_type": "spm"
+ }