TomatenMarc committed
Commit 03fdf2c
1 Parent(s): 6df7c75

Add new SentenceTransformer model.

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ pytorch_model.bin filter=lfs diff=lfs merge=lfs -text
1_Pooling/config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false
+ }
README.md ADDED
@@ -0,0 +1,227 @@
+ ---
+ pipeline_tag: sentence-similarity
+ tags:
+ - sentence-transformers
+ - feature-extraction
+ - sentence-similarity
+ - transformers
+ license: cc-by-sa-4.0
+ language:
+ - en
+ widget:
+ - source_sentence: "The formula: Not everyone who voted Leave is racist. But everyone who's racist voted Leave. Not everyone who voted Leave is thick. But everyone who's thick voted Leave. The thick racists therefore called the shots, whatever the thoughts of the minority of others. #thick #Brexit"
+   sentences:
+     - "Men shouldn’t be making laws about women’s bodies #abortion #Texas"
+     - "Opinion: As the draconian (and then some) abortion law takes effecting #Texas, this is not an idle question for millions of Americans. A slippery slope towards more like-minded Republican state-legislatures to try to follow suit. #abortion #F24 HTTPURL"
+     - "’Bitter truth’: EU chief pours cold water on idea of Brits keeping EU citizenship after #Brexit HTTPURL via @USER"
+     - "@USER Blah blah blah blah blah blah"
+   example_title: "Reason"
+
+ - source_sentence: "This is NOT good for children."
+   sentences:
+     - "Men shouldn’t be making laws about women’s bodies #abortion #Texas"
+     - "Opinion: As the draconian (and then some) abortion law takes effecting #Texas, this is not an idle question for millions of Americans. A slippery slope towards more like-minded Republican state-legislatures to try to follow suit. #abortion #F24 HTTPURL"
+     - "’Bitter truth’: EU chief pours cold water on idea of Brits keeping EU citizenship after #Brexit HTTPURL via @USER"
+     - "@USER Blah blah blah blah blah blah"
+   example_title: "Statement"
+
+ - source_sentence: "Elon Musk ready with 'Plan B' if Twitter rejects his offer Read @USER Story | HTTPURL #ElonMusk #ElonMuskTwitter #TwitterTakeover HTTPURL"
+   sentences:
+     - "Men shouldn’t be making laws about women’s bodies #abortion #Texas"
+     - "Opinion: As the draconian (and then some) abortion law takes effecting #Texas, this is not an idle question for millions of Americans. A slippery slope towards more like-minded Republican state-legislatures to try to follow suit. #abortion #F24 HTTPURL"
+     - "’Bitter truth’: EU chief pours cold water on idea of Brits keeping EU citizenship after #Brexit HTTPURL via @USER"
+     - "@USER Blah blah blah blah blah blah"
+   example_title: "Notification"
+
+ - source_sentence: "@USER 👅is the Key 😂"
+   sentences:
+     - "Men shouldn’t be making laws about women’s bodies #abortion #Texas"
+     - "Opinion: As the draconian (and then some) abortion law takes effecting #Texas, this is not an idle question for millions of Americans. A slippery slope towards more like-minded Republican state-legislatures to try to follow suit. #abortion #F24 HTTPURL"
+     - "’Bitter truth’: EU chief pours cold water on idea of Brits keeping EU citizenship after #Brexit HTTPURL via @USER"
+     - "@USER Blah blah blah blah blah blah"
+   example_title: "None"
+ ---
+
+ # WRAPresentations
+
+ WRAPresentations is a [sentence-transformers](https://www.SBERT.net) model that maps tweets into a 768-dimensional dense vector space
+ according to the four classes Reason, Statement, Notification, and None. The model is tailored for
+ argument mining on Twitter and derived from the [BERTweet-base](https://huggingface.co/vinai/bertweet-base) architecture, which was initially pre-trained on
+ Twitter data. Through fine-tuning on the [TACO](https://doi.org/10.5281/zenodo.8030026) dataset, WRAPresentations effectively weaves
+ Relevant Argument Properties (WRAP) into the embedding space.
+
+ ## Class Semantics
+
+ The model, to some degree, captures the semantics of the critical components of an argument, as defined by the [Cambridge Dictionary](https://dictionary.cambridge.org).
+ It encodes *inference* as *a guess that one makes or an opinion formed based on available information*, and it also leverages the definition of
+ *information* as *facts or details about a person, company, product, etc.*.
+
+ Consequently, it has also learned the semantics of:
+
+ * *Statement*, which refers to cases where only the *inference* is presented as *something that someone says or writes officially, or an
+ action done to express an opinion* (see ex. 1).
+ * *Reason*, which represents a full argument where the *inference* is based on direct *information* mentioned in the tweet, such as a
+ source reference or quotation, and thus reveals the author’s motivation *to try to understand and to make judgments based on practical facts* (see ex. 2).
+ * *Notification*, which refers to a tweet that limits itself to providing *information*, such as media channels promoting their latest articles (see ex. 3).
+ * *None*, a tweet that provides neither *inference* nor *information* (see ex. 4).
+
+ In its entirety, WRAPresentations encodes the following hierarchy for tweets:
+ ![Hierarchy of arguments with constituting elements](https://www.researchgate.net/profile/Marc-Feger/publication/371595900/figure/fig1/AS:11431281168142295@1686846469455/Hierarchy-of-arguments-with-constituting-elements_W640.jpg)
+
+
+ ## Usage (Sentence-Transformers)
+
+ Using this model becomes easy when you have [sentence-transformers](https://www.SBERT.net) installed:
+
+ ```
+ pip install -U sentence-transformers
+ ```
+
+ Then you can use the model to generate tweet representations like this:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ tweets = ["This is an example #tweet", "Each tweet is converted"]
+
+ model = SentenceTransformer("TomatenMarc/WRAPresentations")
+ embeddings = model.encode(tweets)
+ print(embeddings)
+ ```
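+
+ To compare a tweet against several candidates, as in the widget examples above, the embeddings can be scored with cosine similarity. A minimal sketch using `sentence_transformers.util` (the tweets here are illustrative):
+
+ ```python
+ from sentence_transformers import SentenceTransformer, util
+
+ model = SentenceTransformer("TomatenMarc/WRAPresentations")
+
+ source = "This is NOT good for children."
+ candidates = ["Men shouldn't be making laws about women's bodies #abortion #Texas",
+               "@USER Blah blah blah blah blah blah"]
+
+ # Encode the source tweet and the candidates, then rank by cosine similarity
+ source_embedding = model.encode(source, convert_to_tensor=True)
+ candidate_embeddings = model.encode(candidates, convert_to_tensor=True)
+ print(util.cos_sim(source_embedding, candidate_embeddings))
+ ```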
+
+ > **Notice:** The tweets need to undergo preprocessing following the specifications for BERTweet-base, as implemented in [TweetNormalizer.py](https://github.com/VinAIResearch/BERTweet/blob/master/TweetNormalizer.py).
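+
+ As an approximation, here is a minimal normalization sketch (an assumption mirroring the TweetNormalizer convention of masking user mentions as @USER and links as HTTPURL; use the linked script for full fidelity):
+
+ ```python
+ import re
+
+ # Hedged sketch: masks mentions and URLs as in the widget examples above;
+ # the official TweetNormalizer additionally translates emojis into text.
+ def normalize_tweet(tweet: str) -> str:
+     tweet = re.sub(r"@\w+", "@USER", tweet)            # mask user mentions
+     tweet = re.sub(r"https?://\S+", "HTTPURL", tweet)  # mask URLs
+     return tweet
+
+ print(normalize_tweet("Great read by @marc https://example.com #Brexit"))
+ # -> "Great read by @USER HTTPURL #Brexit"
+ ```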
+
+ ## Usage (HuggingFace Transformers)
+
+ Without [sentence-transformers](https://www.SBERT.net), you can use the model like this: first, you pass your input through the transformer model,
+ then you apply the right pooling operation on top of the contextualized word embeddings.
+
+ ```python
+ from transformers import AutoTokenizer, AutoModel
+ import torch
+
+
+ # Mean Pooling - Take attention mask into account for correct averaging
+ def mean_pooling(model_output, attention_mask):
+     token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
+     input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
+     return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
+
+
+ # Tweets we want embeddings for
+ tweets = ["This is an example #tweet", "Each tweet is converted"]
+
+ # Load model from HuggingFace Hub
+ tokenizer = AutoTokenizer.from_pretrained("TomatenMarc/WRAPresentations")
+ model = AutoModel.from_pretrained("TomatenMarc/WRAPresentations")
+
+ # Tokenize sentences
+ encoded_input = tokenizer(tweets, padding=True, truncation=True, return_tensors="pt")
+
+ # Compute token embeddings
+ with torch.no_grad():
+     model_output = model(**encoded_input)
+
+ # Perform pooling. In this case, mean pooling.
+ sentence_embeddings = mean_pooling(model_output, encoded_input["attention_mask"])
+
+ print("Sentence embeddings:")
+ print(sentence_embeddings)
+ ```
+
+ Furthermore, WRAPresentations is a well-suited embedding component for `AutoModelForSequenceClassification`, enabling fine-tuning for
+ tweet classification over the four classes Reason, Statement, Notification, and None. The grouping of Reason and Statement as
+ argument classes and of Notification and None as non-argument classes is implicitly learned during fine-tuning. This setup facilitates
+ efficient identification and analysis of argumentative and non-argumentative content in tweets; a loading sketch is shown below.
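+
+ A minimal sketch of how such a classifier could be initialized (the label order is illustrative, not taken from the original training setup):
+
+ ```python
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ # Illustrative label order; adapt it to your own fine-tuning data.
+ labels = ["Reason", "Statement", "Notification", "None"]
+
+ tokenizer = AutoTokenizer.from_pretrained("TomatenMarc/WRAPresentations")
+ model = AutoModelForSequenceClassification.from_pretrained(
+     "TomatenMarc/WRAPresentations",
+     num_labels=len(labels),
+     id2label=dict(enumerate(labels)),
+     label2id={label: i for i, label in enumerate(labels)},
+ )
+ ```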
+
+ ## Training
+
+ The WRAPresentations model was fine-tuned on 1,219 golden tweets from the TACO dataset, covering six topics.
+ Five topics were used for optimization, representing 925 tweets (75.88%) covering #brexit (33.3%), #got (17%), #lotrrop (18.8%), #squidgame (17.1%),
+ and #twittertakeover (13.8%). The optimization data was split into training and testing sets using a stratified 60/40 split.
+ Additionally, 294 golden tweets (24.12%) on the topic of abortion were held out as the holdout set for final evaluation.
+
+ During fine-tuning, we formed tweet pairs by matching each tweet with all remaining tweets in the same data split (training, testing, holdout) and
+ labeling each pair by whether the two tweets belong to similar or dissimilar classes. This process created 307,470 pairs for training and 136,530 pairs for testing. An additional 86,142 pairs were
+ used for the final evaluation on the holdout data.
+
+ The model was trained with the parameters:
+
+ **DataLoader**:
+
+ `torch.utils.data.dataloader.DataLoader` of length 5065 with parameters:
+
+ ```
+ {'batch_size': 32, 'sampler': 'torch.utils.data.sampler.RandomSampler', 'batch_sampler': 'torch.utils.data.sampler.BatchSampler'}
+ ```
+
+ **Loss**:
+
+ `sentence_transformers.losses.ContrastiveLoss.ContrastiveLoss` with parameters:
+
+ ```
+ {'distance_metric': 'SiameseDistanceMetric.COSINE_DISTANCE', 'margin': 0.5, 'size_average': True}
+ ```
+
+ Parameters of the fit()-Method:
+
+ ```
+ {
+     "epochs": 5,
+     "evaluation_steps": 1000,
+     "evaluator": "sentence_transformers.evaluation.BinaryClassificationEvaluator.BinaryClassificationEvaluator",
+     "max_grad_norm": 1,
+     "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
+     "optimizer_params": {
+         "lr": 4e-05
+     },
+     "scheduler": "WarmupLinear",
+     "steps_per_epoch": null,
+     "warmup_steps": 2533,
+     "weight_decay": 0.01
+ }
+ ```
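+
+ Taken together, a minimal sketch of this training setup with [sentence-transformers](https://www.SBERT.net) (the pairs are illustrative placeholders, not TACO data):
+
+ ```python
+ from sentence_transformers import InputExample, SentenceTransformer, losses
+ from torch.utils.data import DataLoader
+
+ # Illustrative pairs: label 1 for tweets of similar classes, 0 otherwise.
+ train_examples = [
+     InputExample(texts=["tweet a", "tweet b"], label=1),
+     InputExample(texts=["tweet a", "tweet c"], label=0),
+ ]
+
+ model = SentenceTransformer("vinai/bertweet-base")
+ train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=32)
+ train_loss = losses.ContrastiveLoss(model=model, margin=0.5)
+
+ model.fit(
+     train_objectives=[(train_dataloader, train_loss)],
+     epochs=5,
+     warmup_steps=2533,
+     optimizer_params={"lr": 4e-05},
+     weight_decay=0.01,
+     max_grad_norm=1,
+ )
+ ```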
+
+ ## Evaluation Results
+
+ Following the [standard protocol](https://aclanthology.org/D17-1218.pdf) for cross-topic evaluation in argument mining, we evaluated the
+ WRAPresentations model using the `BinaryClassificationEvaluator`, showing:
+
+ | Accuracy | Precision | Recall | F1 | Support |
+ |----------|-----------|--------|--------|---------|
+ | 71.56% | 65.70% | 83.20% | 73.42% | 86,142 |
+
+ This evaluation was conducted on previously unseen data from the holdout topic abortion, on which the model achieved an accuracy of 71.56% and an
+ F1-score of 73.42%. The recall of 83.20% indicates the model's ability to capture subtle tweet patterns and class-specific features for
+ Reason, Statement, Notification, and None. Despite the lower precision of 65.70%, the model's primary focus is on recall, so that
+ relevant instances are captured. Precision can be addressed in a subsequent classification phase, when using this model
+ with `AutoModelForSequenceClassification`.
+
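+ As a sketch, such an evaluation could be reproduced with labeled tweet pairs from the holdout topic (the variables here are placeholders):
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ from sentence_transformers.evaluation import BinaryClassificationEvaluator
+
+ # Placeholder pairs: two parallel lists of tweets plus 0/1 similarity labels.
+ sentences1 = ["tweet a", "tweet a"]
+ sentences2 = ["tweet b", "tweet c"]
+ labels = [1, 0]
+
+ model = SentenceTransformer("TomatenMarc/WRAPresentations")
+ evaluator = BinaryClassificationEvaluator(sentences1, sentences2, labels, name="fine-tune-heldout")
+ print(evaluator(model))
+ ```
+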
+ ## Full Model Architecture
+
+ ```
+ SentenceTransformer(
+   (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: RobertaModel
+   (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
+ )
+ ```
+
+ ## Environmental Impact
+
+ - **Hardware Type:** A100 PCIe 40GB
+ - **Hours used:** 2
+ - **Cloud Provider:** [Google Cloud Platform](https://colab.research.google.com)
+ - **Compute Region:** [asia-southeast1](https://cloud.google.com/compute/docs/gpus/gpu-regions-zones?hl=en) (Singapore)
+ - **Carbon Emitted:** 0.21 kg CO2
+
+ ## Licensing
+
+ [WRAPresentations](https://huggingface.co/TomatenMarc/WRAPresentations) © 2023 by [Marc Feger](mailto:marc.feger@uni-duesseldorf.de) is licensed under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/?ref=chooser-v1).
+
+ ## Contact
+
+ Please contact [marc.feger@uni-duesseldorf.de](mailto:marc.feger@uni-duesseldorf.de).
added_tokens.json ADDED
@@ -0,0 +1,3 @@
+ {
+   "<mask>": 64000
+ }
binary_classification_evaluation_fine-tune-heldout_results.csv ADDED
@@ -0,0 +1,2 @@
+ epoch,steps,cossim_accuracy,cossim_accuracy_threshold,cossim_f1,cossim_precision,cossim_recall,cossim_f1_threshold,cossim_ap,manhattan_accuracy,manhattan_accuracy_threshold,manhattan_f1,manhattan_precision,manhattan_recall,manhattan_f1_threshold,manhattan_ap,euclidean_accuracy,euclidean_accuracy_threshold,euclidean_f1,euclidean_precision,euclidean_recall,euclidean_f1_threshold,euclidean_ap,dot_accuracy,dot_accuracy_threshold,dot_f1,dot_precision,dot_recall,dot_f1_threshold,dot_ap
+ -1,-1,0.7155713218820015,0.5635744333267212,0.7342164228285225,0.6569760584974643,0.8320388349514564,0.4611433148384094,0.7701340554755384,0.7225914861837192,166.821533203125,0.7399493229612176,0.651601716835792,0.8560119492158327,203.3678741455078,0.774587176530623,0.7169716206123973,8.298559188842773,0.7285230528783538,0.6809284207109235,0.7832710978342047,9.337143898010254,0.7708316235108152,0.7108849887976102,57.948116302490234,0.7276492693110648,0.6582263181749509,0.8134428678117999,41.91051483154297,0.7366228479917187
bpe.codes ADDED
The diff for this file is too large to render. See raw diff
 
config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "_name_or_path": "./models/WRAPresentations/",
+   "architectures": [
+     "RobertaModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "classifier_dropout": null,
+   "eos_token_id": 2,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 130,
+   "model_type": "roberta",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "tokenizer_class": "BertweetTokenizer",
+   "torch_dtype": "float32",
+   "transformers_version": "4.31.0",
+   "type_vocab_size": 1,
+   "use_cache": true,
+   "vocab_size": 64001
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.2.2",
+     "transformers": "4.31.0",
+     "pytorch": "2.0.1+cu118"
+   }
+ }
eval/binary_classification_evaluation_fine-tune-test_results.csv ADDED
@@ -0,0 +1,31 @@
+ epoch,steps,cossim_accuracy,cossim_accuracy_threshold,cossim_f1,cossim_precision,cossim_recall,cossim_f1_threshold,cossim_ap,manhattan_accuracy,manhattan_accuracy_threshold,manhattan_f1,manhattan_precision,manhattan_recall,manhattan_f1_threshold,manhattan_ap,euclidean_accuracy,euclidean_accuracy_threshold,euclidean_f1,euclidean_precision,euclidean_recall,euclidean_f1_threshold,euclidean_ap,dot_accuracy,dot_accuracy_threshold,dot_f1,dot_precision,dot_recall,dot_f1_threshold,dot_ap
+ 0,1000,0.7678939453016209,0.5616649389266968,0.7749335063250081,0.725400179746897,0.8317272879184537,0.4328032433986664,0.8487791808297954,0.7691472177351975,175.79025268554688,0.7765562172341834,0.7149149524389673,0.849830111959004,196.30166625976562,0.8516339941615735,0.7659165599064224,8.315374374389648,0.7702681429396022,0.7280888844799365,0.8176349356653484,9.056646347045898,0.8479770494283996,0.7686737592602908,41.60350799560547,0.7689403869344098,0.7262912890605655,0.8169108227037264,32.74320602416992,0.817298703547951
+ 0,2000,0.7677825433075252,0.5190510153770447,0.7758532721071993,0.7490407549517785,0.8046566033532,0.5155777335166931,0.8368960901274767,0.7638973987634379,170.4642333984375,0.770553674954282,0.721071922712819,0.8273269091516738,193.61415100097656,0.8367475169414185,0.7642594552442489,7.980064392089844,0.7707439294186635,0.7390104273154774,0.8053250153177742,8.939167976379395,0.8344096422873637,0.7718347908427561,42.945762634277344,0.7775990048809508,0.7556431480751544,0.8008689355539464,42.44209671020508,0.8302310576373584
+ 0,3000,0.7739375034813123,0.5127902030944824,0.7809338469536009,0.7488815604468645,0.8158525037598173,0.49300330877304077,0.8449476727696611,0.7766668523366568,200.49417114257812,0.7910268186346171,0.738614674687734,0.8514454408733916,201.90072631835938,0.8522992953865132,0.7769035815741102,9.25938606262207,0.7833858384824435,0.7612665878333991,0.8068289422380661,9.25938606262207,0.8462070606506461,0.7681306745390742,47.408573150634766,0.765128667662561,0.7744493894784891,0.7560296329304295,47.133384704589844,0.8077331359909563
+ 0,4000,0.7674901130730241,0.6995731592178345,0.767450599287334,0.7175846276866413,0.8247646632874729,0.4543057382106781,0.8436346872666978,0.7675875898178577,171.60763549804688,0.7672268583405124,0.7126239101875075,0.8308917729627361,204.24862670898438,0.8449665781190565,0.7665988971202584,7.342679977416992,0.7594777948805655,0.7777453441531905,0.7420486826714198,8.261296272277832,0.8416007984609263,0.7677268423104774,52.90253829956055,0.7744300725479566,0.7217516843118383,0.8354035537236116,37.78709030151367,0.794628873897718
+ 0,5000,0.7745919901966245,0.6659778356552124,0.7749942891951571,0.7118810034505269,0.8503871219294825,0.45358020067214966,0.8488832109960202,0.7747312426892441,170.88339233398438,0.7777511898114914,0.7440227897039373,0.8146827828218125,199.32559204101562,0.8511201654908707,0.7748983456803876,7.782614707946777,0.7700333358780569,0.7088643178410795,0.8427560853339274,9.623931884765625,0.8480727902398433,0.7755528323956998,58.938602447509766,0.7752517606792602,0.7350657779774832,0.8200857795354537,40.16120529174805,0.7930881406399948
+ 0,-1,0.7797304071742884,0.67124342918396,0.7751530612244898,0.7150656563279522,0.8462652481479418,0.45725303888320923,0.8486250473756219,0.776569375591823,163.75259399414062,0.7768396302036017,0.7465208237046167,0.8097253940845541,198.85543823242188,0.8510887216883583,0.7787138639781652,7.571657657623291,0.7697543500511771,0.7119325980972215,0.8377986965966691,9.60276985168457,0.8479597311663336,0.7776694702835181,55.622920989990234,0.7724684702882175,0.7064773120886734,0.8520581518409179,38.716949462890625,0.7918274296961403
+ 1,1000,0.7749261961789116,0.566508412361145,0.7798376850675448,0.7295042204327157,0.8376315936055255,0.4021725654602051,0.8511605513699232,0.7762630201080599,171.59909057617188,0.7799883653286795,0.7278342058915777,0.8401938394697265,213.2512969970703,0.8509098800953989,0.7749401214281736,8.299052238464355,0.7766009754023202,0.7426610751048418,0.813791566869047,9.897568702697754,0.849563100342725,0.7748565699326018,53.404930114746094,0.7776896913280311,0.7189125295508274,0.846933660112516,33.58881378173828,0.8062415345332914
+ 1,2000,0.7784075084944021,0.6520156264305115,0.7775647551929763,0.7401887979861548,0.8189160585974489,0.4154204726219177,0.8512251193283239,0.7775441430401604,162.30169677734375,0.7795258085187075,0.7347945070387811,0.8300562580070183,212.55264282226562,0.8508408487911324,0.7779619005180193,7.799623012542725,0.7731482731617746,0.7503144654088051,0.7974154737369799,9.854764938354492,0.8498399451414931,0.7787556397259511,54.70463562011719,0.7773618032067335,0.7245163153335259,0.8385228095582911,34.559288024902344,0.8053294570747453
+ 1,3000,0.7781568540076867,0.6612838506698608,0.775138972607327,0.7467031407262122,0.8058263242912048,0.4267839193344116,0.8503063748678845,0.7782682560017824,162.04498291015625,0.7756105992315158,0.7420561223191798,0.812343340945803,211.3235321044922,0.8499273517817276,0.7778922742717095,7.748963832855225,0.7710258717046972,0.7550726185390859,0.7876677992536066,9.824663162231445,0.8491794467884947,0.778449284242188,56.12619400024414,0.7765954623855644,0.7415872694100952,0.8150726898011474,36.294673919677734,0.8037081919007742
+ 1,4000,0.7214393137637164,0.5044573545455933,0.6962221636964856,0.7334155363748459,0.6626190608811898,0.45556509494781494,0.7796048578772816,0.7197265081044951,180.50286865234375,0.6977414421832725,0.7390816771540714,0.6607809279786108,199.2108917236328,0.7812933138616089,0.7160502422993371,8.91546630859375,0.6965976246485742,0.718189884649512,0.6762658051579123,9.800008773803711,0.7813535500643046,0.721216509775525,42.53407669067383,0.6979591222872268,0.7586725519045283,0.6462429677491227,41.97949981689453,0.7227403664766267
+ 1,5000,0.7312287639948755,0.6148228049278259,0.7150222843659879,0.7494504152418173,0.6836183367682281,0.45740455389022827,0.7955877922714433,0.7304768005347295,198.79611206054688,0.7168404934562157,0.7446578631452581,0.6910265693755918,201.54525756835938,0.7955900503582694,0.73104773575447,8.091403007507324,0.7107716434815439,0.7425364077669903,0.6816131008745057,9.69920539855957,0.7944885709600145,0.7335542806216231,44.558616638183594,0.713370354549338,0.7298100454492483,0.6976549880242856,39.405242919921875,0.726060786237057
+ 1,-1,0.7382888653706902,0.522275984287262,0.7203785844620153,0.7707202488651874,0.6762101041608645,0.48286569118499756,0.790454845458989,0.7378989583913552,190.44403076171875,0.7190101103351747,0.7725338367516719,0.6724224363616109,195.2378692626953,0.7945392109799041,0.7367431627026124,9.284839630126953,0.7173958488875616,0.7733067216070049,0.6690246755416922,9.304485321044922,0.7916372746891804,0.7394028853116471,44.70606994628906,0.7188546877581985,0.7803398454062164,0.6663510276833955,43.997772216796875,0.7269551575769607
+ 2,1000,0.7374254999164485,0.49638476967811584,0.7150868144060933,0.7745225412498375,0.6641229878014816,0.4762795865535736,0.7895789121471601,0.739082604578622,193.4174346923828,0.7187663965487088,0.753928942701645,0.6867375926029076,201.81869506835938,0.7920154306606293,0.7373001726730909,9.189388275146484,0.7162266969235631,0.7394424673784105,0.6944243301955105,9.865802764892578,0.7926948853939454,0.7390965298278839,45.12528991699219,0.7149767272063521,0.7877061268266524,0.6545424163092519,45.09357452392578,0.7248229185549959
+ 2,2000,0.7397510165431961,0.49059003591537476,0.7174882469426934,0.7846067378583,0.6609480309697544,0.49059003591537476,0.7893193607026283,0.7402662507658887,187.76220703125,0.7204597431536973,0.7606017458057327,0.6843424497298501,201.83572387695312,0.7917386019095046,0.7384141926140478,9.029869079589844,0.7164570995127929,0.7522053418279834,0.6839525427505152,9.867085456848145,0.793468666491425,0.7383027906199521,44.93938446044922,0.7148524373384624,0.7762548136544613,0.6624519578900462,42.35888671875,0.7258325223417688
+ 2,3000,0.7339163371024341,0.5524381399154663,0.7096153846153846,0.7704051673517323,0.6577173731409792,0.4623206853866577,0.7955165384790114,0.7348911045507714,195.80374145507812,0.7160265716422507,0.7561557283117044,0.6799420709630702,200.68942260742188,0.7945675954196826,0.7338467108561243,8.778763771057129,0.7130119479284314,0.7643535334225451,0.6681334595889267,9.822650909423828,0.7944429400411687,0.7333732523812176,44.763912200927734,0.7049359695489359,0.7890978092116612,0.6369966022391801,44.763912200927734,0.7288963131428454
+ 2,4000,0.7330390463989306,0.5843002796173096,0.7106344203277415,0.7663756162000194,0.6624519578900462,0.45505326986312866,0.7942846131264139,0.7349468055478193,196.8582763671875,0.71410343906709,0.7640970281940563,0.6702500974767448,200.70700073242188,0.793536220750753,0.7333175513841698,8.655280113220215,0.7148661532819949,0.7551520034708233,0.6786609480309698,9.962120056152344,0.7943300826887122,0.7332897008856458,52.494140625,0.7017251489059654,0.7867820069204152,0.6332646354369743,44.95760726928711,0.7309583504671321
+ 2,5000,0.7313262407397092,0.5530322790145874,0.7062788438183144,0.7762797837549222,0.6478582966635102,0.4647907316684723,0.7913764795949331,0.7316186709742104,195.1563720703125,0.7083424599523956,0.74455871066769,0.6754859911992425,201.52902221679688,0.7906407744746664,0.7307413802707069,9.934350967407227,0.7103064894913642,0.7616345786051256,0.66545981173063,9.974112510681152,0.7927391316736574,0.7327326909151674,52.273109436035156,0.702691850318914,0.7886982145952505,0.6335988414192614,45.618186950683594,0.7295537869972409
+ 2,-1,0.7312426892441375,0.5550357103347778,0.7061465398212633,0.7756408119729342,0.6480811006517017,0.46514391899108887,0.7913475580426201,0.7318693254609258,195.03797912597656,0.7077027565739934,0.7531031015303397,0.6674650476243524,200.6476593017578,0.7906456213707942,0.7309920347574221,9.92496109008789,0.7107233776480921,0.763473532704302,0.6647913997660558,9.963253021240234,0.7925669002708331,0.7326909151673815,52.270423889160156,0.7026601411125348,0.7883596050580287,0.6337659444104049,45.76286315917969,0.7294773211363561
+ 3,1000,0.7606528156854008,0.5222647190093994,0.7531634240709377,0.7448054246888701,0.7617111346293098,0.45683982968330383,0.8206757503089424,0.7610148721662118,183.16404724121094,0.7484159945569745,0.7620587131599458,0.7352531610315824,199.61854553222656,0.8093893305455506,0.7592742160084666,9.634052276611328,0.7524050488157011,0.7728994743179348,0.7329694201526207,9.713886260986328,0.8228600831850887,0.7607502924302345,47.358665466308594,0.7518852833638027,0.7716798592788039,0.7330808221467164,44.067352294921875,0.7666417639308817
+ 3,2000,0.733582131120147,0.48442283272743225,0.7355212850581201,0.6818917118660196,0.7983066896897455,0.44973790645599365,0.8050704151130821,0.7298083885701554,201.81997680664062,0.7293821485251464,0.7168289141074597,0.7423828886537069,207.67794799804688,0.7973270462595166,0.7338467108561243,9.741870880126953,0.7349640213947642,0.6812183465297097,0.7979167827104106,10.044559478759766,0.8076746774003403,0.7311591377485657,46.107749938964844,0.735662325546212,0.672056757192546,0.8125661449339944,40.949432373046875,0.7612766515277805
+ 3,3000,0.7564891661560742,0.48917582631111145,0.7576617553680882,0.708479751835785,0.8141814738483819,0.433296799659729,0.8155651684987558,0.7578538405837465,193.57858276367188,0.7464613425188873,0.7770278413884537,0.7182086559349412,205.32235717773438,0.8003775028938475,0.7567119701442656,9.687872886657715,0.7545677701850991,0.7205960617349654,0.791901075029243,10.162216186523438,0.8194534741932576,0.7561549601737871,43.90623474121094,0.7536313854538912,0.699818563789152,0.8164095137302958,39.590248107910156,0.7722236131029006
+ 3,4000,0.7385255946081435,0.49491021037101746,0.7461200275438319,0.7112345367331482,0.7846042444159751,0.42813336849212646,0.81639852613666,0.7406283072466997,205.55908203125,0.7280786166273001,0.7640146878824969,0.695371247145324,206.00106811523438,0.8012519667482616,0.7399459700328636,9.668192863464355,0.7449076209667629,0.6602255121363402,0.8545089957110232,10.443655014038086,0.8169782426067075,0.7380382108839748,43.70752716064453,0.7412867436077281,0.683714037027453,0.8094468890993148,38.88163375854492,0.7719869405881279
+ 3,5000,0.7368963404444939,0.4928392469882965,0.7346872680655826,0.7126364564517396,0.7581462708182476,0.45811888575553894,0.8043990044525555,0.7428424218793517,191.797607421875,0.7264938326316567,0.7735380162949448,0.6848437587032807,193.6221466064453,0.7963762082998681,0.7367292374533504,9.465219497680664,0.738096524887856,0.7167900068230725,0.7607085166824487,9.99488639831543,0.8096146238710167,0.73686848994597,47.051029205322266,0.7248700356433412,0.733456878118318,0.716481925026458,43.94330978393555,0.7609198118503129
+ 3,-1,0.7370355929371136,0.49309322237968445,0.7345518311189047,0.7122341677783881,0.7583133738093912,0.4582207500934601,0.8044314208625994,0.7428424218793517,191.74285888671875,0.7265364036773183,0.7740614764424288,0.6845095527209937,193.55892944335938,0.796406941812556,0.7366178354592547,9.466680526733398,0.7386227100292954,0.7167448319228653,0.7618782376204534,9.996031761169434,0.8097048652084639,0.736910265693756,48.42571258544922,0.7253521126760564,0.7337436598848807,0.7171503369910321,43.963096618652344,0.761129156358665
+ 4,1000,0.7396396145491004,0.4853218197822571,0.7335013431554459,0.7294007490636704,0.7376483039046399,0.4670202136039734,0.8048413129762076,0.7425917673926363,192.4800567626953,0.7270748459400233,0.7703431296163555,0.688408622514343,194.97476196289062,0.7964549767767972,0.7383445663677379,9.929280281066895,0.7396969404497635,0.7324759018049752,0.747061772405726,9.958412170410156,0.8107329426732655,0.737634378655378,54.005027770996094,0.7272314910626958,0.740509800525389,0.7144209881356877,44.62757110595703,0.7618423960334315
+ 4,2000,0.7411296162201303,0.4963013529777527,0.7339127573821732,0.7226781857451404,0.7455021444883864,0.46599581837654114,0.8046479840463622,0.7431070016153289,191.53518676757812,0.7273531493276716,0.7727927815026004,0.686960396591099,194.13693237304688,0.7963775697420389,0.7391800813234557,9.577247619628906,0.7395142578634486,0.7292518480409412,0.7500696262463098,9.973331451416016,0.8109155085919635,0.7393750348131232,54.30863571166992,0.7277907058740569,0.737221063458956,0.7185985629142762,44.826446533203125,0.7611387981164628
+ 4,3000,0.7421183089177297,0.5031415820121765,0.732177798748248,0.7296766520067491,0.734696151061104,0.47203877568244934,0.8043563389232249,0.7445830780370969,192.44961547851562,0.7284195563978798,0.7776302478502782,0.6850665626914721,192.44961547851562,0.7962850912010971,0.7404751295048181,9.603111267089844,0.740485187010569,0.7282029188432333,0.7531888820809892,9.98196792602539,0.8114153743268732,0.7393750348131232,55.02947235107422,0.7268186075554168,0.7409722222222223,0.713195566200635,45.50734329223633,0.7597365088009872
+ 4,4000,0.7426613936389461,0.5002992153167725,0.7332384699087636,0.7345706618962433,0.7319111012087116,0.47579818964004517,0.8036745730162046,0.7440121428173564,192.37493896484375,0.7282555718020294,0.7752586396654193,0.6866261906088119,193.13973999023438,0.7962240589563083,0.7407118587422715,9.621297836303711,0.7425506067348249,0.7281488421898006,0.7575335598507213,9.983221054077148,0.8113257016732922,0.7400991477747452,52.59965515136719,0.7278388859255841,0.7544305310660172,0.7030579847379268,46.75685119628906,0.7601106825613864
+ 4,5000,0.7427449451345179,0.5027592182159424,0.7334051545537417,0.7274769837790442,0.739430735810171,0.47258734703063965,0.8035390497206156,0.7440399933158803,192.66281127929688,0.7285880911251864,0.7753024351924588,0.6871832005792904,192.68093872070312,0.7964492468636172,0.7402662507658887,9.634147644042969,0.7427504124118938,0.7275058088294207,0.7586475797916783,9.984941482543945,0.8113731356681799,0.7408511112348911,53.53934097290039,0.728091314448982,0.7554876770580661,0.702612376761544,47.05971145629883,0.7598655772877093
+ 4,-1,0.7427449451345179,0.5027490258216858,0.7334116833736606,0.7282462450918477,0.7386509218515012,0.472969651222229,0.8035469606946056,0.7440539185651424,192.65733337402344,0.7286203638081739,0.7753755263654076,0.6871832005792904,192.67977905273438,0.7964520586007977,0.7402384002673648,9.634243965148926,0.7427612825793428,0.7277829747427502,0.758369074806439,9.984170913696289,0.8113810178372788,0.7408511112348911,53.53996658325195,0.7281280212406748,0.7555023207066927,0.7026680777585919,47.05809020996094,0.7598733766921927
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3ecd82469cb960c9ac4d7a041589441ffbde220cd2b4d5d13db88be8b9e5b462
+ size 539666601
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 128,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "bos_token": "<s>",
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "unk_token": "<unk>"
+ }
tokenizer_config.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "model_max_length": 128,
+   "normalization": false,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "tokenizer_class": "BertweetTokenizer",
+   "unk_token": "<unk>"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff