Update README.md
README.md
@@ -1,7 +1,8 @@
|
|
1 |
---
|
2 |
base_model: FacebookAI/roberta-base
|
3 |
datasets:
|
4 |
-
- SynthSTEL/
|
|
|
5 |
library_name: sentence-transformers
|
6 |
pipeline_tag: sentence-similarity
|
7 |
tags:
|
@@ -13,29 +14,47 @@ tags:
|
|
13 |
- sentence-similarity
|
14 |
widget:
|
15 |
- example_title: Example 1
|
16 |
-
source_sentence:
|
17 |
-
|
|
|
18 |
sentences:
|
19 |
-
-
|
20 |
-
|
21 |
-
|
22 |
-
|
|
|
|
|
23 |
- example_title: Example 2
|
24 |
-
source_sentence:
|
25 |
-
|
|
|
26 |
sentences:
|
27 |
-
-
|
28 |
-
|
29 |
-
|
|
|
30 |
- example_title: Example 3
|
31 |
-
source_sentence:
|
|
|
|
|
32 |
sentences:
|
33 |
-
-
|
34 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
35 |
---
|
36 |
# Model Card
|
37 |
|
38 |
-
|
|
|
|
|
|
|
|
|
39 |
|
40 |
## Example Usage
|
41 |
|
@@ -43,10 +62,10 @@ widget:
|
|
43 |
from sentence_transformers import SentenceTransformer
|
44 |
from sentence_transformers.util import cos_sim
|
45 |
|
46 |
-
model = SentenceTransformer('SynthSTEL/
|
47 |
|
48 |
-
input = model.encode(
|
49 |
-
others = model.encode(["
|
50 |
print(cos_sim(input, others))
|
51 |
```
|
52 |
|
|
|
1 |
---
|
2 |
base_model: FacebookAI/roberta-base
|
3 |
datasets:
|
4 |
+
- SynthSTEL/styledistance_training_triplets
|
5 |
+
- StyleDistance/synthstel
|
6 |
library_name: sentence-transformers
|
7 |
pipeline_tag: sentence-similarity
|
8 |
tags:
|
|
|
14 |
- sentence-similarity
|
15 |
widget:
|
16 |
- example_title: Example 1
|
17 |
+
source_sentence: >-
|
18 |
+
Did you hear about the Wales wing? He'll h8 2 withdraw due 2 injuries from
|
19 |
+
future competitions.
|
20 |
sentences:
|
21 |
+
- >-
|
22 |
+
We're raising funds 2 improve our school's storage facilities and add new
|
23 |
+
playground equipment!
|
24 |
+
- >-
|
25 |
+
Did you hear about the Wales wing? He'll hate to withdraw due to injuries
|
26 |
+
from future competitions.
|
27 |
- example_title: Example 2
|
28 |
+
source_sentence: >-
|
29 |
+
You planned the DesignMeets Decades of Design event; you executed it
|
30 |
+
perfectly.
|
31 |
sentences:
|
32 |
+
- We'll find it hard to prove the thief didn't face a real threat!
|
33 |
+
- >-
|
34 |
+
You orchestrated the DesignMeets Decades of Design gathering; you actualized
|
35 |
+
it flawlessly.
|
36 |
- example_title: Example 3
|
37 |
+
source_sentence: >-
|
38 |
+
Did the William Barr maintain a commitment to allow Robert Mueller to finish
|
39 |
+
the inquiry?
|
40 |
sentences:
|
41 |
+
- >-
|
42 |
+
Will the artist be compiling a music album, or will there be a different
|
43 |
+
focus in the future?
|
44 |
+
- >-
|
45 |
+
Did William Barr maintain commitment to allow Robert Mueller to finish
|
46 |
+
inquiry?
|
47 |
+
license: mit
|
48 |
+
language:
|
49 |
+
- en
|
50 |
---
|
51 |
# Model Card
|
52 |
|
53 |
+
StyleDistance is a **style embedding model** that aims to embed texts with similar writing styles close together and texts with different styles far apart, regardless of content. You may find this model useful for stylistic analysis of text, clustering, authorship identification and verification tasks, and automatic style transfer evaluation.
|
54 |
+
|
55 |
+
## Training Data and Variants of StyleDistance
|
56 |
+
|
57 |
+
StyleDistance was contrastively trained on [SynthSTEL](https://huggingface.co/datasets/StyleDistance/synthstel), a synthetically generated dataset of positive and negative examples of 40 style features being used in text. By using this synthetic dataset, StyleDistance is able to achieve stronger content-independence than other style embedding models currently available. This particular model was trained purely on synthetic data. For a version trained on a combination of the synthetic dataset and a [real dataset that makes use of authorship datasets from Reddit to train style embeddings](https://aclanthology.org/2022.repl4nlp-1.26/), see this other version of [StyleDistance](https://huggingface.co/StyleDistance/styledistance).
|
58 |
|
59 |
## Example Usage
|
60 |
|
|
|
62 |
from sentence_transformers import SentenceTransformer
|
63 |
from sentence_transformers.util import cos_sim
|
64 |
|
65 |
+
model = SentenceTransformer('SynthSTEL/styledistance_synth_only') # Load model
|
66 |
|
67 |
+
input = model.encode("Did you hear about the Wales wing? He'll h8 2 withdraw due 2 injuries from future competitions.")
|
68 |
+
others = model.encode(["We're raising funds 2 improve our school's storage facilities and add new playground equipment!", "Did you hear about the Wales wing? He'll hate to withdraw due to injuries from future competitions."])
|
69 |
print(cos_sim(input, others))
|
70 |
```
|
71 |
|
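The usage snippet prints a row of cosine similarities between the input embedding and each candidate. As a rough illustration of what that score means, here is a minimal pure-Python sketch of cosine similarity and of ranking candidates by it. The helper names `cosine` and `rank_by_style` are hypothetical and not part of `sentence-transformers`, and the toy 3-dimensional vectors are made-up stand-ins for real embeddings, so the printed numbers say nothing about the model itself.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_style(
    query: Sequence[float], candidates: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (candidate index, similarity) pairs, most similar first."""
    scores = [(i, cosine(query, c)) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings (illustrative assumption only).
query = [1.0, 0.0, 1.0]
candidates = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [2.0, 0.0, 2.0],  # same direction as the query, different length
]
ranking = rank_by_style(query, candidates)
print(ranking)
```

With real embeddings, the same ranking could be obtained by passing `model.encode(...)` outputs converted to lists, although `cos_sim` from `sentence_transformers.util` already computes the scores directly on the encoded arrays.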