---
pipeline_tag: feature-extraction
---
# Style Transformer for Authorship Representations - STAR
This repository hosts the weights of the [Style Transformer for Authorship Representations (STAR)](https://arxiv.org/abs/2310.11081) model.

Also check out our [GitHub repo for STAR](https://github.com/jahuerta92/star) to replicate our results.

## Feature extraction
```python
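import torch
from transformers import AutoModel, AutoTokenizer

# The tokenizer is the stock roberta-large one; the weights come from this repository.
tokenizer = AutoTokenizer.from_pretrained('roberta-large')
model = AutoModel.from_pretrained('AIDA-UPM/star')

examples = ['My text 1', 'This is another text']

def extract_embeddings(texts):
  # Pad to a common length and return PyTorch tensors so the batch can be fed to the model.
  encoded_texts = tokenizer(texts, padding=True, truncation=True, return_tensors='pt')
  with torch.no_grad():
    style_embeddings = model(encoded_texts.input_ids,
                             attention_mask=encoded_texts.attention_mask).pooler_output
  return style_embeddings

print(extract_embeddings(examples))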
```
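
Once embeddings are extracted, a typical next step is to compare texts by style. The following is a minimal sketch, assuming cosine similarity is a suitable comparison (this README does not specify a metric). The helper name `cosine_similarity_matrix` and the toy vectors are hypothetical stand-ins; in practice you would pass something like `extract_embeddings(texts).numpy()`.

```python
import numpy as np

def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between the rows of an (n_texts, dim) array."""
    vectors = np.asarray(embeddings, dtype=np.float64)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    # Clip the norms so an all-zero row does not cause a division by zero.
    unit = vectors / np.clip(norms, 1e-12, None)
    return unit @ unit.T

# Toy stand-in vectors (hypothetical), not real STAR embeddings.
toy = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]])
similarities = cosine_similarity_matrix(toy)
print(similarities)
```

Entry `[i, j]` of the result is the similarity between text `i` and text `j`; values closer to 1 suggest more similar embeddings under this assumed metric.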

## Citation
```
@article{Huertas-Tato2023Oct,
	author = {Huertas-Tato, Javier and Martin, Alejandro and Camacho, David},
	title = {{Understanding writing style in social media with a supervised contrastively pre-trained transformer}},
	journal = {arXiv},
	year = {2023},
	month = oct,
	eprint = {2310.11081},
	doi = {10.48550/arXiv.2310.11081}
}
```