sreevishnu-damodaran committed
Commit 73e253b
Parent(s): d30b4b6
Release commit

Files changed:
- README.md +39 -0
- config.json +41 -0
- pytorch_model.bin +3 -0
- special_tokens_map.json +1 -0
- tokenizer_config.json +1 -0
- vocab.txt +0 -0
README.md
CHANGED
@@ -1,3 +1,42 @@
---
license: apache-2.0
language: en
datasets:
- imdb
tags:
- sentiment-analysis
---

# Funnel Transformer small (B4-4-4 with decoder) fine-tuned on IMDB for Sentiment Analysis

These are the model weights for the Funnel Transformer small model fine-tuned on the IMDB dataset for sentiment analysis.

The original English model weights are from [funnel-transformer/small](https://huggingface.co/funnel-transformer/small), which uses a similar objective to [ELECTRA](https://huggingface.co/transformers/model_doc/electra.html). It was introduced in [this paper](https://arxiv.org/pdf/2006.03236.pdf) and first released in [this repository](https://github.com/laiguokun/Funnel-Transformer). This model is uncased: it does not make a difference between english and English.

## Fine-tuning Results

|                               | Accuracy | Precision | Recall   | F1       |
|-------------------------------|----------|-----------|----------|----------|
| funnel-transformer-small-imdb | 0.956530 | 0.952286  | 0.961075 | 0.956661 |
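
The card does not spell out how these numbers were produced. The sketch below is one hedged way to compute comparable figures on the IMDB test split; the use of `datasets`/`scikit-learn` and the assumption that the default `LABEL_1` output corresponds to the positive class are not part of the original release.

```python
# Hypothetical evaluation sketch (not from the original card): score the IMDB test
# split with the fine-tuned checkpoint and compute the four metrics shown above.
from datasets import load_dataset
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from transformers import pipeline

clf = pipeline("text-classification", model="Sreevishnu/funnel-transformer-small-imdb")

test = load_dataset("imdb", split="test")
# Assumption: LABEL_1 is the positive class (the config defines no id2label mapping).
preds = [int(out["label"].endswith("1"))
         for out in clf(test["text"], truncation=True, batch_size=32)]

accuracy = accuracy_score(test["label"], preds)
precision, recall, f1, _ = precision_recall_fscore_support(
    test["label"], preds, average="binary"
)
print(accuracy, precision, recall, f1)
```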

## Model description (from [funnel-transformer/small](https://huggingface.co/funnel-transformer/small))

Funnel Transformer is a transformers model pretrained on a large corpus of English data in a self-supervised fashion. This means it was pretrained on the raw texts only, with no humans labelling them in any way (which is why it can use lots of publicly available data), with an automatic process to generate inputs and labels from those texts.

More precisely, a small language model corrupts the input texts and serves as a generator of inputs for this model, and the pretraining objective is to predict which token is an original and which one has been replaced, a bit like in GAN training.

This way, the model learns an inner representation of the English language that can then be used to extract features useful for downstream tasks: if you have a dataset of labeled sentences, for instance, you can train a standard classifier using the features produced by the model as inputs.

## How to use

Here is how to use this model to get the features of a given text in PyTorch:

```python
from transformers import FunnelTokenizer, FunnelModel

tokenizer = FunnelTokenizer.from_pretrained("Sreevishnu/funnel-transformer-small-imdb")
model = FunnelModel.from_pretrained("Sreevishnu/funnel-transformer-small-imdb")

text = "Replace me by any text you'd like."
encoded_input = tokenizer(text, return_tensors='pt')
output = model(**encoded_input)
```
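
Because the checkpoint was fine-tuned for sentiment classification (the config below lists `FunnelForSequenceClassification`), the classification head can also be loaded directly. A minimal sketch; reading `LABEL_0`/`LABEL_1` as negative/positive is an assumption, since the config defines no `id2label` mapping:

```python
import torch
from transformers import FunnelTokenizer, FunnelForSequenceClassification

tokenizer = FunnelTokenizer.from_pretrained("Sreevishnu/funnel-transformer-small-imdb")
model = FunnelForSequenceClassification.from_pretrained("Sreevishnu/funnel-transformer-small-imdb")

text = "One of the most uplifting films I have seen in years."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1).squeeze()
# Assumption: LABEL_0 = negative, LABEL_1 = positive (not stated in the repository).
print({model.config.id2label[i]: float(p) for i, p in enumerate(probs)})
```
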
config.json
ADDED
@@ -0,0 +1,41 @@
{
  "_name_or_path": "funnel-transformer/small",
  "activation_dropout": 0.0,
  "architectures": [
    "FunnelForSequenceClassification"
  ],
  "attention_dropout": 0.1,
  "attention_type": "relative_shift",
  "block_repeats": [
    1,
    1,
    1
  ],
  "block_sizes": [
    4,
    4,
    4
  ],
  "d_head": 64,
  "d_inner": 3072,
  "d_model": 768,
  "hidden_act": "gelu_new",
  "hidden_dropout": 0.1,
  "initializer_range": 0.1,
  "initializer_std": null,
  "layer_norm_eps": 1e-09,
  "max_position_embeddings": 1024,
  "model_type": "funnel",
  "n_head": 12,
  "num_decoder_layers": 2,
  "pool_q_only": true,
  "pooling_type": "mean",
  "problem_type": "single_label_classification",
  "rel_attn_type": "factorized",
  "separate_cls": true,
  "torch_dtype": "float32",
  "transformers_version": "4.19.1",
  "truncate_seq": true,
  "type_vocab_size": 3,
  "vocab_size": 30522
}
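
As a quick illustration (not part of the commit), the "B4-4-4 with decoder" naming in the README maps directly onto fields of this config once it is loaded with `transformers`:

```python
# Sketch: inspect how the "B4-4-4 with decoder" layout shows up in the config.
from transformers import FunnelConfig

config = FunnelConfig.from_pretrained("Sreevishnu/funnel-transformer-small-imdb")
print(config.block_sizes)         # [4, 4, 4] -> three encoder blocks of 4 layers each
print(config.num_decoder_layers)  # 2 -> the decoder stacked on the funnel encoder
print(config.architectures)       # ['FunnelForSequenceClassification']
```
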
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:f6ed0e05adc7e525e5c6991aafd782eb1c15d69fbe898a8c10602fae401cf0d7
size 464897555
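
The entry above is only a Git LFS pointer; the actual weights are fetched separately. A hedged sketch for verifying a downloaded copy against the pointer's `oid` and `size` (using `huggingface_hub` here is an assumption, not something the commit prescribes):

```python
# Hypothetical integrity check of the downloaded weights against the LFS pointer above.
import hashlib
import os
from huggingface_hub import hf_hub_download

path = hf_hub_download("Sreevishnu/funnel-transformer-small-imdb", "pytorch_model.bin")

sha256 = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        sha256.update(chunk)

print(os.path.getsize(path) == 464897555)
print(sha256.hexdigest() == "f6ed0e05adc7e525e5c6991aafd782eb1c15d69fbe898a8c10602fae401cf0d7")
```
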
special_tokens_map.json
ADDED
@@ -0,0 +1 @@
{"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>", "sep_token": "<sep>", "pad_token": "<pad>", "cls_token": "<cls>", "mask_token": "<mask>"}
tokenizer_config.json
ADDED
@@ -0,0 +1 @@
{"do_lower_case": true, "unk_token": "<unk>", "sep_token": "<sep>", "pad_token": "<pad>", "cls_token": "<cls>", "mask_token": "<mask>", "tokenize_chinese_chars": true, "strip_accents": null, "bos_token": "<s>", "eos_token": "</s>", "clean_text": true, "wordpieces_prefix": "##", "model_max_length": 512, "special_tokens_map_file": "/root/.cache/huggingface/transformers/42f288d1012e9318fb0218e3b44279d7e800ae4e86c78156326cedfa89c6c121.34a22f495fc6b4fddbf5d6b2c62637ae42a7204b6355bbd999c44fee4001336d", "name_or_path": "funnel-transformer/small", "tokenizer_class": "FunnelTokenizer"}
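
Taken together with special_tokens_map.json above, this is the configuration `FunnelTokenizer` picks up when the repository is loaded. A small sketch; the expected values in the comments are read off the two JSON files, not separately verified:

```python
# Sketch: load the tokenizer and confirm the settings declared in the JSON files above.
from transformers import FunnelTokenizer

tokenizer = FunnelTokenizer.from_pretrained("Sreevishnu/funnel-transformer-small-imdb")
print(tokenizer.model_max_length)                                     # 512
print(tokenizer.cls_token, tokenizer.sep_token, tokenizer.pad_token)  # <cls> <sep> <pad>
print(tokenizer("Great movie!")["input_ids"])  # wrapped with the <cls> and <sep> ids
```
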
vocab.txt
ADDED
The diff for this file is too large to render.
See raw diff