helinivan committed
Commit 9780bac
1 Parent(s): e0ce3cc

Update README.md

Files changed (1):
  1. README.md +4 -4
README.md CHANGED
@@ -5,14 +5,14 @@ tags:
  - sarcasm-detection
  - text-classification
  widget:
- - text: "Auto, stop a diesel e benzina dal 2035. Ecco cosa cambia per i consumatori"
+ - text: "Gli Usa a un passo dalla recessione"
  - text: "CIA Realizes It's Been Using Black Highlighters All These Years."
  - text: "We deden een man een nacht in een vat met cola en nu is hij dood"
  ---

  # Multilingual Sarcasm Detector

- Multilingual Sarcasm Detector is a text classification model built to detect sarcasm from news article titles. It is fine-tuned on [bert-multilingual-uncased](https://huggingface.co/bert-base-multilingual-uncased) and the training data consists of ready-made datasets available on Kaggle as well scraped data from multiple newspapers in English, Dutch and Italian.
+ Multilingual Sarcasm Detector is a text classification model built to detect sarcasm from news article titles. It is fine-tuned on [bert-base-multilingual-uncased](https://huggingface.co/bert-base-multilingual-uncased) and the training data consists of ready-made datasets available on Kaggle as well as scraped data from multiple newspapers in English, Dutch and Italian.


  <b>Labels</b>:

@@ -53,7 +53,7 @@ tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
  model = AutoModelForSequenceClassification.from_pretrained(MODEL_PATH)

  text = "CIA Realizes It's Been Using Black Highlighters All These Years."
- tokenized_text = tokenizer([preprocess_data(text)], padding=True, truncation=True, max_length=512, return_tensors="pt")
+ tokenized_text = tokenizer([preprocess_data(text)], padding=True, truncation=True, max_length=256, return_tensors="pt")
  output = model(**tokenized_text)
  probs = output.logits.softmax(dim=-1).tolist()[0]
  confidence = max(probs)

@@ -65,7 +65,7 @@ results = {"is_sarcastic": prediction, "confidence": confidence}
  Output:

  ```
- {'is_sarcastic': 1, 'confidence': 0.9999909400939941}
+ {'is_sarcastic': 1, 'confidence': 0.9374828934669495}
  ```

  ## Performance
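The visible hunks show only fragments of the README's usage example. For reference, a minimal end-to-end sketch of the snippet as it stands after this commit could look like the following; the `MODEL_PATH` value, the body of `preprocess_data`, and the argmax `prediction` line are assumptions filled in from context, not shown in the diff.

```
import string

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumed repo id; the diff shows only the MODEL_PATH variable, not its value.
MODEL_PATH = "helinivan/multilingual-sarcasm-detector"


def preprocess_data(text: str) -> str:
    # Assumed helper: lowercase and strip punctuation to match the
    # uncased base model; its definition is outside the visible hunks.
    return text.lower().translate(str.maketrans("", "", string.punctuation)).strip()


tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_PATH)

text = "CIA Realizes It's Been Using Black Highlighters All These Years."
tokenized_text = tokenizer(
    [preprocess_data(text)],
    padding=True,
    truncation=True,
    max_length=256,  # the value introduced by this commit
    return_tensors="pt",
)

with torch.no_grad():  # inference only; no gradients needed
    output = model(**tokenized_text)

probs = output.logits.softmax(dim=-1).tolist()[0]
confidence = max(probs)
prediction = probs.index(confidence)  # assumed: argmax class index (1 = sarcastic)
results = {"is_sarcastic": prediction, "confidence": confidence}
print(results)
```

Run on the commit's example headline, this should produce a result of the same shape as the Output block above.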