mrm8488 committed
Commit 5811a2a (1 parent: a753872)

Update README.md

Files changed (1)
  1. README.md +9 -4
README.md CHANGED
@@ -17,15 +17,17 @@ widget:
 [dbmdz/bert-base-turkish-cased](https://huggingface.co/dbmdz/bert-base-turkish-cased) (BERT Checkpoint)
 
 ## Dataset
+**MLSUM** is the first large-scale MultiLingual SUMmarization dataset. Obtained from online newspapers, it contains 1.5M+ article/summary pairs in five different languages -- namely French, German, Spanish, Russian, and **Turkish**. Together with English newspapers from the popular CNN/Daily Mail dataset, the collected data form a large-scale multilingual dataset which can enable new research directions for the text summarization community. We report cross-lingual comparative analyses based on state-of-the-art systems. These highlight existing biases which motivate the use of a multilingual dataset.
+
 [MLSUM tu/tr](https://huggingface.co/datasets/viewer/?dataset=mlsum)
 
 ## Results
 
 | Set | Metric | Value |
 |-----|--------|-------|
-| Test | Rouge2 - mid - precision | 32.41 |
-| Test | Rouge2 - mid - recall | 28.65 |
-| Test | Rouge2 - mid - fmeasure | 29.48 |
+| Test | Rouge2 - mid - precision | **32.41** |
+| Test | Rouge2 - mid - recall | **28.65** |
+| Test | Rouge2 - mid - fmeasure | **29.48** |
 
 ## Usage
 
@@ -45,7 +47,10 @@ def generate_summary(text):
     output = model.generate(input_ids, attention_mask=attention_mask)
     return tokenizer.decode(output[0], skip_special_tokens=True)
 
-
 text = "Your text here..."
 generate_summary(text)
+```
+
+> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488) with the support of [Narrativa](https://www.narrativa.com/)
+> Made with <span style="color: #e25555;">&hearts;</span> in Spain
 
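For readers of the Results table: "mid" is the point estimate of the bootstrap confidence interval that the `rouge_score` library reports (alongside "low" and "high"). As an illustrative sketch only (not the evaluation script behind the table), here is what ROUGE-2 precision, recall, and f-measure mean for a single candidate/reference pair:

```python
from collections import Counter

def bigrams(tokens):
    """All consecutive token pairs in a token list."""
    return [tuple(tokens[i:i + 2]) for i in range(len(tokens) - 1)]

def rouge2(candidate, reference):
    """ROUGE-2 precision, recall, and f-measure via clipped bigram overlap."""
    cand = Counter(bigrams(candidate.split()))
    ref = Counter(bigrams(reference.split()))
    # Counter intersection clips each bigram's count to the smaller side.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f = (2 * precision * recall / (precision + recall)) if overlap else 0.0
    return precision, recall, f
```

Over a full test set, per-pair scores are aggregated and bootstrap resampling yields the low/mid/high interval; the table reports the mid values, scaled by 100.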