eduagarcia committed on
Commit
1a411ea
1 Parent(s): de3b367

change base datasets links to the dataset original paper

Files changed (1)
  1. tasks_config/pt_config.yaml +18 -18
tasks_config/pt_config.yaml CHANGED
@@ -62,8 +62,8 @@ tasks:
     level exam widely applied every year by the Brazilian government to students that
     wish to undertake a University degree. This dataset contains 1,430 questions that don't require
     image understanding of the exams from 2010 to 2018, 2022 and 2023."
-    link: https://huggingface.co/datasets/eduagarcia/enem_challenge
-    sources: ["https://www.ime.usp.br/~ddm/project/enem/", "https://github.com/piresramon/gpt-4-enem", "https://huggingface.co/datasets/maritaca-ai/enem"]
+    link: https://www.ime.usp.br/~ddm/project/enem/ENEM-GuidingTest.pdf
+    sources: ["https://huggingface.co/datasets/eduagarcia/enem_challenge", "https://www.ime.usp.br/~ddm/project/enem/", "https://github.com/piresramon/gpt-4-enem", "https://huggingface.co/datasets/maritaca-ai/enem"]
     baseline_sources: ["https://www.sejalguem.com/enem", "https://vestibular.brasilescola.uol.com.br/enem/confira-as-medias-e-notas-maximas-e-minimas-do-enem-2020/349732.html"]
   bluex:
     benchmark: bluex
@@ -81,8 +81,8 @@ tasks:
     description: "BLUEX is a multimodal dataset consisting of the two leading
     university entrance exams conducted in Brazil: Convest (Unicamp) and Fuvest (USP),
     spanning from 2018 to 2024. The benchmark comprises of 724 questions that do not have accompanying images"
-    link: https://huggingface.co/datasets/eduagarcia-temp/BLUEX_without_images
-    sources: ["https://github.com/portuguese-benchmark-datasets/bluex", "https://huggingface.co/datasets/portuguese-benchmark-datasets/BLUEX"]
+    link: https://arxiv.org/abs/2307.05410
+    sources: ["https://huggingface.co/datasets/eduagarcia-temp/BLUEX_without_images", "https://github.com/portuguese-benchmark-datasets/bluex", "https://huggingface.co/datasets/portuguese-benchmark-datasets/BLUEX"]
     baseline_sources: ["https://www.comvest.unicamp.br/wp-content/uploads/2023/08/Relatorio_F1_2023.pdf", "https://acervo.fuvest.br/fuvest/2018/FUVEST_2018_indice_discriminacao_1_fase_ins.pdf"]
   oab_exams:
     benchmark: oab_exams
@@ -104,8 +104,8 @@ tasks:
     expert_human_baseline: 75.0
     description: OAB Exams is a dataset of more than 2,000 questions from the Brazilian Bar
     Association's exams, from 2010 to 2018.
-    link: https://huggingface.co/datasets/eduagarcia/oab_exams
-    sources: ["https://github.com/legal-nlp/oab-exams"]
+    link: https://arxiv.org/abs/1712.05128
+    sources: ["https://huggingface.co/datasets/eduagarcia/oab_exams", "https://github.com/legal-nlp/oab-exams"]
     baseline_sources: ["http://fgvprojetos.fgv.br/publicacao/exame-de-ordem-em-numeros", "http://fgvprojetos.fgv.br/publicacao/exame-de-ordem-em-numeros-vol2", "http://fgvprojetos.fgv.br/publicacao/exame-de-ordem-em-numeros-vol3"]
   assin2_rte:
     benchmark: assin2_rte
@@ -124,8 +124,8 @@ tasks:
     of Portuguese. Recognising Textual Entailment (RTE), also called Natural Language
     Inference (NLI), is the task of predicting if a given text (premise) entails (implies) in
     other text (hypothesis)."
-    link: https://huggingface.co/datasets/eduagarcia/portuguese_benchmark
-    sources: ["https://sites.google.com/view/assin2/", "https://huggingface.co/datasets/assin2"]
+    link: https://dl.acm.org/doi/abs/10.1007/978-3-030-41505-1_39
+    sources: ["https://huggingface.co/datasets/eduagarcia/portuguese_benchmark", "https://sites.google.com/view/assin2/", "https://huggingface.co/datasets/assin2"]
   assin2_sts:
     benchmark: assin2_sts
     col_name: ASSIN2 STS
@@ -139,8 +139,8 @@ tasks:
     expert_human_baseline: null
     description: "Same as dataset as above. Semantic Textual Similarity (STS)
     ‘measures the degree of semantic equivalence between two sentences’."
-    link: https://huggingface.co/datasets/eduagarcia/portuguese_benchmark
-    sources: ["https://sites.google.com/view/assin2/", "https://huggingface.co/datasets/assin2"]
+    link: https://dl.acm.org/doi/abs/10.1007/978-3-030-41505-1_39
+    sources: ["https://huggingface.co/datasets/eduagarcia/portuguese_benchmark", "https://sites.google.com/view/assin2/", "https://huggingface.co/datasets/assin2"]
   faquad_nli:
     benchmark: faquad_nli
     col_name: FAQUAD NLI
@@ -161,8 +161,8 @@ tasks:
     Brazilian higher education system. FaQuAD-NLI is a modified version of the
     FaQuAD dataset that repurposes the question answering task as a textual
     entailment task between a question and its possible answers."
-    link: https://huggingface.co/datasets/ruanchaves/faquad-nli
-    sources: ["https://github.com/liafacom/faquad/"]
+    link: https://ieeexplore.ieee.org/abstract/document/8923668
+    sources: ["https://github.com/liafacom/faquad/", "https://huggingface.co/datasets/ruanchaves/faquad-nli"]
   hatebr_offensive:
     benchmark: hatebr_offensive
     col_name: HateBR Offensive
@@ -178,8 +178,8 @@ tasks:
     on the web and social media. The HateBR was collected from Brazilian Instagram comments of politicians and manually annotated
     by specialists. It is composed of 7,000 documents annotated with a binary classification (offensive
     versus non-offensive comments)."
-    link: https://huggingface.co/datasets/ruanchaves/hatebr
-    sources: ["https://github.com/franciellevargas/HateBR", "https://huggingface.co/datasets/eduagarcia/portuguese_benchmark"]
+    link: https://arxiv.org/abs/2103.14972
+    sources: ["https://huggingface.co/datasets/eduagarcia/portuguese_benchmark", "https://github.com/franciellevargas/HateBR", "https://huggingface.co/datasets/ruanchaves/hatebr"]
   portuguese_hate_speech:
     benchmark: portuguese_hate_speech
     col_name: PT Hate Speech
@@ -192,8 +192,8 @@ tasks:
     human_baseline: null
     expert_human_baseline: null
     description: "Portuguese dataset for hate speech detection composed of 5,668 tweets with binary annotations (i.e. 'hate' vs. 'no-hate')"
-    link: https://huggingface.co/datasets/eduagarcia/portuguese_benchmark
-    sources: ["https://github.com/paulafortuna/Portuguese-Hate-Speech-Dataset", "https://huggingface.co/datasets/hate_speech_portuguese"]
+    link: https://aclanthology.org/W19-3510/
+    sources: ["https://huggingface.co/datasets/eduagarcia/portuguese_benchmark", "https://github.com/paulafortuna/Portuguese-Hate-Speech-Dataset", "https://huggingface.co/datasets/hate_speech_portuguese"]
   tweetsentbr:
     benchmark: tweetsentbr
     col_name: tweetSentBR
@@ -209,6 +209,6 @@ tasks:
     It was labeled by several annotators following steps stablished on the literature for
     improving reliability on the task of Sentiment Analysis. Each Tweet was annotated
     in one of the three following classes: Positive, Negative, Neutral."
-    link: https://bitbucket.org/HBrum/tweetsentbr
-    sources: ["https://bitbucket.org/HBrum/tweetsentbr", "https://arxiv.org/abs/1712.08917"]
+    link: https://arxiv.org/abs/1712.08917
+    sources: ["https://bitbucket.org/HBrum/tweetsentbr"]
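The pattern of this commit is uniform: each task's `link` moves to the dataset's original paper, and the previous `link` URL is prepended to `sources`. A minimal sketch of how a consumer might check that invariant after parsing the config (the `validate_tasks` helper and the inline dict are hypothetical, not part of this repo; in practice the dict would come from `yaml.safe_load` over `tasks_config/pt_config.yaml`):

```python
# Hypothetical validation sketch: every task entry should carry a paper `link`
# and a non-empty `sources` list, mirroring the shape committed above.
config = {
    "tasks": {
        "enem_challenge": {
            "link": "https://www.ime.usp.br/~ddm/project/enem/ENEM-GuidingTest.pdf",
            "sources": ["https://huggingface.co/datasets/eduagarcia/enem_challenge"],
        },
        "bluex": {
            "link": "https://arxiv.org/abs/2307.05410",
            "sources": ["https://huggingface.co/datasets/eduagarcia-temp/BLUEX_without_images"],
        },
    }
}

def validate_tasks(cfg: dict) -> list:
    """Return names of tasks missing a `link` or an empty `sources` list."""
    bad = []
    for name, task in cfg.get("tasks", {}).items():
        if not task.get("link") or not task.get("sources"):
            bad.append(name)
    return bad

print(validate_tasks(config))  # → []
```

A check like this would catch a task whose `link` was moved into `sources` without a replacement paper URL being added.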