eduagarcia committed · Commit 1a411ea · Parent(s): de3b367
change base datasets links to the dataset original paper
Files changed: tasks_config/pt_config.yaml (+18 -18)
tasks_config/pt_config.yaml CHANGED

```diff
@@ -62,8 +62,8 @@ tasks:
       level exam widely applied every year by the Brazilian government to students that
       wish to undertake a University degree. This dataset contains 1,430 questions that don't require
       image understanding of the exams from 2010 to 2018, 2022 and 2023."
-    link: https://
-    sources: ["https://www.ime.usp.br/~ddm/project/enem/", "https://github.com/piresramon/gpt-4-enem", "https://huggingface.co/datasets/maritaca-ai/enem"]
+    link: https://www.ime.usp.br/~ddm/project/enem/ENEM-GuidingTest.pdf
+    sources: ["https://huggingface.co/datasets/eduagarcia/enem_challenge", "https://www.ime.usp.br/~ddm/project/enem/", "https://github.com/piresramon/gpt-4-enem", "https://huggingface.co/datasets/maritaca-ai/enem"]
     baseline_sources: ["https://www.sejalguem.com/enem", "https://vestibular.brasilescola.uol.com.br/enem/confira-as-medias-e-notas-maximas-e-minimas-do-enem-2020/349732.html"]
   bluex:
     benchmark: bluex
@@ -81,8 +81,8 @@ tasks:
     description: "BLUEX is a multimodal dataset consisting of the two leading
       university entrance exams conducted in Brazil: Convest (Unicamp) and Fuvest (USP),
      spanning from 2018 to 2024. The benchmark comprises of 724 questions that do not have accompanying images"
-    link: https://
-    sources: ["https://github.com/portuguese-benchmark-datasets/bluex", "https://huggingface.co/datasets/portuguese-benchmark-datasets/BLUEX"]
+    link: https://arxiv.org/abs/2307.05410
+    sources: ["https://huggingface.co/datasets/eduagarcia-temp/BLUEX_without_images", "https://github.com/portuguese-benchmark-datasets/bluex", "https://huggingface.co/datasets/portuguese-benchmark-datasets/BLUEX"]
     baseline_sources: ["https://www.comvest.unicamp.br/wp-content/uploads/2023/08/Relatorio_F1_2023.pdf", "https://acervo.fuvest.br/fuvest/2018/FUVEST_2018_indice_discriminacao_1_fase_ins.pdf"]
   oab_exams:
     benchmark: oab_exams
@@ -104,8 +104,8 @@ tasks:
     expert_human_baseline: 75.0
     description: OAB Exams is a dataset of more than 2,000 questions from the Brazilian Bar
       Association's exams, from 2010 to 2018.
-    link: https://
-    sources: ["https://github.com/legal-nlp/oab-exams"]
+    link: https://arxiv.org/abs/1712.05128
+    sources: ["https://huggingface.co/datasets/eduagarcia/oab_exams", "https://github.com/legal-nlp/oab-exams"]
     baseline_sources: ["http://fgvprojetos.fgv.br/publicacao/exame-de-ordem-em-numeros", "http://fgvprojetos.fgv.br/publicacao/exame-de-ordem-em-numeros-vol2", "http://fgvprojetos.fgv.br/publicacao/exame-de-ordem-em-numeros-vol3"]
   assin2_rte:
     benchmark: assin2_rte
@@ -124,8 +124,8 @@ tasks:
       of Portuguese. Recognising Textual Entailment (RTE), also called Natural Language
       Inference (NLI), is the task of predicting if a given text (premise) entails (implies) in
       other text (hypothesis)."
-    link: https://
-    sources: ["https://sites.google.com/view/assin2/", "https://huggingface.co/datasets/assin2"]
+    link: https://dl.acm.org/doi/abs/10.1007/978-3-030-41505-1_39
+    sources: ["https://huggingface.co/datasets/eduagarcia/portuguese_benchmark", "https://sites.google.com/view/assin2/", "https://huggingface.co/datasets/assin2"]
   assin2_sts:
     benchmark: assin2_sts
     col_name: ASSIN2 STS
@@ -139,8 +139,8 @@ tasks:
     expert_human_baseline: null
     description: "Same as dataset as above. Semantic Textual Similarity (STS)
       ‘measures the degree of semantic equivalence between two sentences’."
-    link: https://
-    sources: ["https://sites.google.com/view/assin2/", "https://huggingface.co/datasets/assin2"]
+    link: https://dl.acm.org/doi/abs/10.1007/978-3-030-41505-1_39
+    sources: ["https://huggingface.co/datasets/eduagarcia/portuguese_benchmark", "https://sites.google.com/view/assin2/", "https://huggingface.co/datasets/assin2"]
   faquad_nli:
     benchmark: faquad_nli
     col_name: FAQUAD NLI
@@ -161,8 +161,8 @@ tasks:
       Brazilian higher education system. FaQuAD-NLI is a modified version of the
       FaQuAD dataset that repurposes the question answering task as a textual
       entailment task between a question and its possible answers."
-    link: https://
-    sources: ["https://github.com/liafacom/faquad/"]
+    link: https://ieeexplore.ieee.org/abstract/document/8923668
+    sources: ["https://github.com/liafacom/faquad/", "https://huggingface.co/datasets/ruanchaves/faquad-nli"]
   hatebr_offensive:
     benchmark: hatebr_offensive
     col_name: HateBR Offensive
@@ -178,8 +178,8 @@ tasks:
      on the web and social media. The HateBR was collected from Brazilian Instagram comments of politicians and manually annotated
      by specialists. It is composed of 7,000 documents annotated with a binary classification (offensive
      versus non-offensive comments)."
-    link: https://
-    sources: ["https://github.com/franciellevargas/HateBR", "https://huggingface.co/datasets/
+    link: https://arxiv.org/abs/2103.14972
+    sources: ["https://huggingface.co/datasets/eduagarcia/portuguese_benchmark", "https://github.com/franciellevargas/HateBR", "https://huggingface.co/datasets/ruanchaves/hatebr"]
   portuguese_hate_speech:
     benchmark: portuguese_hate_speech
     col_name: PT Hate Speech
@@ -192,8 +192,8 @@ tasks:
     human_baseline: null
     expert_human_baseline: null
     description: "Portuguese dataset for hate speech detection composed of 5,668 tweets with binary annotations (i.e. 'hate' vs. 'no-hate')"
-    link: https://
-    sources: ["https://github.com/paulafortuna/Portuguese-Hate-Speech-Dataset", "https://huggingface.co/datasets/hate_speech_portuguese"]
+    link: https://aclanthology.org/W19-3510/
+    sources: ["https://huggingface.co/datasets/eduagarcia/portuguese_benchmark", "https://github.com/paulafortuna/Portuguese-Hate-Speech-Dataset", "https://huggingface.co/datasets/hate_speech_portuguese"]
   tweetsentbr:
     benchmark: tweetsentbr
     col_name: tweetSentBR
@@ -209,6 +209,6 @@ tasks:
       It was labeled by several annotators following steps stablished on the literature for
       improving reliability on the task of Sentiment Analysis. Each Tweet was annotated
       in one of the three following classes: Positive, Negative, Neutral."
-    link: https://
-    sources: ["https://bitbucket.org/HBrum/tweetsentbr"
+    link: https://arxiv.org/abs/1712.08917
+    sources: ["https://bitbucket.org/HBrum/tweetsentbr"]

```
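The placeholder `link: https://` values this commit replaces are easy to catch mechanically. A minimal sketch of such a check, assuming the config has been parsed into a dict of task entries (e.g. via `yaml.safe_load` on `tasks_config/pt_config.yaml`; the function name and the inline sample data below are hypothetical, not part of the repository):

```python
from urllib.parse import urlparse

def find_incomplete_links(tasks: dict) -> list:
    """Return names of tasks whose `link` is missing or not a full URL
    (a bare 'https://' has a scheme but no host, so it fails the check)."""
    bad = []
    for name, cfg in tasks.items():
        link = (cfg or {}).get("link", "")
        parsed = urlparse(link)
        if not (parsed.scheme and parsed.netloc):
            bad.append(name)
    return bad

# Two hand-written entries mirroring the config, before and after the fix:
tasks = {
    "enem_challenge": {"link": "https://"},  # old placeholder value
    "bluex": {"link": "https://arxiv.org/abs/2307.05410"},  # fixed value
}
print(find_incomplete_links(tasks))  # ['enem_challenge']
```

Validating by scheme plus host (rather than a simple prefix check) also flags an empty or absent `link:` key, not just the literal `https://` stub.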