add dataset details
app.py
CHANGED
@@ -6,6 +6,26 @@ TITLE = """<h1 align="center" id="space-title">🇹🇭 Thai Sentence Embedding
 INTRODUCTION_TEXT = """
 The 🇹🇭 Thai Sentence Embedding Leaderboard aims to track, rank, and evaluate open embedding models on Thai sentence embedding tasks. The evaluation source code is available at https://github.com/mrpeerat/Thai-Sentence-Vector-Benchmark. Feel free to submit your own scores at https://huggingface.co/spaces/panuthept/thai_sentence_embedding_benchmark/discussions.
+## Dataset
+The evaluation is conducted on 8 datasets across 4 tasks:
+1. Semantic Textual Similarity (STS)
+- Translated STS-B (1,379 test samples): https://github.com/mrpeerat/Thai-Sentence-Vector-Benchmark
+2. Text Classification
+- Wisesight (2,671 test samples): https://huggingface.co/datasets/pythainlp/wisesight_sentiment
+- Wongnai (6,203 test samples): https://huggingface.co/datasets/Wongnai/wongnai_reviews
+- Generated Review (17,453 test samples): https://huggingface.co/datasets/airesearch/generated_reviews_enth
+3. Pair Classification
+- XNLI, Thai only (3,340 test samples): https://github.com/facebookresearch/XNLI
+4. Retrieval
+- XQuAD, Thai only (1,190 test samples): https://huggingface.co/datasets/google/xquad
+- MIRACL, Thai only (733 test samples): https://huggingface.co/datasets/miracl/miracl
+- TyDiQA, Thai only (763 test samples): https://huggingface.co/datasets/chompk/tydiqa-goldp-th
+## Metrics
+The evaluation metrics for each task are as follows:
+1. STS -> Spearman correlation
+2. Text Classification -> F1
+3. Pair Classification -> Average Precision
+4. Retrieval -> MRR@10
 ## Tagging
 🟢 Open sourced 🟦 API
 """