Spaces: panuthept/thai_sentence_embedding_benchmark

panuthept committed on Aug 8

Commit 2d21ea4 • 1 Parent(s): 79223b9

add dataset details

app.py +20 -0

@@ -6,6 +6,26 @@ TITLE = """<h1 align="center" id="space-title">🇹🇭 Thai Sentence Embedding
 INTRODUCTION_TEXT = """
 📐 The 🇹🇭 Thai Sentence Embedding Leaderboard aims to track, rank and evaluate open embedding models on Thai sentence embedding tasks. The evaluation source code is available at https://github.com/mrpeerat/Thai-Sentence-Vector-Benchmark. Feel free to submit your own scores at https://huggingface.co/spaces/panuthept/thai_sentence_embedding_benchmark/discussions.
+## Dataset
+The evaluation is conducted on 8 datasets across 4 tasks:
+1. Semantic Textual Similarity (STS)
+- Translated STS-B, contains 1,379 test samples, https://github.com/mrpeerat/Thai-Sentence-Vector-Benchmark
+2. Text Classification
+- Wisesight, contains 2,671 test samples, https://huggingface.co/datasets/pythainlp/wisesight_sentiment
+- Wongnai, contains 6,203 test samples, https://huggingface.co/datasets/Wongnai/wongnai_reviews
+- Generated Review, contains 17,453 test samples, https://huggingface.co/datasets/airesearch/generated_reviews_enth
+3. Pair Classification
+- XNLI (Thai only), contains 3,340 test samples, https://github.com/facebookresearch/XNLI
+4. Retrieval
+- XQuAD (Thai only), contains 1,190 test samples, https://huggingface.co/datasets/google/xquad
+- MIRACL (Thai only), contains 733 test samples, https://huggingface.co/datasets/miracl/miracl
+- TyDiQA (Thai only), contains 763 test samples, https://huggingface.co/datasets/chompk/tydiqa-goldp-th
+## Metrics
+The evaluation metrics for each task are as follows:
+1. STS -> Spearman correlation
+2. Text Classification -> F1
+3. Pair Classification -> Average Precision
+4. Retrieval -> MRR@10
 ## Tagging
 🟢 Open sourced 📦 API
 """