panuthept committed on
Commit
2d21ea4
•
1 Parent(s): 79223b9

add dataset details

Files changed (1)
  1. app.py +20 -0
app.py CHANGED
@@ -6,6 +6,26 @@ TITLE = """<h1 align="center" id="space-title">🇹🇭 Thai Sentence Embedding
 
  INTRODUCTION_TEXT = """
  The 🇹🇭 Thai Sentence Embedding Leaderboard aims to track, rank and evaluate open embedding models on Thai sentence embedding tasks. Source code for evaluation at https://github.com/mrpeerat/Thai-Sentence-Vector-Benchmark, feel free to submit your own score at https://huggingface.co/spaces/panuthept/thai_sentence_embedding_benchmark/discussions.
+ ## Dataset
+ The evaluation is conducted on 8 datasets across 4 tasks:
+ 1. Semantic Textual Similarity (STS)
+ - Translated STS-B, contains 1,379 test samples, https://github.com/mrpeerat/Thai-Sentence-Vector-Benchmark
+ 2. Text Classification
+ - Wisesight, contains 2,671 test samples, https://huggingface.co/datasets/pythainlp/wisesight_sentiment
+ - Wongnai, contains 6,203 test samples, https://huggingface.co/datasets/Wongnai/wongnai_reviews
+ - Generated Review, contains 17,453 test samples, https://huggingface.co/datasets/airesearch/generated_reviews_enth
+ 3. Pair Classification
+ - XNLI (Thai only), contains 3,340 test samples, https://github.com/facebookresearch/XNLI
+ 4. Retrieval
+ - XQuAD (Thai only), contains 1,190 test samples, https://huggingface.co/datasets/google/xquad
+ - MIRACL (Thai only), contains 733 test samples, https://huggingface.co/datasets/miracl/miracl
+ - TyDiQA (Thai only), contains 763 test samples, https://huggingface.co/datasets/chompk/tydiqa-goldp-th
+ ## Metrics
+ The evaluation metrics for each task are as follows:
+ 1. STS -> Spearman correlation
+ 2. Text Classification -> F1
+ 3. Pair Classification -> Average Precision
+ 4. Retrieval -> MRR@10
  ## Tagging
  🟢 Open sourced 📦 API
  """