FineWeb2 Edu Japanese: A high-quality, filtered Japanese dataset (120M texts, 89.3B tokens) for educational AI training.
Yuichi Tateno PRO
hotchpotch
AI & ML interests
Information Retrieval with LLMs
Recent Activity
liked a dataset 4 days ago: sbintuitions/JMTEB-lite
liked a model 11 days ago: naver/xprovence-reranker-bgem3-v1
published a dataset 11 days ago: hotchpotch/lawqa_jp
Organizations
japanese-reranker
Japanese reranker series
- hotchpotch/japanese-reranker-tiny-v2 • Text Ranking • 29.4M params • 465 downloads • 6 likes
- hotchpotch/japanese-reranker-xsmall-v2 • Text Ranking • 36.8M params • 51.5k downloads • 2 likes
- hotchpotch/japanese-reranker-small-v2 • Text Ranking • 70.2M params • 528 downloads • 2 likes
- hotchpotch/japanese-reranker-base-v2 • Text Ranking • 0.1B params • 655 downloads • 4 likes
FineWeb2 Edu Japanese
FineWeb2 Edu Japanese: A high-quality, filtered Japanese dataset (120M texts, 89.3B tokens) for educational AI training.
Spaces (4)
- Japanese Splade Demo Streamlit 📉: Convert text to SPLADE token scores (status: Runtime error • 3 likes)
- TokenViz: AutoTokenizer Visualization Tool 🔍: Visualize the results of AutoTokenizer (status: Sleeping)
- Secon Dev Site Search 🐨 (status: Running)
- Wikipedia Japanese Rag Search 😻: Ask questions about Wikipedia articles in Japanese (status: Running • 5 likes)
Models (35)
- hotchpotch/japanese-reranker-small-v2 • Text Ranking • 70.2M params • 528 downloads • 2 likes
- hotchpotch/japanese-reranker-base-v2 • Text Ranking • 0.1B params • 655 downloads • 4 likes
- hotchpotch/japanese-reranker-xsmall-v2 • Text Ranking • 36.8M params • 51.5k downloads • 2 likes
- hotchpotch/japanese-reranker-tiny-v2 • Text Ranking • 29.4M params • 465 downloads • 6 likes
- hotchpotch/japanese-reranker-cross-encoder-small-v1 • Text Ranking • 0.1B params • 8.24k downloads • 3 likes
- hotchpotch/japanese-reranker-cross-encoder-base-v1 • Text Ranking • 0.1B params • 702 downloads • 1 like
- hotchpotch/japanese-reranker-cross-encoder-large-v1 • Text Ranking • 0.3B params • 14.6k downloads • 16 likes
- hotchpotch/japanese-bge-reranker-v2-m3-v1 • Text Ranking • 0.6B params • 565 downloads • 15 likes
- hotchpotch/japanese-reranker-cross-encoder-xsmall-v1 • Text Ranking • 0.1B params • 3.71k downloads • 7 likes
- hotchpotch/query-crafter-japanese-Qwen3-1.7B • 2B params • 22 downloads • 11 likes
Datasets (25)
- hotchpotch/lawqa_jp • 1.29k rows • 29 downloads
- hotchpotch/miracl-hf-unified • 106M rows • 620 downloads
- hotchpotch/JFWIR • 128M rows • 153 downloads • 4 likes
- hotchpotch/fineweb-2-edu-japanese • 262M rows • 1.05k downloads • 20 likes
- hotchpotch/japanese-query-crafter-reasoning-80k • 83.3k rows • 74 downloads • 3 likes
- hotchpotch/tmp-5M-qa-small-tokens-cleaned • 5M rows • 29 downloads
- hotchpotch/japanese-qa-reasoning-100k • 106k rows • 28 downloads • 2 likes
- hotchpotch/fineweb-2-edu-japanese-noise-detect-raw • 64.2M rows • 41 downloads
- hotchpotch/fineweb-2-japanese-noise-spans • 344k rows • 19 downloads
- hotchpotch/fineweb-2-edu-japanese-scores • 313k rows • 24 downloads • 1 like