High-quality pretraining and instruction datasets for law, mathematics, and science.
Casey
casey-martin
AI & ML interests
Biomedical Tool Usage
Graph Learning
Ecophysiology
Recent Activity
liked a dataset 7 days ago: microsoft/orca-agentinstruct-1M-v1
liked a dataset 8 days ago: OpenCoder-LLM/opc-annealing-corpus
liked a dataset 11 days ago: OpenCoder-LLM/opc-sft-stage1
Organizations
Collections: 1
Models: None public yet
Datasets: 8
casey-martin/math_notebooks • Viewer • Updated • 18.1k • 42
casey-martin/CommonLit-Ease-of-Readability • Viewer • Updated • 4.72k • 7 • 1
casey-martin/multilingual-mathematical-autoformalization • Viewer • Updated • 666k • 135 • 1
casey-martin/MedInstruct • Preview • Updated • 54 • 6
casey-martin/qald_9_plus • Viewer • Updated • 15.8k • 142
casey-martin/vquanda • Viewer • Updated • 5k • 38 • 3
casey-martin/protocols_io • Updated • 36
casey-martin/oa_cpp_annotate_gen • Viewer • Updated • 104k • 50 • 2