Specimen Label Transcription Project

university

AI & ML interests

This repository hosts open-source datasets and models for transcribing natural history collection labels with large language models (LLMs). It is hosted by the University of Michigan Herbarium. Partner institutions will periodically add new training datasets (both OCR and human-transcribed) and benchmarking datasets used to rank the performance of models and methods. For code to create benchmark datasets or analyze model performance, please visit the GitHub repo. To join the SLTP initiative, please email willwe@umich.edu.
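As a rough illustration of what ranking methods against a benchmarking dataset could look like, the sketch below scores each method's transcriptions against human reference transcriptions and sorts by mean score. The metric (a character-level similarity ratio from Python's standard library) and the names `similarity` and `rank_methods` are assumptions made for illustration only; they are not taken from the SLTP code, which may use a different scoring scheme.

```python
from difflib import SequenceMatcher


def similarity(predicted: str, reference: str) -> float:
    """Return a similarity ratio in [0, 1] between two transcriptions."""
    return SequenceMatcher(None, predicted, reference).ratio()


def rank_methods(
    predictions: dict[str, list[str]],
    references: list[str],
) -> list[tuple[str, float]]:
    """Rank hypothetical methods by mean similarity to the references.

    `predictions` maps a method name to its transcriptions, given in the
    same order as `references`. The best-scoring method comes first.
    """
    if not references:
        raise ValueError("references must not be empty")

    scores = {}
    for name, outputs in predictions.items():
        if len(outputs) != len(references):
            raise ValueError(f"{name}: expected {len(references)} transcriptions")
        total = sum(similarity(p, r) for p, r in zip(outputs, references))
        scores[name] = total / len(references)

    # Highest mean similarity first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Illustrative, made-up label text; not drawn from any SLTP dataset.
references = ["Quercus alba L.", "Michigan, Washtenaw Co."]
predictions = {
    "method_a": ["Quercus alba L.", "Michigan, Washtenaw Co."],
    "method_b": ["Quercus alba", "Michigan"],
}
ranking = rank_methods(predictions, references)
```

Here `ranking` lists `method_a` first because its output matches the references exactly. A real benchmark would likely score individual fields (taxon, locality, collector) rather than whole strings.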

models 0

None public yet

datasets 1

SLTP/HLT-AA-C21-Alpaca

Updated Jun 16, 2023 • 6.13k • 6 • 2

Team members 5
