Specimen Label Transcription Project

AI & ML interests

This repository hosts open-source datasets and models for transcribing natural history collection labels with large language models (LLMs). It is hosted by the University of Michigan Herbarium. Partner institutions will periodically add new training datasets (OCR and human-transcribed) and benchmarking datasets used to rank the performance of models and methods. For code to create benchmark datasets or analyze model performance, please visit the GitHub repo. To join the SLTP initiative, please email willwe@umich.edu.
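As a rough illustration of how a benchmarking dataset could be used to rank transcription methods, the sketch below scores each method by its mean character error rate (CER) against human transcriptions. The choice of CER, the function names, and the sample label text are all assumptions made for illustration; the project's own metrics and code live in its GitHub repo and may differ.

```python
from typing import Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the human reference."""
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return edit_distance(reference, hypothesis) / len(reference)


def rank_models(references: List[str],
                predictions: Dict[str, List[str]]) -> List[Tuple[str, float]]:
    """Return (model name, mean CER) pairs, best (lowest CER) first."""
    if not references:
        raise ValueError("references must not be empty")
    scores = {}
    for name, hyps in predictions.items():
        if len(hyps) != len(references):
            raise ValueError(f"{name}: expected {len(references)} predictions")
        total = sum(character_error_rate(r, h)
                    for r, h in zip(references, hyps))
        scores[name] = total / len(references)
    return sorted(scores.items(), key=lambda kv: kv[1])


if __name__ == "__main__":
    # Hypothetical label text and model names, for illustration only.
    human = ["Quercus alba L.", "Washtenaw Co., Michigan"]
    outputs = {
        "model_a": ["Quercus alba L.", "Washtenaw Co., Michigan"],
        "model_b": ["Quercus alva L.", "Washtenaw Co, Michigan"],
    }
    for name, cer in rank_models(human, outputs):
        print(f"{name}: mean CER = {cer:.3f}")
```

CER is only one plausible choice; field-level accuracy on parsed label fields (collector, locality, date) would rank methods differently and may suit specimen labels better.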

models

None public yet