Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions Paper • 2411.14405 • Published 2 days ago • 30
Article The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare Apr 19 • 119
SimPO Collection This collection contains a list of SimPO and baseline models. • 49 items • Updated 17 days ago • 15
AV LLMs Collection A collection of Audio, Video and Visual LLMs. • 48 items • Updated about 4 hours ago • 3
PDF Document / OCR Datasets Collection Document datasets with .pdf files that are usable with pixparse libraries and tools. • 2 items • Updated Mar 30 • 47
Document VQA Datasets Collection Document question & answer datasets that have been tested with pixparse libraries and tools. • 2 items • Updated Mar 29 • 1
The Big Benchmarks Collection Collection Gathering benchmark spaces on the hub (beyond the Open LLM Leaderboard) • 13 items • Updated 6 days ago • 158
Open LLM Leaderboard best models ❤️‍🔥 Collection A daily uploaded list of models with best evaluations on the LLM leaderboard: • 60 items • Updated about 1 hour ago • 443
Whisper Release Collection Whisper includes both English-only and multilingual checkpoints for ASR and ST, ranging from 38M params for the tiny models to 1.5B params for large. • 12 items • Updated Sep 13, 2023 • 89
PaLI-3 Vision Language Models: Smaller, Faster, Stronger Paper • 2310.09199 • Published Oct 13, 2023 • 24
GIT Collection GIT (Generative Image-to-text Transformer) is a model useful for vision-language tasks such as image/video captioning and question answering. • 18 items • Updated Jul 11 • 10
UDOP Collection UDOP is a general multimodal model for document AI • 4 items • Updated Jul 11 • 23
SpeechT5 Collection The SpeechT5 framework consists of a shared seq2seq and six modal-specific (speech/text) pre/post-nets that can address a few audio-related tasks. • 8 items • Updated Jul 11 • 22