Mechanistic Permutability: Match Features Across Layers Paper • 2410.07656 • Published about 1 month ago • 16
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated Sep 26 • 269
XGen-MM-1 models and datasets Collection A collection of all XGen-MM (Foundation LMM) models! • 15 items • Updated 5 days ago • 34
PDF Document / OCR Datasets Collection Document datasets with .pdf files that are usable with pixparse libraries and tools. • 2 items • Updated Mar 30 • 47
Visual Scorers! Collection Variants of the visual evaluation models proposed in Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-defined Levels. Use via `model.score()`! • 8 items • Updated Jun 14 • 2
Gemma 2 2B Release Collection The 2.6B parameter version of Gemma 2. • 6 items • Updated Jul 31 • 76
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents Paper • 2407.18901 • Published Jul 26 • 31
Theia: Distilling Diverse Vision Foundation Models for Robot Learning Paper • 2407.20179 • Published Jul 29 • 45
Wolf: Captioning Everything with a World Summarization Framework Paper • 2407.18908 • Published Jul 26 • 30
WebUI (CHI 2023) Collection Learning Mobile User Interface Representation with Web Semantics • 23 items • Updated 8 days ago • 4
GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices Paper • 2406.08451 • Published Jun 12 • 23
Vikhr: The Family of Open-Source Instruction-Tuned Large Language Models for Russian Paper • 2405.13929 • Published May 22 • 52
Many-Shot In-Context Learning in Multimodal Foundation Models Paper • 2405.09798 • Published May 16 • 26
Chameleon: Mixed-Modal Early-Fusion Foundation Models Paper • 2405.09818 • Published May 16 • 126