LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published 6 days ago • 89
LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models Paper • 2411.09595 • Published 7 days ago • 65
BLIP3-KALE: Knowledge Augmented Large-Scale Dense Captions Paper • 2411.07461 • Published 10 days ago • 21
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents Paper • 2410.23218 • Published 22 days ago • 46
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 8 items • Updated 15 days ago • 95
MarDini: Masked Autoregressive Diffusion for Video Generation at Scale Paper • 2410.20280 • Published 26 days ago • 21
CogVLM2 Collection This collection hosts the repos of the THUDM's CogVLM2 releases • 8 items • Updated Aug 18 • 18
LoLCATS Collection Linearizing LLMs with high quality and efficiency. We linearize the full Llama 3.1 model family -- 8b, 70b, 405b -- for the first time! • 4 items • Updated Oct 14 • 14
based Collection These language model checkpoints are trained at the 360M and 1.3Bn parameter scales for up to 50Bn tokens on the Pile corpus, for research purposes. • 15 items • Updated Oct 18 • 9
Animate-X: Universal Character Image Animation with Enhanced Motion Representation Paper • 2410.10306 • Published Oct 14 • 52
Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention Paper • 2410.10774 • Published Oct 14 • 24
Loradex Highlights Collection This collection features awesome opensource LoRAs trained by members of the Glif Community during Loradex Early Access! • 14 items • Updated Oct 18 • 18
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations Paper • 2410.10792 • Published Oct 14 • 26
African History Collection A collection of data on the history of mankind • 5 items • Updated 13 days ago • 1