Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1, 2024 • 145
Searching for Better ViT Baselines Collection Exploring ViT hparams and model shapes for the GPU poor (between tiny and base). • 25 items • Updated Aug 21, 2024 • 16
Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math Paper • 2312.17120 • Published Dec 28, 2023 • 26
Learning Vision from Models Rivals Learning Vision from Data Paper • 2312.17742 • Published Dec 28, 2023 • 16
PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation Paper • 2312.17276 • Published Dec 27, 2023 • 16
FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis Paper • 2312.17681 • Published Dec 29, 2023 • 19
Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions Paper • 2401.01827 • Published Jan 3, 2024 • 18
SIGNeRF: Scene Integrated Generation for Neural Radiance Fields Paper • 2401.01647 • Published Jan 3, 2024 • 13
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations Paper • 2401.01885 • Published Jan 3, 2024 • 28
Q-Refine: A Perceptual Quality Refiner for AI-Generated Image Paper • 2401.01117 • Published Jan 2, 2024 • 10
VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM Paper • 2401.01256 • Published Jan 2, 2024 • 21
LLaMA Beyond English: An Empirical Study on Language Capability Transfer Paper • 2401.01055 • Published Jan 2, 2024 • 54
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 181
TrailBlazer: Trajectory Control for Diffusion-Based Video Generation Paper • 2401.00896 • Published Dec 31, 2023 • 16