Falcon3 Collection: The Falcon3 family of Open Foundation Models, a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. 36 items. A usage sketch follows.
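Loading one of these checkpoints follows the standard transformers pattern. A minimal sketch, assuming the instruct variants expose a chat template and that the repo id below matches the collection's naming (verify against the model cards):

```python
# Minimal sketch: chat generation with a Falcon3 instruct checkpoint.
# The repo id is an assumption based on the collection's naming; any of
# the 1B-10B instruct variants should work the same way.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon3-7B-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Give a one-sentence summary of speculative decoding."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```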
InternVL 2.5 Collection: Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling. 18 items.
Qwen2.5-Coder Collection: A code-specific model series based on Qwen2.5. 40 items. A fill-in-the-middle sketch follows.
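Beyond chat-style generation, the base Qwen2.5-Coder checkpoints support fill-in-the-middle (FIM) completion. A minimal sketch; the FIM special tokens follow the format documented for this series, but verify them against the model card before relying on this:

```python
# Minimal sketch: fill-in-the-middle completion with a Qwen2.5-Coder
# base (non-instruct) checkpoint. The model is asked to produce the code
# that belongs between the prefix and the suffix.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-Coder-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prefix = "def is_palindrome(s: str) -> bool:\n    "
suffix = "\n    return result"
# FIM prompt layout as documented for this series (an assumption here).
prompt = f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```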
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data. Paper arXiv:2410.18558, published Oct 24, 2024.
Granite 3.0 Language Models Collection: A series of language models trained by IBM and released under the Apache 2.0 license, covering both base pretrained and instruct models. 8 items.
LayerSkip Collection: Models continually pretrained using LayerSkip (https://arxiv.org/abs/2404.16710). 8 items. A decoding sketch follows.
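LayerSkip trains models so that early layers alone can draft tokens which the full model then verifies, i.e. self-speculative decoding. A minimal sketch, assuming one of the collection's checkpoints and the `assistant_early_exit` generation argument added in recent transformers releases; both the repo id and the argument should be checked against the collection's model cards:

```python
# Minimal sketch: self-speculative decoding with a LayerSkip checkpoint
# (arXiv:2404.16710). The first few layers draft tokens; the full model
# verifies them in a single forward pass.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/layerskip-llama3.2-1B"  # assumed repo id from the collection
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("The key idea behind LayerSkip is", return_tensors="pt").to(model.device)
# Draft with the output of layer 4, then verify with the full model.
outputs = model.generate(**inputs, max_new_tokens=64, assistant_early_exit=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```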
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention. Paper arXiv:2410.05076, published Oct 7, 2024.
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models. Paper arXiv:2409.17481, published Sep 26, 2024.
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations. Paper arXiv:2410.02707, published Oct 3, 2024.
LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning. Paper arXiv:2410.02884, published Oct 3, 2024.
Addition is All You Need for Energy-efficient Language Models. Paper arXiv:2410.00907, published Oct 1, 2024. A sketch of the paper's L-Mul approximation follows.
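The core trick in this paper, L-Mul, replaces the mantissa product in a floating-point multiply with an addition plus a small exponent-dependent offset, so the expensive operation reduces to integer addition on low-bit hardware. A toy sketch of that approximation on Python floats; the offset rule follows the paper, while the function name and everything else here is illustrative:

```python
import math

def l_mul(x: float, y: float, mantissa_bits: int = 4) -> float:
    """Approximate x*y in the spirit of L-Mul (arXiv:2410.00907):
    for x = (1+fx)*2**ex and y = (1+fy)*2**ey, replace the mantissa
    product (1+fx)*(1+fy) with 1 + fx + fy + 2**(-l(m))."""
    if x == 0.0 or y == 0.0:
        return 0.0
    sign = math.copysign(1.0, x) * math.copysign(1.0, y)
    xm, xe = math.frexp(abs(x))  # abs(x) = xm * 2**xe, xm in [0.5, 1)
    ym, ye = math.frexp(abs(y))
    fx, fy = 2 * xm - 1, 2 * ym - 1  # rewrite as (1+f) * 2**(e-1), f in [0, 1)
    # Offset exponent l(m) as reported in the paper:
    # l(m) = m for m <= 3, 3 for m = 4, 4 for m > 4.
    if mantissa_bits <= 3:
        l = mantissa_bits
    elif mantissa_bits == 4:
        l = 3
    else:
        l = 4
    mantissa = 1 + fx + fy + 2.0 ** (-l)  # addition instead of multiplication
    return sign * mantissa * 2.0 ** ((xe - 1) + (ye - 1))

# Compare against the exact product on a small example.
print(l_mul(3.14, 2.72), 3.14 * 2.72)
```

On this example the approximation lands within a few percent of the exact product; the paper's point is that the error is small enough for attention and matmul workloads while the operation itself is far cheaper in energy than a hardware multiply.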
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction. Paper arXiv:2409.17422, published Sep 25, 2024.
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions. Paper arXiv:2409.18042, published Sep 26, 2024.