Falcon-H1 Collection • Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B sizes (pretrained & instruction-tuned) • 38 items
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance Paper • 2507.22448 • Published Jul 30, 2025
Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models Paper • 2504.10615 • Published Apr 14, 2025
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7, 2025
— UI is a good thing 💅 — Collection • Cool spaces with a cool UI, what could be better? • 5 items
Model Hubs and Beyond: Analyzing Model Popularity, Performance, and Documentation Paper • 2503.15222 • Published Mar 19, 2025
The AI Community Building the Future? A Quantitative Analysis of Development Activity on Hugging Face Hub Paper • 2405.13058 • Published May 20, 2024
SpaceByte: Towards Deleting Tokenization from Large Language Modeling Paper • 2404.14408 • Published Apr 22, 2024
T-FREE: Tokenizer-Free Generative LLMs via Sparse Representations for Memory-Efficient Embeddings Paper • 2406.19223 • Published Jun 27, 2024
Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information Paper • 2502.14258 • Published Feb 20, 2025
Foundation Text-Generation Models Below 360M Parameters Collection • Great candidates for fine-tuning targeting Wllama and Transformers.js for mobile devices, ordered by number of parameters • 41 items
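As an illustration of the deployment target this collection names, here is a minimal Transformers.js sketch for running a sub-360M model for in-browser text generation. The model id is only an assumed example of the kind of small checkpoint the collection lists (one published with ONNX weights); the actual ids should be taken from the collection itself.

```typescript
// Minimal in-browser text generation with Transformers.js.
// Assumption: the model id below is an illustrative example of a small
// (<360M) checkpoint with ONNX weights, not a confirmed collection member.
import { pipeline } from '@huggingface/transformers';

// Downloads the model on first use and caches it for later page loads.
const generator = await pipeline(
  'text-generation',
  'HuggingFaceTB/SmolLM2-135M-Instruct',
);

// Generate a short completion; small models keep this responsive on mobile CPUs.
const output = await generator('Write a one-line greeting:', {
  max_new_tokens: 32,
});

console.log(output);
```

Wllama, the other target mentioned above, runs llama.cpp via WebAssembly, so the usual route there is a GGUF conversion of the same small checkpoints rather than ONNX weights.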