Collections

Collections including paper arxiv:2410.02724

Intelligence at the Edge of Chaos

Paper • 2410.02536 • Published Oct 3 • 5
Large Language Models as Markov Chains

Paper • 2410.02724 • Published Oct 3 • 31
Learning the Latent Rules of a Game from Data: A Chess Story

Paper • 2410.02426 • Published Oct 3 • 5
Quantifying Generalization Complexity for Large Language Models

Paper • 2410.01769 • Published Oct 2 • 13

Large Language Models as Markov Chains

Paper • 2410.02724 • Published Oct 3 • 31

Large Language Models as Markov Chains

Paper • 2410.02724 • Published Oct 3 • 31
Loong: Generating Minute-level Long Videos with Autoregressive Language Models

Paper • 2410.02757 • Published Oct 3 • 36
LLaVA-Critic: Learning to Evaluate Multimodal Models

Paper • 2410.02712 • Published Oct 3 • 34
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration

Paper • 2410.02367 • Published Oct 3 • 45

Large Language Models as Markov Chains

Paper • 2410.02724 • Published Oct 3 • 31

Large Language Models as Markov Chains

Paper • 2410.02724 • Published Oct 3 • 31
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis

Paper • 2410.02749 • Published Oct 3 • 12

KAN: Kolmogorov-Arnold Networks

Paper • 2404.19756 • Published Apr 30 • 108
The Platonic Representation Hypothesis

Paper • 2405.07987 • Published May 13 • 2
The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning

Paper • 2304.05366 • Published Apr 11, 2023 • 1
Explaining NonLinear Classification Decisions with Deep Taylor Decomposition

Paper • 1512.02479 • Published Dec 8, 2015 • 1

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

Paper • 2409.10516 • Published Sep 16 • 37
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse

Paper • 2409.11242 • Published Sep 17 • 5
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models

Paper • 2409.11136 • Published Sep 17 • 21
On the Diagram of Thought

Paper • 2409.10038 • Published Sep 16 • 11

A Comparative Study on Automatic Coding of Medical Letters with Explainability

Paper • 2407.13638 • Published Jul 18 • 5
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence

Paper • 2407.07061 • Published Jul 9 • 26
AgentInstruct: Toward Generative Teaching with Agentic Flows

Paper • 2407.03502 • Published Jul 3 • 44
Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions

Paper • 2407.06723 • Published Jul 9 • 10

Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations

Paper • 2403.09704 • Published Mar 8 • 31
Large Language Models as Markov Chains

Paper • 2410.02724 • Published Oct 3 • 31

Why do Learning Rates Transfer? Reconciling Optimization and Scaling Limits for Deep Learning

Paper • 2402.17457 • Published Feb 27
Curvature-Informed SGD via General Purpose Lie-Group Preconditioners

Paper • 2402.04553 • Published Feb 7
TextGrad: Automatic "Differentiation" via Text

Paper • 2406.07496 • Published Jun 11 • 26
Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling

Paper • 2405.14578 • Published May 23
