- Intelligence at the Edge of Chaos
  Paper • 2410.02536 • Published • 5
- Large Language Models as Markov Chains
  Paper • 2410.02724 • Published • 31
- Learning the Latent Rules of a Game from Data: A Chess Story
  Paper • 2410.02426 • Published • 5
- Quantifying Generalization Complexity for Large Language Models
  Paper • 2410.01769 • Published • 13

Collections including paper arxiv:2410.02724
- Large Language Models as Markov Chains
  Paper • 2410.02724 • Published • 31
- Loong: Generating Minute-level Long Videos with Autoregressive Language Models
  Paper • 2410.02757 • Published • 36
- LLaVA-Critic: Learning to Evaluate Multimodal Models
  Paper • 2410.02712 • Published • 34
- SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
  Paper • 2410.02367 • Published • 45

- KAN: Kolmogorov-Arnold Networks
  Paper • 2404.19756 • Published • 108
- The Platonic Representation Hypothesis
  Paper • 2405.07987 • Published • 2
- The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning
  Paper • 2304.05366 • Published • 1
- Explaining NonLinear Classification Decisions with Deep Taylor Decomposition
  Paper • 1512.02479 • Published • 1

- RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
  Paper • 2409.10516 • Published • 37
- Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse
  Paper • 2409.11242 • Published • 5
- Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
  Paper • 2409.11136 • Published • 21
- On the Diagram of Thought
  Paper • 2409.10038 • Published • 11

- A Comparative Study on Automatic Coding of Medical Letters with Explainability
  Paper • 2407.13638 • Published • 5
- Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
  Paper • 2407.07061 • Published • 26
- AgentInstruct: Toward Generative Teaching with Agentic Flows
  Paper • 2407.03502 • Published • 44
- Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions
  Paper • 2407.06723 • Published • 10

- Why do Learning Rates Transfer? Reconciling Optimization and Scaling Limits for Deep Learning
  Paper • 2402.17457 • Published
- Curvature-Informed SGD via General Purpose Lie-Group Preconditioners
  Paper • 2402.04553 • Published
- TextGrad: Automatic "Differentiation" via Text
  Paper • 2406.07496 • Published • 26
- Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling
  Paper • 2405.14578 • Published