-
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact
Paper • 2507.00951 • Published • 22 -
On Path to Multimodal Generalist: General-Level and General-Bench
Paper • 2505.04620 • Published • 83 -
What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models
Paper • 2507.06952 • Published • 7
Collections including paper arxiv:2507.00951
-
WorldVLA: Towards Autoregressive Action World Model
Paper • 2506.21539 • Published • 39 -
Fast and Simplex: 2-Simplicial Attention in Triton
Paper • 2507.02754 • Published • 25 -
IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction
Paper • 2507.02025 • Published • 35 -
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact
Paper • 2507.00951 • Published • 22
-
Scaling Laws for Native Multimodal Models
Paper • 2504.07951 • Published • 29 -
Have we unified image generation and understanding yet? An empirical study of GPT-4o's image generation ability
Paper • 2504.08003 • Published • 49 -
SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models
Paper • 2504.11468 • Published • 29 -
Towards Learning to Complete Anything in Lidar
Paper • 2504.12264 • Published • 10
-
Dynadiff: Single-stage Decoding of Images from Continuously Evolving fMRI
Paper • 2505.14556 • Published • 1 -
Incorporating brain-inspired mechanisms for multimodal learning in artificial intelligence
Paper • 2505.10176 • Published • 3 -
Meta-Learning an In-Context Transformer Model of Human Higher Visual Cortex
Paper • 2505.15813 • Published • 4 -
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact
Paper • 2507.00951 • Published • 22
-
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Paper • 2411.02337 • Published • 38 -
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models
Paper • 2411.04996 • Published • 52 -
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level
Paper • 2411.03562 • Published • 69 -
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
Paper • 2410.08815 • Published • 49