Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2410.24024

Interesting Papers

These papers are interesting (to me)

Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models

Paper • 2410.02740 • Published Oct 3 • 52
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging

Paper • 2410.01215 • Published Oct 2 • 30
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25 • 103
EuroLLM: Multilingual Language Models for Europe

Paper • 2409.16235 • Published Sep 24 • 24

AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents

Paper • 2410.24024 • Published 21 days ago • 48

Paperse - Mobile - Android

AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents

Paper • 2410.24024 • Published 21 days ago • 48

about 18 hours ago

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3 • 32
Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published Sep 17 • 25
Diffusion Models Are Real-Time Game Engines

Paper • 2408.14837 • Published Aug 27 • 121
Segment Anything with Multiple Modalities

Paper • 2408.09085 • Published Aug 17 • 21

ODA: Observation-Driven Agent for integrating LLMs and Knowledge Graphs

Paper • 2404.07677 • Published Apr 11 • 1
ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models

Paper • 2404.07738 • Published Apr 11 • 2
Scaling Instructable Agents Across Many Simulated Worlds

Paper • 2404.10179 • Published Mar 13 • 26
A Multimodal Automated Interpretability Agent

Paper • 2404.14394 • Published Apr 22 • 20

AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Paper • 2402.15506 • Published Feb 23 • 14
AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent

Paper • 2404.03648 • Published Apr 4 • 24
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

Paper • 2405.19893 • Published May 30 • 29
Parrot: Efficient Serving of LLM-based Applications with Semantic Variable

Paper • 2405.19888 • Published May 30 • 6

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6 • 25
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6 • 12
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7 • 38
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7 • 19

Company

© Hugging Face

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs