Collections

Discover the best community collections!

Collections including paper arxiv:2404.06773

Papers - University - Tsinghua University

Condition-Aware Neural Network for Controlled Image Generation

Paper • 2404.01143 • Published Apr 1 • 11
FlexiDreamer: Single Image-to-3D Generation with FlexiCubes

Paper • 2404.00987 • Published Apr 1 • 21
Advancing LLM Reasoning Generalists with Preference Trees

Paper • 2404.02078 • Published Apr 2 • 44
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline

Paper • 2404.02893 • Published Apr 3 • 20

Papers - Image - Training

Lexicon-Level Contrastive Visual-Grounding Improves Language Modeling

Paper • 2403.14551 • Published Mar 21 • 2
Adapting LLaMA Decoder to Vision Transformer

Paper • 2404.06773 • Published Apr 10 • 17
Toward a Better Understanding of Fourier Neural Operators: Analysis and Improvement from a Spectral Perspective

Paper • 2404.07200 • Published Apr 10 • 1
An inclusive review on deep learning techniques and their scope in handwriting recognition

Paper • 2404.08011 • Published Apr 10 • 1

RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization

Paper • 2403.00483 • Published Mar 1 • 12
OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on

Paper • 2403.01779 • Published Mar 4 • 28
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers

Paper • 2401.11605 • Published Jan 21 • 21
FiT: Flexible Vision Transformer for Diffusion Model

Paper • 2402.12376 • Published Feb 19 • 48

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6 • 25
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6 • 12
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7 • 38
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7 • 19

Daily paper that is inspiring (abstract is enough)

World Model on Million-Length Video And Language With RingAttention

Paper • 2402.08268 • Published Feb 13 • 36
Improving Text Embeddings with Large Language Models

Paper • 2401.00368 • Published Dec 31, 2023 • 79
Chain-of-Thought Reasoning Without Prompting

Paper • 2402.10200 • Published Feb 15 • 99
FiT: Flexible Vision Transformer for Diffusion Model

Paper • 2402.12376 • Published Feb 19 • 48

meta-llama/Llama-2-70b-chat-hf

Text Generation • Updated Apr 17 • 181k • 2.16k
Adapting LLaMA Decoder to Vision Transformer

Paper • 2404.06773 • Published Apr 10 • 17
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

Paper • 2406.08973 • Published Jun 13 • 85

© Hugging Face
