Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
- Website
- Community
- Solutions
Log In
Sign Up

Danil's picture

4 1

Danil

Potatochka

imumtozee's profile picture

artfawl's profile picture

·

AI & ML interests

None yet

Organizations

None yet

Collections 2

LongVILA: Scaling Long-Context Visual Language Models for Long Videos

Paper • 2408.10188 • Published Aug 19, 2024 • 52
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Paper • 2408.08872 • Published Aug 16, 2024 • 101
Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 134
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Paper • 2408.12528 • Published Aug 22, 2024 • 51

Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published Sep 5, 2024 • 92
Transformer Explainer: Interactive Learning of Text-Generative Models

Paper • 2408.04619 • Published Aug 8, 2024 • 175
Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1, 2024 • 151
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining

Paper • 2305.10429 • Published May 17, 2023 • 5

LongVILA: Scaling Long-Context Visual Language Models for Long Videos

Paper • 2408.10188 • Published Aug 19, 2024 • 52
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models

Paper • 2408.08872 • Published Aug 16, 2024 • 101
Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 134
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Paper • 2408.12528 • Published Aug 22, 2024 • 51

Attention Heads of Large Language Models: A Survey

Paper • 2409.03752 • Published Sep 5, 2024 • 92
Transformer Explainer: Interactive Learning of Text-Generative Models

Paper • 2408.04619 • Published Aug 8, 2024 • 175
Addition is All You Need for Energy-efficient Language Models

Paper • 2410.00907 • Published Oct 1, 2024 • 151
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining

Paper • 2305.10429 • Published May 17, 2023 • 5

models 0

None public yet

datasets 0

None public yet

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs