Vasu Sharma

vasusharma55

AI & ML interests

None yet

Organizations

None yet

vasusharma55's activity

authored 8 papers about 14 hours ago

DINOv2: Learning Robust Visual Features without Supervision

Paper • 2304.07193 • Published Apr 14, 2023 • 5

Demystifying CLIP Data

Paper • 2309.16671 • Published Sep 28, 2023 • 20

E-ViLM: Efficient Video-Language Model via Masked Video Modeling with Semantic Vector-Quantized Tokenizer

Paper • 2311.17267 • Published Nov 28, 2023 • 1

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

Paper • 2312.08578 • Published Dec 14, 2023 • 16

Alexa, play with robot: Introducing the First Alexa Prize SimBot Challenge on Embodied AI

Paper • 2308.05221 • Published Aug 9, 2023 • 9

Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

Paper • 2309.02591 • Published Sep 5, 2023 • 14

Text Quality-Based Pruning for Efficient Training of Language Models

Paper • 2405.01582 • Published Apr 26, 2024

The Brittleness of AI-Generated Image Watermarking Techniques: Examining Their Robustness Against Visual Paraphrasing Attacks

Paper • 2408.10446 • Published Aug 19, 2024 • 6

authored a paper 1 day ago

DPO Kernels: A Semantically-Aware, Kernel-Enhanced, and Divergence-Rich Paradigm for Direct Preference Optimization

Paper • 2501.03271 • Published 6 days ago • 7

authored a paper 8 months ago

An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27, 2024 • 87

authored a paper 10 months ago

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM

Paper • 2403.07816 • Published Mar 12, 2024 • 39

authored a paper about 1 year ago

FLAP: Fast Language-Audio Pre-training

Paper • 2311.01615 • Published Nov 2, 2023 • 16