Jiannan Huang

Rbrq

http://rbrq03.github.io

AI & ML interests

None yet

Recent Activity

liked a model 8 days ago

black-forest-labs/FLUX.1-dev

liked a Space 2 months ago

amirgame197/Image-to-Drawing

liked a model 4 months ago

Qwen/Qwen2-VL-2B-Instruct

Organizations

None yet

Rbrq's activity

liked a model 8 days ago

black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Aug 16, 2024 • 1.28M • 8.08k

liked a Space 2 months ago

Image to Drawing

liked a model 4 months ago

Qwen/Qwen2-VL-2B-Instruct

Image-Text-to-Text • Updated 6 days ago • 1.91M • 373

liked a Space 4 months ago

Open VLM Leaderboard

VLMEvalKit Evaluation Results Collection

liked a model 4 months ago

allenai/Molmo-7B-D-0924

Image-Text-to-Text • Updated Oct 10, 2024 • 585k • 496

liked a dataset 5 months ago

THUDM/ImageRewardDB

Updated Jun 21, 2023 • 594 • 37

upvoted a paper 5 months ago

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Paper • 2408.15998 • Published Aug 28, 2024 • 85

upvoted an article 5 months ago

MobileNet Baselines

Article • Jul 26, 2024 • 23

liked a model 5 months ago

CiaraRowles/IP-Adapter-Instruct

Image-to-Image • Updated Aug 13, 2024 • 96 • 49

updated a model 6 months ago

Rbrq/detr-finetuned-cppe-5-10k-steps

Updated Jul 14, 2024

reacted to merve's post with 🔥 6 months ago

Post

5085

Real-time DEtection Transformer (RT-DETR) landed in transformers 🤩 with Apache 2.0 license 😍

🔖 models: https://huggingface.co/PekingU
🔖 demo: merve/RT-DETR-tracking-coco
📝 paper: DETRs Beat YOLOs on Real-time Object Detection (2304.08069)
📖 notebook: https://github.com/merveenoyan/example_notebooks/blob/main/RT_DETR_Notebook.ipynb

YOLO models are known to be super fast for real-time computer vision, but their downside is that results are sensitive to NMS (non-maximum suppression) 🥲

Transformer-based models, on the other hand, are not as computationally efficient 🥲

Isn't there something in between? Enter RT-DETR!

The authors combined a CNN backbone and a multi-stage hybrid encoder (combining convs and attn) with a transformer decoder. In the paper, the authors also claim that one can adjust inference speed by changing the number of decoder layers, without retraining.
The authors find that the model outperforms the previous state-of-the-art in both speed and accuracy. 🤩
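
To make the NMS point in the post concrete, here is a minimal sketch of greedy non-maximum suppression in plain Python. It is not taken from the RT-DETR paper or from any YOLO codebase; the function names `iou` and `nms`, the box format, and the sample boxes and scores are all assumptions chosen for illustration. It only shows how the set of kept detections can change with the IoU threshold.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if
    its IoU with every already-kept box does not exceed the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Illustrative data (made up): two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]

print(nms(boxes, scores, iou_threshold=0.5))
print(nms(boxes, scores, iou_threshold=0.7))
```

The first two boxes overlap with an IoU of 81/119 (about 0.68), so a threshold of 0.5 suppresses the lower-scoring one while a threshold of 0.7 keeps both. This is the kind of threshold dependence that an end-to-end detector without an NMS step avoids.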

updated a model 6 months ago

Rbrq/detr_finetuned_cppe5

Object Detection • Updated Jul 8, 2024 • 21

upvoted 2 papers 7 months ago

InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation

Paper • 2404.19427 • Published Apr 30, 2024 • 72

MUMU: Bootstrapping Multimodal Image Generation from Text-to-Image Data

Paper • 2406.18790 • Published Jun 26, 2024 • 34

authored a paper 7 months ago

ClassDiffusion: More Aligned Personalization Tuning with Explicit Class Guidance

Paper • 2405.17532 • Published May 27, 2024

liked a dataset 9 months ago

UCSC-VLAA/HQ-Edit

Viewer • Updated Apr 17, 2024 • 1.64k • 1.72k • 23

liked a model about 1 year ago

openai/consistency-decoder

Updated Nov 9, 2023 • 155 • 48

liked a Space about 1 year ago

Detic+ChatGPT

updated a model about 1 year ago

Rbrq/car_1018

Updated Oct 19, 2023

updated a model over 1 year ago

Rbrq/dreambooth_dog

Text-to-Image • Updated Sep 23, 2023 • 4