Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN Paper • 2205.13943 • Published May 27, 2022 • 1
Cascade-DETR: Delving into High-Quality Universal Object Detection Paper • 2307.11035 • Published Jul 20, 2023
Behavior Contrastive Learning for Unsupervised Skill Discovery Paper • 2305.04477 • Published May 8, 2023
Rethinking Memory and Communication Cost for Efficient Large Language Model Training Paper • 2310.06003 • Published Oct 9, 2023 • 2
SemiReward: A General Reward Model for Semi-supervised Learning Paper • 2310.03013 • Published Oct 4, 2023 • 1
LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory Paper • 2404.11163 • Published Apr 17, 2024
Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts Paper • 2405.19893 • Published May 30, 2024 • 29
Short-Long Convolutions Help Hardware-Efficient Linear Attention to Focus on Long Sequences Paper • 2406.08128 • Published Jun 12, 2024
Peer Review as A Multi-Turn and Long-Context Dialogue with Role-Based Interactions Paper • 2406.05688 • Published Jun 9, 2024