EntRGi: Entropy Aware Reward Guidance for Diffusion Language Models Paper • 2602.05000 • Published 25 days ago • 1
Post: Are you familiar with reverse residual connections or looping in language models? Excited to share my Looped-GPT blog post and codebase: https://github.com/sanyalsunny111/Looped-GPT
TL;DR: looping during pre-training improves generalization. Plot shows GPT2 LMs pre-trained with 15.73B OWT tokens.
P.S. This is my first post here; I have ~4 followers and zero expectations for reach.
Mitigating Intra- and Inter-modal Forgetting in Continual Learning of Unified Multimodal Models Paper • 2512.03125 • Published Dec 2, 2025 • 2
UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs Paper • 2512.03383 • Published Dec 3, 2025 • 5
ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction? Paper • 2411.06469 • Published Nov 10, 2024 • 17
Concentration of Measure for Distributions Generated via Diffusion Models Paper • 2501.07741 • Published Jan 13, 2025
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation Paper • 2501.05414 • Published Jan 9, 2025 • 2
Rhapsody: A Dataset for Highlight Detection in Podcasts Paper • 2505.19429 • Published May 26, 2025 • 1
ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models Paper • 2505.13444 • Published May 19, 2025 • 17
Quamba2: A Robust and Scalable Post-training Quantization Framework for Selective State Space Models Paper • 2503.22879 • Published Mar 28, 2025 • 9
Quamba: A Post-Training Quantization Recipe for Selective State Space Models Paper • 2410.13229 • Published Oct 17, 2024 • 1
Efficient Low-rank Backpropagation for Vision Transformer Adaptation Paper • 2309.15275 • Published Sep 26, 2023 • 1
MobileTL: On-device Transfer Learning with Inverted Residual Blocks Paper • 2212.03246 • Published Dec 5, 2022 • 1
Scaling Rich Style-Prompted Text-to-Speech Datasets Paper • 2503.04713 • Published Mar 6, 2025 • 1
Automating Human Tutor-Style Programming Feedback: Leveraging GPT-4 Tutor Model for Hint Generation and GPT-3.5 Student Model for Hint Validation Paper • 2310.03780 • Published Oct 5, 2023
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models Paper • 2502.06608 • Published Feb 10, 2025 • 39