KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models Paper • 2412.06071 • Published 10 days ago
Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA Paper • 2410.20672 • Published Oct 28
QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference Paper • 2402.10076 • Published Feb 15