Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 4 days ago • 290
Article How I train a LoRA: m3lt style training overview By alvdansen • Jul 1, 2024 • 49
Lost in the Middle: How Language Models Use Long Contexts Paper • 2307.03172 • Published Jul 6, 2023 • 38
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence Paper • 2401.14196 • Published Jan 25, 2024 • 54
Efficient RLHF: Reducing the Memory Usage of PPO Paper • 2309.00754 • Published Sep 1, 2023 • 14