AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling Paper • 2402.12226 • Published Feb 19 • 41
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients Paper • 2407.08296 • Published Jul 11 • 31
Contrastive Preference Learning: Learning from Human Feedback without RL Paper • 2310.13639 • Published Oct 20, 2023 • 24
Preference Datasets for DPO Collection This collection contains a list of curated preference datasets for DPO fine-tuning for intent alignment of LLMs • 7 items • Updated Jul 30 • 30
Korean-Adapted Model Series Collection Korean-adapted Language Model Series • 13 items • Updated May 17 • 24
MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection Paper • 2403.19888 • Published Mar 29 • 10
YaRN: Efficient Context Window Extension of Large Language Models Paper • 2309.00071 • Published Aug 31, 2023 • 65
The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI Paper • 2310.16787 • Published Oct 25, 2023 • 5
Sora Reference Papers Collection A collection of all papers referenced in OpenAI's "Video generation models as world simulators" technical report • openai.com/sora • 30 items • Updated Oct 3 • 52
Zephyr 7B Collection Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Apr 12 • 145
Hardwiring ViT Patch Selectivity into CNNs using Patch Mixing Paper • 2306.17848 • Published Jun 30, 2023 • 8
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 28
Platypus: Quick, Cheap, and Powerful Refinement of LLMs Paper • 2308.07317 • Published Aug 14, 2023 • 23