llm - a Dushwe Collection

llm

updated Feb 21

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning

Paper • 2310.09478 • Published Oct 14, 2023 • 19
Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams

Paper • 2310.08678 • Published Oct 12, 2023 • 12
Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 242
LLaMA: Open and Efficient Foundation Language Models

Paper • 2302.13971 • Published Feb 27, 2023 • 13
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

Paper • 2205.14135 • Published May 27, 2022 • 11
Baichuan 2: Open Large-scale Language Models

Paper • 2309.10305 • Published Sep 19, 2023 • 19
Qwen Technical Report

Paper • 2309.16609 • Published Sep 28, 2023 • 34
Code Llama: Open Foundation Models for Code

Paper • 2308.12950 • Published Aug 24, 2023 • 22
Tuna: Instruction Tuning using Feedback from Large Language Models

Paper • 2310.13385 • Published Oct 20, 2023 • 10
Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca

Paper • 2309.08958 • Published Sep 16, 2023 • 2
Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning

Paper • 2310.11716 • Published Oct 18, 2023 • 5
AlpaGasus: Training A Better Alpaca with Fewer Data

Paper • 2307.08701 • Published Jul 17, 2023 • 22
XGen-7B Technical Report

Paper • 2309.03450 • Published Sep 7, 2023 • 8
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing

Paper • 2311.00571 • Published Nov 1, 2023 • 40
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Paper • 2311.05437 • Published Nov 9, 2023 • 47
u-LLaVA: Unifying Multi-Modal Tasks via Large Language Model

Paper • 2311.05348 • Published Nov 9, 2023 • 11
To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning

Paper • 2311.07574 • Published Nov 13, 2023 • 14
Trusted Source Alignment in Large Language Models

Paper • 2311.06697 • Published Nov 12, 2023 • 10
Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases

Paper • 2312.15011 • Published Dec 22, 2023 • 15
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling

Paper • 2312.15166 • Published Dec 23, 2023 • 56
GPT-4V(ision) is a Generalist Web Agent, if Grounded

Paper • 2401.01614 • Published Jan 3, 2024 • 21
Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action

Paper • 2312.17172 • Published Dec 28, 2023 • 26
Mixtral of Experts

Paper • 2401.04088 • Published Jan 8, 2024 • 159
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts

Paper • 2401.04081 • Published Jan 8, 2024 • 71
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents

Paper • 2401.10935 • Published Jan 17, 2024 • 4
Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception

Paper • 2401.16158 • Published Jan 29, 2024 • 18
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models

Paper • 2401.15947 • Published Jan 29, 2024 • 49
How to Train Data-Efficient LLMs

Paper • 2402.09668 • Published Feb 15, 2024 • 40
Video ReCap: Recursive Captioning of Hour-Long Videos

Paper • 2402.13250 • Published Feb 20, 2024 • 25