Shortened LLaMA: A Simple Depth Pruning for Large Language Models • arXiv:2402.02834 • Published Feb 5, 2024
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs • arXiv:2402.04291 • Published Feb 6, 2024
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect • arXiv:2403.03853 • Published Mar 6, 2024