Collections including paper arxiv:2402.17764

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27 • 602
Mixtral of Experts

Paper • 2401.04088 • Published Jan 8 • 159
Mistral 7B

Paper • 2310.06825 • Published Oct 10, 2023 • 47
Don't Make Your LLM an Evaluation Benchmark Cheater

Paper • 2311.01964 • Published Nov 3, 2023 • 1

Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 82
Small-scale proxies for large-scale Transformer training instabilities

Paper • 2309.14322 • Published Sep 25, 2023 • 19
Evaluating Cognitive Maps and Planning in Large Language Models with CogEval

Paper • 2309.15129 • Published Sep 25, 2023 • 6
Vision Transformers Need Registers

Paper • 2309.16588 • Published Sep 28, 2023 • 77

CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 82
Contrastive Decoding Improves Reasoning in Large Language Models

Paper • 2309.09117 • Published Sep 17, 2023 • 37
FreeU: Free Lunch in Diffusion U-Net

Paper • 2309.11497 • Published Sep 20, 2023 • 64
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models

Paper • 2309.11674 • Published Sep 20, 2023 • 31

MADLAD-400: A Multilingual And Document-Level Large Audited Dataset

Paper • 2309.04662 • Published Sep 9, 2023 • 22
Neurons in Large Language Models: Dead, N-gram, Positional

Paper • 2309.04827 • Published Sep 9, 2023 • 16
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Paper • 2309.05516 • Published Sep 11, 2023 • 9
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs

Paper • 2309.03907 • Published May 18, 2023 • 8


© Hugging Face
