Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies Paper • 2407.13623 • Published Jul 18 • 52
Learning to Refuse: Towards Mitigating Privacy Risks in LLMs Paper • 2407.10058 • Published Jul 14 • 29
NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? Paper • 2407.11963 • Published Jul 16 • 43
Article Experimenting with Automatic PII Detection on the Hub using Presidio Jul 10 • 24
Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published Jun 28 • 94
Harnessing large-language models to generate private synthetic text Paper • 2306.01684 • Published Jun 2, 2023 • 3