Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models Paper • 2311.00871 • Published Nov 1, 2023 • 2
Data Distributional Properties Drive Emergent In-Context Learning in Transformers Paper • 2205.05055 • Published Apr 22, 2022 • 2
WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents Paper • 2404.05902 • Published Apr 8, 2024 • 20
LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency Paper • 2404.12872 • Published Apr 19, 2024 • 11
What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Paper • 2404.07129 • Published Apr 10, 2024 • 3
pyvene: A Library for Understanding and Improving PyTorch Models via Interventions Paper • 2403.07809 • Published Mar 12, 2024 • 1
Emergence of Hidden Capabilities: Exploring Learning Dynamics in Concept Space Paper • 2406.19370 • Published Jun 27, 2024 • 1
In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering Paper • 2311.06668 • Published Nov 11, 2023 • 5