ScalerLab

community

ScalerLab's activity

kylemontgomery

updated a Space 3 months ago

JudgeBench Leaderboard

sijuntan

authored a paper 3 months ago

JudgeBench: A Benchmark for Evaluating LLM-based Judges

Paper • 2410.12784 • Published Oct 16, 2024 • 44

kylemontgomery

authored 2 papers 4 months ago

Re-Tuning: Overcoming the Compositionality Limits of Large Language Models with Recursive Tuning

Paper • 2407.04787 • Published Jul 5, 2024

JudgeBench: A Benchmark for Evaluating LLM-based Judges

Paper • 2410.12784 • Published Oct 16, 2024 • 44

kylemontgomery

updated a dataset 4 months ago

ScalerLab/JudgeBench

Viewer • Updated Oct 9, 2024 • 620 • 249 • 4

sijuntan

authored a paper 10 months ago

LLoCO: Learning Long Contexts Offline

Paper • 2404.07979 • Published Apr 11, 2024 • 21

kylemontgomery

authored a paper over 1 year ago

Agent Instructs Large Language Models to be General Zero-Shot Reasoners

Paper • 2310.03710 • Published Oct 5, 2023 • 2