Active Learners as Efficient PRP Rerankers
Abstract
Pairwise ranking prompting is reformulated as active learning from noisy comparisons, with improved rankers that enhance ranking quality under call constraints and address position bias through a randomized oracle.
Pairwise Ranking Prompting (PRP) elicits pairwise preference judgments from an LLM, which are then aggregated into a ranking, usually via classical sorting algorithms. However, judgments are noisy, order-sensitive, and sometimes intransitive, so sorting assumptions do not match the setting. Because sorting aims to recover a full permutation, truncating it to meet a call budget does not produce a dependable top-K. We thus reframe PRP reranking as active learning from noisy pairwise comparisons and show that active rankers are drop-in replacements that improve NDCG@10 per call in the call-constrained regime. Our noise-robust framework also introduces a randomized-direction oracle that uses a single LLM call per pair. This approach converts systematic position bias into zero-mean noise, enabling unbiased aggregate ranking without the cost of bidirectional calls.
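The randomized-direction idea can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's implementation: `biased_judge` is a hypothetical stand-in for an LLM call, its Bradley-Terry-style preference probability and its additive `bias` toward the first-listed item are assumptions made for illustration, and every function name here is invented. Under that additive model, presenting a pair in a coin-flipped order with a single call cancels the first-position bonus in expectation, whereas a fixed order keeps it.

```python
import random


def biased_judge(first, second, score, bias, rng):
    """Hypothetical stand-in for one LLM pairwise call.

    Returns True if the first-listed item is preferred. The preference
    probability is Bradley-Terry-like, plus an assumed additive bonus
    (`bias`) for whichever item is listed first.
    """
    p_first = score[first] / (score[first] + score[second])
    p_first = min(1.0, max(0.0, p_first + bias))
    return rng.random() < p_first


def fixed_direction_oracle(a, b, score, bias, rng):
    """Always lists `a` first, so `a` always receives the position bonus."""
    return biased_judge(a, b, score, bias, rng)


def randomized_direction_oracle(a, b, score, bias, rng):
    """One call per pair, with the presentation order chosen by a fair coin.

    Returns True if `a` is preferred, whichever slot it was shown in.
    """
    if rng.random() < 0.5:
        return biased_judge(a, b, score, bias, rng)
    # `a` was listed second, so `a` wins when the first-listed item loses.
    return not biased_judge(b, a, score, bias, rng)


def win_rate(oracle, a, b, score, bias, rng, n_calls):
    """Fraction of `n_calls` independent judgments in which `a` beats `b`."""
    wins = sum(oracle(a, b, score, bias, rng) for _ in range(n_calls))
    return wins / n_calls


if __name__ == "__main__":
    rng = random.Random(0)
    score = {"a": 1.0, "b": 1.0}  # equally relevant: unbiased win rate is 0.5
    for oracle in (fixed_direction_oracle, randomized_direction_oracle):
        rate = win_rate(oracle, "a", "b", score, 0.15, rng, 20000)
        print(oracle.__name__, round(rate, 3))
```

With two equally relevant items, the fixed-order estimate sits near 0.5 plus the bias, while the randomized estimate sits near 0.5. The cancellation relies on the bias being additive and symmetric across the two presentation orders; a bias that depends on the documents themselves would not necessarily average out this way.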
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Learning from Emptiness: De-biasing Listwise Rerankers with Content-Agnostic Probability Calibration (2026)
- BracketRank: Large Language Model Document Ranking via Reasoning-based Competitive Elimination (2026)
- CAPS: Cascaded Adaptive Pairwise Selection for Efficient Parallel Reasoning (2026)
- CAR: Query-Guided Confidence-Aware Reranking for Retrieval-Augmented Generation (2026)
- OpenDeepThink: Parallel Reasoning via Bradley-Terry Aggregation (2026)
- Stop Overthinking: Unlocking Efficient Listwise Reranking with Minimal Reasoning (2026)
- ResRank: Unifying Retrieval and Listwise Reranking via End-to-End Joint Training with Residual Passage Compression (2026)
the randomized-direction oracle is a surprisingly clean trick. by flipping the comparison direction with a single llm call per pair, they turn systematic position bias into zero-mean noise, which lets you get unbiased aggregate rankings without paying for bidirectional judgments. that lines up nicely with the active-learning loop since the noise model stays predictable enough to guide informative pair selection. one edge case to stress test would be asymmetric bias or heavy domain-specific noise—would the zero-mean assumption still hold there? the arxivlens breakdown helped me parse the method details and does a nice job unpacking this, e.g. https://arxivlens.com/PaperView/Details/active-learners-as-efficient-prp-rerankers-3611-8adaa181
Get this paper in your agent:
hf papers read 2605.14236
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash