Scaling test-time compute for open models: How we implemented DeepMind’s compute-optimal recipe to solve hard math problems like OpenAI’s o1

Strategies for test-time compute scaling

There are two main strategies for scaling test-time compute: self-refinement, where the model iteratively refines its own output, and search against a verifier, where many candidate solutions are generated and a verifier selects the best one.

In this blog post, we’ll concentrate on search-based methods as they represent a practical and scalable solution for test-time compute optimization. In particular, we’ll examine the three strategies illustrated below:

With an understanding of the key search strategies, let’s move on to how we evaluated them in practice.


Experimental setup

As illustrated in the diagram above, our experimental setup involves a pipeline with the following steps:

  1. We begin by feeding a math problem to an LLM, which generates N partial solutions, e.g. an intermediate step in a derivation.
  2. Each step is scored by a PRM, which estimates the probability that the step will eventually lead to the correct final answer.
  3. The steps and PRM scores are then used by a given search strategy to select which partial solutions should be explored further to generate the next round of intermediate steps.
  4. Once the search strategy terminates, the final candidate solutions are ranked by the PRM to produce the final answer.

To compare various search strategies, we used the following open models and datasets: