Fix href
app/src/index.html  +1 -1
@@ -61,7 +61,7 @@
 </d-contents>

 <!-- INTRODUCTION -->
-<p>Over the last few years, the scaling of <em><strong>train-time compute</strong></em> has dominated the progress of large language models (LLMs).<d-footnote>Here, train-time compute refers to increasing model size, dataset size, and compute budgets in line with <a href="https://huggingface.co/papers/2001.08361">scaling laws.</d-footnote>Although this paradigm has proven to be remarkably effective, the resources needed to pretrain ever larger models are becoming prohibitively expensive, with <a href="https://youtu.be/WXhikNA5PIc?feature=shared">billion-dollar clusters</a> already on the horizon.<d-footnote>Aside from compute resources, Ilya Sutskever has made the <a href="https://www.youtube.com/watch?feature=shared&t=475&v=1yvBqasHLZs">provocative analogy</a> that pretraining data is the “fossil fuel of AI” and that pretraining as we know it will end once this resource is exhausted in the near future.</d-footnote> This trend has sparked significant interest in a complementary approach: <em><strong>test-time compute scaling</strong></em>. Rather than relying on ever-larger pretraining budgets, test-time methods use dynamic inference strategies that allow models to “think longer” on harder problems. A prominent example is <a href="https://openai.com/index/learning-to-reason-with-llms/">OpenAI’s o1 model</a>, which shows consistent improvement on difficult math problems as one increases the amount of test-time compute:</p>
+<p>Over the last few years, the scaling of <em><strong>train-time compute</strong></em> has dominated the progress of large language models (LLMs).<d-footnote>Here, train-time compute refers to increasing model size, dataset size, and compute budgets in line with <a href="https://huggingface.co/papers/2001.08361">scaling laws.</a></d-footnote>Although this paradigm has proven to be remarkably effective, the resources needed to pretrain ever larger models are becoming prohibitively expensive, with <a href="https://youtu.be/WXhikNA5PIc?feature=shared">billion-dollar clusters</a> already on the horizon.<d-footnote>Aside from compute resources, Ilya Sutskever has made the <a href="https://www.youtube.com/watch?feature=shared&t=475&v=1yvBqasHLZs">provocative analogy</a> that pretraining data is the “fossil fuel of AI” and that pretraining as we know it will end once this resource is exhausted in the near future.</d-footnote> This trend has sparked significant interest in a complementary approach: <em><strong>test-time compute scaling</strong></em>. Rather than relying on ever-larger pretraining budgets, test-time methods use dynamic inference strategies that allow models to “think longer” on harder problems. A prominent example is <a href="https://openai.com/index/learning-to-reason-with-llms/">OpenAI’s o1 model</a>, which shows consistent improvement on difficult math problems as one increases the amount of test-time compute:</p>

 <figure id="1581384e-bcac-805f-8c2b-dff4509f45cb" class="image"><a href="https://huggingface.co/datasets/HuggingFaceH4/blogpost-images/resolve/main/compute.png.webp"><img style="width:672px" src="https://huggingface.co/datasets/HuggingFaceH4/blogpost-images/resolve/main/compute.png.webp"/></a></figure>
