Update README.md
README.md
CHANGED
@@ -11,7 +11,7 @@ datasets:
 ---
 
 > [!NOTE]
-> For full information, go check out the Dr Tulu paper [here](
+> For full information, go check out the Dr Tulu paper [here](http://allenai-web/papers/drtulu).
 
 <img src="https://huggingface.co/rl-research/DR-Tulu-SFT-8B/resolve/main/dr_tulu_logo.png" alt="Figure 1" width="500"/>
 
@@ -50,7 +50,7 @@ We provide evaluation instructions in the [dr-agent-lib github](TODO).
 | [DR-Tulu-SFT-8B](https://huggingface.co/rl-research/DR-Tulu-SFT-8B) (**this model**) | 72.3 | 38.1 | 68.5 | 39.0 | 75.5 | 66.5 | 31.9 | 56.0 |
 | [DR-Tulu-8B](https://huggingface.co/rl-research/DR-Tulu-8B) | **86.7** | **43.7** | **71.1** | **41.8** | **80.1** | **68.0** | **39.1** | **61.5** |
 
-For more baselines, explanations of this table, and analysis of
+For more baselines, explanations of this table, and analysis of results, check out the [Dr Tulu paper](https://arxiv.org/abs/TODO)!
 
 # Intended uses & limitations
 
@@ -86,7 +86,7 @@ For futher details, check out the [Dr Tulu paper](https://arxiv.org/abs/TODO).
 ```
 @article{drtulu,
 title = {{DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research}},
-author = {{Rulin Shao, Akari Asai, Shannon Shen, Hamish Ivison, Varsha Kishore, Jingming Zhuo, Xinran Zhao, Molly Park, David Sontag, Tyler Murray,
+author = {{Rulin Shao, Akari Asai, Shannon Shen, Hamish Ivison, Varsha Kishore, Jingming Zhuo, Xinran Zhao, Molly Park, Sam Finlayson, David Sontag, Tyler Murray, Sewon Min, Pradeep Dasigi, Luca Soldani, Faeze Brahman, Scott Yih, Sherry Tongshuang Wu, Luke Zettlemoyer, Yoon Kim, Hanna Hajishirzi, Pang Wei Koh}},
 journal={arXiv preprint TODO}
 year = {2025},
 }