# Octo small
This model was trained with a history window of 2 timesteps. It predicts 7-dimensional actions 4 steps into the future using a diffusion policy.
Observations and tasks conform to the following spec:
Observations:
```
{
    image_primary: ('batch', 'history_window', 256, 256, 3),
    image_wrist: ('batch', 'history_window', 128, 128, 3),
}
```
Tasks:
```
{
    image_primary: ('batch', 256, 256, 3),
    image_wrist: ('batch', 128, 128, 3),
    language_instruction: {
        attention_mask: ('batch', 16),
        input_ids: ('batch', 16),
    },
}
```
At inference, you may pass in any subset of these observation and task keys, with a history window up to 2 timesteps.
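As a rough illustration of the spec above, the following sketch checks a dictionary of array shapes against the expected shapes before calling the model. It is not part of the model's own API: `check_shapes`, `MAX_HISTORY_WINDOW`, and the flattened `language_instruction/...` key names are hypothetical names chosen for this example, and only the shape tuples themselves come from the spec.

```python
# Shapes copied from the spec above. "batch" and "history_window" are symbolic
# dimensions; every other entry is a fixed size.
OBSERVATION_SPEC = {
    "image_primary": ("batch", "history_window", 256, 256, 3),
    "image_wrist": ("batch", "history_window", 128, 128, 3),
}
TASK_SPEC = {
    "image_primary": ("batch", 256, 256, 3),
    "image_wrist": ("batch", 128, 128, 3),
    # Nested keys are flattened with "/" purely for this illustration.
    "language_instruction/attention_mask": ("batch", 16),
    "language_instruction/input_ids": ("batch", 16),
}
MAX_HISTORY_WINDOW = 2  # the model was trained with a window size of 2


def check_shapes(spec, shapes):
    """Return a list of problems; an empty list means `shapes` fits `spec`.

    `shapes` maps key -> shape tuple and may contain any subset of the keys.
    """
    problems = []
    for key, shape in shapes.items():
        if key not in spec:
            problems.append(f"unknown key: {key}")
            continue
        expected = spec[key]
        if len(shape) != len(expected):
            problems.append(
                f"{key}: expected {len(expected)} dims, got {len(shape)}"
            )
            continue
        for dim, want in zip(shape, expected):
            if want == "batch":
                continue  # any batch size is accepted
            if want == "history_window":
                if not 1 <= dim <= MAX_HISTORY_WINDOW:
                    problems.append(
                        f"{key}: history window {dim} exceeds {MAX_HISTORY_WINDOW}"
                    )
            elif dim != want:
                problems.append(f"{key}: expected size {want}, got {dim}")
    return problems


# Passing only the primary camera with a single timestep of history is allowed.
print(check_shapes(OBSERVATION_SPEC, {"image_primary": (1, 1, 256, 256, 3)}))
```

A check like this is cheap to run on the shapes of real arrays (for example `array.shape`) and catches a wrong image resolution or an overlong history window before inference.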
This model was trained on a mix of datasets from the Open X-Embodiment dataset:
| Dataset | Proportion of batch |
|------------------------------------------------------------|---------------------|
| Fractal (Brohan et al, 2022) | 17.0% |
| Kuka (Kalashnikov et al, 2018) | 17.0% |
| Bridge (Walke et al, 2023) | 17.0% |
| BC-Z (Jang et al, 2022) | 9.1% |
| Stanford Hydra Dataset (Belkhale et al, 2023) | 6.0% |
| Language Table (Lynch et al, 2023) | 5.9% |
| Taco Play (Rosete-Beas et al, 2022, Mees et al., 2023) | 3.6% |
| Furniture Bench Dataset (Heo et al, 2023) | 3.3% |
| UTAustin Mutex (Shah et al, 2023) | 3.0% |
| Austin Sailor Dataset (Nasiriany et al, 2022) | 2.9% |
| Roboturk (Mandlekar et al, 2018) | 2.8% |
| Toto (Zhou et al, 2023) | 2.4% |
| Austin Sirius Dataset (Liu et al, 2023) | 2.3% |
| Berkeley Autolab UR5 (Chen et al) | 1.5% |
| IAMLab CMU Pickup Insert (Saxena et al, 2023) | 1.2% |
| Viola (Zhu et al, 2023) | 1.2% |
| Berkeley Fanuc Manipulation (Zhu et al, 2023) | 1.0% |
| NYU Franka Play Dataset (Cui et al, 2022) | 0.9% |
| UCSD Kitchen Dataset (Ge Yan and Wang, 2023) | <0.1% |
| Jaco Play (Dass et al, 2023) | 0.6% |
| Berkeley Cable Routing (Luo et al, 2023) | 0.3% |
| Austin Buds Dataset (Zhu et al, 2022) | 0.3% |
| CMU Stretch (Mendonca et al, 2023) | 0.2% |
| NYU Door Opening (Pari et al, 2021) | 0.1% |
| DLR EDAN Shared Control (Quere et al, 2020) | 0.1% |
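One way to read the "Proportion of batch" column is as sampling weights: each training example in a batch is drawn from a dataset with probability proportional to its listed share. The sketch below illustrates that reading with the first five rows of the table only. It is an assumption-laden illustration, not the data loader actually used for training; `MIX`, `normalize`, and `sample_batch_sources` are hypothetical names.

```python
import random

# Proportions (in percent) copied from the first five rows of the table.
# The remaining rows are omitted here for brevity, so the weights are
# renormalized below.
MIX = {
    "Fractal": 17.0,
    "Kuka": 17.0,
    "Bridge": 17.0,
    "BC-Z": 9.1,
    "Stanford Hydra Dataset": 6.0,
}


def normalize(weights):
    """Rescale weights so that they sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def sample_batch_sources(weights, batch_size, seed=0):
    """Pick the source dataset for each example in one batch."""
    rng = random.Random(seed)  # seeded for reproducibility
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=batch_size)


probs = normalize(MIX)
sources = sample_batch_sources(probs, batch_size=256)
# Empirical share of each dataset in this one batch.
for name in MIX:
    print(f"{name}: {sources.count(name) / len(sources):.3f}")
```

With a finite batch the empirical shares only approximate the listed proportions; they converge as more batches are drawn.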