MoDE_CALVIN_ABC_3 / output (2).log
[2024-12-16 13:49:03,944][mode.evaluation.multistep_sequences][INFO] - Start generating evaluation sequences.
[2024-12-16 13:49:18,538][mode.evaluation.multistep_sequences][INFO] - Done generating evaluation sequences.
[2024-12-16 13:49:19,208][mode.models.mode_agent][INFO] - Precomputing experts with sampling steps 5
1/5 : 96.2% | 2/5 : 87.9% | 3/5 : 79.2% | 4/5 : 70.5% | 5/5 : 61.8% | Average: 4.0 |: 100%|██████████████████████████████| 1000/1000 [2:53:40<00:00, 10.42s/it]
Results for Epoch 0:
Average successful sequence length: 3.956
Success rates for i instructions in a row:
1: 96.2%
2: 87.9%
3: 79.2%
4: 70.5%
5: 61.8%
rotate_blue_block_right: 71 / 74 | SR: 95.9%
move_slider_right: 276 / 276 | SR: 100.0%
lift_red_block_slider: 113 / 129 | SR: 87.6%
place_in_slider: 266 / 348 | SR: 76.4%
turn_off_led: 164 / 167 | SR: 98.2%
push_into_drawer: 95 / 120 | SR: 79.2%
lift_blue_block_drawer: 19 / 19 | SR: 100.0%
close_drawer: 197 / 199 | SR: 99.0%
lift_pink_block_slider: 117 / 129 | SR: 90.7%
open_drawer: 337 / 340 | SR: 99.1%
rotate_red_block_right: 69 / 72 | SR: 95.8%
lift_red_block_table: 170 / 174 | SR: 97.7%
lift_pink_block_table: 155 / 168 | SR: 92.3%
push_blue_block_left: 63 / 68 | SR: 92.6%
turn_off_lightbulb: 130 / 136 | SR: 95.6%
turn_on_led: 169 / 171 | SR: 98.8%
stack_block: 125 / 188 | SR: 66.5%
push_pink_block_right: 33 / 62 | SR: 53.2%
push_red_block_left: 68 / 77 | SR: 88.3%
lift_blue_block_table: 169 / 173 | SR: 97.7%
turn_on_lightbulb: 160 / 173 | SR: 92.5%
rotate_blue_block_left: 64 / 65 | SR: 98.5%
place_in_drawer: 167 / 167 | SR: 100.0%
move_slider_left: 238 / 239 | SR: 99.6%
rotate_red_block_left: 61 / 62 | SR: 98.4%
push_pink_block_left: 69 / 74 | SR: 93.2%
lift_blue_block_slider: 113 / 125 | SR: 90.4%
push_red_block_right: 42 / 70 | SR: 60.0%
lift_pink_block_drawer: 13 / 15 | SR: 86.7%
rotate_pink_block_right: 63 / 65 | SR: 96.9%
unstack_block: 52 / 54 | SR: 96.3%
push_blue_block_right: 39 / 68 | SR: 57.4%
rotate_pink_block_left: 53 / 54 | SR: 98.1%
lift_red_block_drawer: 16 / 17 | SR: 94.1%
Best model: epoch 0 with average sequences length of 3.956