pi0.5 RLT Build Block Tower 6-Mix: Joints-Only

RLT (RL Token) encoder-decoder trained on top of the joints-only block-tower baseline checkpoint (joints_only/49999), using the same 6-dataset mix with loss restricted to the first 7 joint dimensions.

Experiment

  • Objective: Train RLT encoder-decoder with joints-only action supervision on the joints-only baseline.
  • Weight init: pravsels/build_block_tower_baseline_6mix_joints_only checkpoint 49999 (joints-only baseline).
  • Total steps: 50,000 (completed)
  • Best val loss: 191.3 (step 45,000); this is the published checkpoint
  • Final train loss: 104.8 (step 49,900)

Config

  • Config name: pi05_rlt_build_block_tower_6mix_joints_only
  • Model: Pi0RLConfig (pi0.5, action_horizon=50, rl_vla_loss_weight=0.0)
  • VLA backbone: frozen (only the RLT encoder-decoder is trained)
  • Batch size: 36
  • Learning rate: 5e-5 with cosine decay (1k warmup steps, 10k decay steps)
  • Optimizer: AdamW (gradient clip norm 1.0)
  • EMA decay: 0.999
  • Delta actions: enabled
  • Episode split: 90/10 train/val (seed=42)
  • Action space: 17D canonical (first 7 joint dims active, remaining 10 EEF dims masked)
  • joints_only: True
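
The joints-only setting can be pictured as a mask over the 17D canonical action vector. The snippet below is a minimal sketch, not the training code: it assumes a plain squared-error loss and actions shaped (batch, horizon, action_dim), and the helper names `joints_only_mask` and `masked_mse` are hypothetical. The real objective in openpi may differ; only the masking idea carries over.

```python
import numpy as np

ACTION_DIM = 17      # canonical action width, from the Config list
NUM_JOINT_DIMS = 7   # leading joint dimensions that stay active


def joints_only_mask(action_dim: int = ACTION_DIM,
                     num_joint_dims: int = NUM_JOINT_DIMS) -> np.ndarray:
    """Return a 0/1 mask: 1 for the first joint dims, 0 for the EEF dims."""
    mask = np.zeros(action_dim, dtype=np.float32)
    mask[:num_joint_dims] = 1.0
    return mask


def masked_mse(pred: np.ndarray, target: np.ndarray, mask: np.ndarray) -> float:
    """Squared error averaged over active dims only (hypothetical loss form).

    pred and target have shape (batch, horizon, action_dim); the mask is
    broadcast over the last axis, so masked dims contribute nothing.
    """
    sq_err = (pred - target) ** 2 * mask
    n_active = mask.sum() * np.prod(pred.shape[:-1])
    return float(sq_err.sum() / n_active)
```

With this form, errors in the 10 masked EEF dimensions contribute nothing to the loss value, which is the behaviour the joints_only flag describes.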

Dataset

6 HuggingFace datasets: villekuosmanen/build_block_tower plus dAgger_build_block_tower_1.0.0 through 1.4.0 (340 episodes total).
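
For a sense of scale, a 90/10 episode split over 340 episodes gives roughly 306 training and 34 validation episodes. The sketch below shows one way a seeded episode-level split could be done; `split_episodes` is a hypothetical helper, the shuffling method is an assumption, and it will not reproduce the actual indices. The split stored under assets/ is the authoritative one.

```python
import numpy as np


def split_episodes(num_episodes: int, val_fraction: float = 0.1, seed: int = 42):
    """Shuffle episode indices with a fixed seed and hold out a validation share."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_episodes)
    n_val = round(num_episodes * val_fraction)
    val_ids = np.sort(order[:n_val])
    train_ids = np.sort(order[n_val:])
    return train_ids, val_ids


train_ids, val_ids = split_episodes(340)
print(len(train_ids), len(val_ids))
```

Splitting by episode rather than by frame keeps all frames of one episode on the same side of the split.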

Checkpoint Hashes

Verify integrity with:

cd checkpoints/<step> && find params -type f | sort | xargs sha256sum | sha256sum

Step    Train Loss  Val Loss  SHA-256
45,000  108.7       191.3     75a3d6e1504ff4646f5276f02a42376a0c38db68d951c2da8c04eb212c6b63c6
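
Where a shell is not available, the same digest can be approximated in Python. This is a sketch under stated assumptions: it mirrors the pipeline above by hashing every file under params, writing one "<hash>  <path>" line per file in sorted order, and hashing that listing. It should match the shell result only when `sort` uses byte order (for example under LC_ALL=C) and file names contain no whitespace or characters that sha256sum escapes. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash one file in chunks so large weight shards are not loaded at once."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def checkpoint_digest(step_dir: str) -> str:
    """Digest of the sha256sum-style listing for <step_dir>/params."""
    root = Path(step_dir)
    # Paths are kept relative to the step directory, e.g. "params/...",
    # matching what `find params -type f` prints after `cd checkpoints/<step>`.
    rel_paths = sorted(
        p.relative_to(root).as_posix()
        for p in (root / "params").rglob("*")
        if p.is_file()
    )
    # sha256sum prints "<hex digest>  <path>" followed by a newline.
    listing = "".join(f"{sha256_file(root / rel)}  {rel}\n" for rel in rel_paths)
    return hashlib.sha256(listing.encode("utf-8")).hexdigest()
```

Calling `checkpoint_digest("checkpoints/45000")` and comparing the result with the SHA-256 column gives the same check as the shell command, subject to the caveats above.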

W&B

Repo Structure

assets/                        # Norm stats, per-timestep stats, episode split, valid indices
checkpoints/45000/params/      # Model weights (params only)
README.md                      # This file
TRAINING_LOG.md                # Training log

Usage

from openpi.training.config import get_config
from openpi.serving.policy_server import PolicyServer

config = get_config("pi05_rlt_build_block_tower_6mix_joints_only")
server = PolicyServer(config, checkpoint_path="checkpoints/45000/params")