CoRL2026-CSI/SmolVLA-UR7e-ArrangeBlock_50epoch
This is a LeRobot SmolVLA policy fine-tuned from lerobot/smolvla_base on CoRL2026-CSI/UR7e-CaP_arrange_block_100epi_10fps.
The model is intended for UR7e ArrangeBlock manipulation experiments using RGB observations, robot proprioception, language instructions, and continuous action chunks. It is uploaded as a LeRobot policy checkpoint and should be loaded through the matching LeRobot SmolVLA implementation used for training.
Model Details
- Policy type: SmolVLA
- Base policy: lerobot/smolvla_base
- Vision-language model: HuggingFaceTB/SmolVLM2-500M-Video-Instruct
- Action chunk size: 50
- Action steps: 50
- Max state/action dims: 32/32
- Vision encoder frozen: true
- Train expert only: true
- Train state projection: true
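The chunk size and action steps above can be read as follows, assuming the usual LeRobot meaning of the two settings: one forward pass predicts a chunk of 50 actions, and 50 of them are executed before the policy is queried again. The sketch below is illustrative only. `predict_chunk` and `dummy_predict_chunk` are hypothetical stand-ins for the real policy, not LeRobot functions.

```python
from collections import deque

CHUNK_SIZE = 50      # "Action chunk size" from the model details
N_ACTION_STEPS = 50  # "Action steps" from the model details
ACTION_DIM = 7       # matches the action shape listed under Outputs


def make_chunked_controller(predict_chunk):
    """Wrap a chunk predictor so it hands out one action per control step."""
    queue = deque()

    def next_action(observation):
        # Query the predictor only when the queued actions are used up.
        if not queue:
            chunk = predict_chunk(observation)
            queue.extend(chunk[:N_ACTION_STEPS])
        return queue.popleft()

    return next_action


# Hypothetical stand-in for the policy: returns CHUNK_SIZE zero actions
# and counts how often it is called.
calls = {"n": 0}


def dummy_predict_chunk(observation):
    calls["n"] += 1
    return [[0.0] * ACTION_DIM for _ in range(CHUNK_SIZE)]


controller = make_chunked_controller(dummy_predict_chunk)
actions = [controller(observation=None) for _ in range(120)]
print(calls["n"])  # number of policy queries needed for 120 control steps
```

If the robot is driven at the dataset rate of 10 fps (an assumption taken from the dataset name), one chunk of 50 actions covers about 5 seconds of motion between policy queries.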
Fine-Tuning Setup
- Dataset: CoRL2026-CSI/UR7e-CaP_arrange_block_100epi_10fps
- Training steps: 9250
- Approx. epochs: 50.26
- Final training samples: 2368000
- Final training loss: 0.010289
- Runtime: 6.12 hours
- Per-GPU batch size: 128
- Gradient accumulation steps: 1
- Number of GPUs: 2
- Effective batch size: 256
- Optimizer lr: 0.0001
- Optimizer betas: [0.9, 0.95]
- Weight decay: 1e-10
- Scheduler warmup/decay: 1000/30000
- Final decay lr: 2.5e-06
- DataLoader workers: 8
- DataLoader prefetch factor: 1
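The batch and sample counts above are mutually consistent, which a few lines of arithmetic can confirm. The variable names are illustrative. The frames-per-epoch figure is inferred from the rounded epoch count and is not stated on this card.

```python
# Values copied from the fine-tuning setup above.
per_gpu_batch_size = 128
grad_accum_steps = 1
num_gpus = 2
training_steps = 9250
approx_epochs = 50.26

# Effective batch size = per-GPU batch * accumulation steps * GPUs.
effective_batch_size = per_gpu_batch_size * grad_accum_steps * num_gpus

# Total samples seen = optimizer steps * effective batch size.
total_samples = training_steps * effective_batch_size

# Inferred, not reported: samples per epoch implied by the epoch count.
implied_frames_per_epoch = total_samples / approx_epochs

print(effective_batch_size)  # 256
print(total_samples)         # 2368000
print(round(implied_frames_per_epoch))
```

The implied dataset size is roughly 47,000 frames. Because the epoch count is rounded to two decimals, treat that figure as an estimate.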
Camera Mapping
- observation.images.realsense_topview -> observation.images.camera2
- observation.images.realsense_wrist -> observation.images.camera1
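When feeding the policy from your own camera pipeline, the mapping above amounts to a key rename on the observation dictionary. The following is a minimal sketch under that assumption. `rename_observation_keys` is a hypothetical helper, not a LeRobot function, and the string values stand in for image tensors.

```python
# Dataset camera keys -> policy input keys, as listed in the camera mapping.
CAMERA_RENAME_MAP = {
    "observation.images.realsense_topview": "observation.images.camera2",
    "observation.images.realsense_wrist": "observation.images.camera1",
}


def rename_observation_keys(observation: dict) -> dict:
    """Return a copy of the observation with camera keys renamed.

    Keys that are not in the map (for example observation.state) pass through.
    """
    return {CAMERA_RENAME_MAP.get(key, key): value for key, value in observation.items()}


# Placeholder values stand in for real image tensors and joint states.
raw_observation = {
    "observation.images.realsense_topview": "topview_frame",
    "observation.images.realsense_wrist": "wrist_frame",
    "observation.state": [0.0] * 6,
}
renamed = rename_observation_keys(raw_observation)
print(sorted(renamed))
```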
Image Augmentation
- affine: RandomAffine {'degrees': [-5.0, 5.0], 'translate': [0.05, 0.05]}
- brightness: ColorJitter {'brightness': [0.8, 1.2]}
- contrast: ColorJitter {'contrast': [0.8, 1.2]}
- hue: ColorJitter {'hue': [-0.05, 0.05]}
- saturation: ColorJitter {'saturation': [0.5, 1.5]}
- sharpness: SharpnessJitter {'sharpness': [0.5, 1.5]}
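To make the ranges above concrete, the sketch below draws one set of augmentation parameters using only the standard library. It assumes torchvision-style semantics: `translate` is a maximum shift expressed as a fraction of image width and height, and the color values are sampled uniformly from their ranges. The 256x256 image size is also an assumption. How many transforms are applied per frame, and with what probability, is not stated on this card and is not modeled here.

```python
import random

# Ranges copied from the image augmentation list above.
AUGMENTATION_RANGES = {
    "affine_degrees": (-5.0, 5.0),
    "affine_translate": (0.05, 0.05),  # max shift as a fraction of (width, height)
    "brightness": (0.8, 1.2),
    "contrast": (0.8, 1.2),
    "hue": (-0.05, 0.05),
    "saturation": (0.5, 1.5),
    "sharpness": (0.5, 1.5),
}


def sample_augmentation_params(rng: random.Random, width: int = 256, height: int = 256) -> dict:
    """Draw one illustrative set of augmentation parameters."""
    frac_x, frac_y = AUGMENTATION_RANGES["affine_translate"]
    params = {
        "rotation_deg": rng.uniform(*AUGMENTATION_RANGES["affine_degrees"]),
        "shift_x_px": rng.uniform(-frac_x * width, frac_x * width),
        "shift_y_px": rng.uniform(-frac_y * height, frac_y * height),
        "hue_shift": rng.uniform(*AUGMENTATION_RANGES["hue"]),
    }
    # Multiplicative factors, where 1.0 leaves the image unchanged.
    for name in ("brightness", "contrast", "saturation", "sharpness"):
        params[f"{name}_factor"] = rng.uniform(*AUGMENTATION_RANGES[name])
    return params


params = sample_augmentation_params(random.Random(0))
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```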
Inputs
- observation.images.camera1: VISUAL, shape [3, 256, 256]
- observation.images.camera2: VISUAL, shape [3, 256, 256]
- observation.images.camera3: VISUAL, shape [3, 256, 256]
- observation.state: STATE, shape [6]
Outputs
- action: ACTION, shape [7]
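Before connecting a robot, it can help to check that your observation and action shapes match the feature specification above. The sketch below compares shapes without a batch dimension. `find_shape_mismatches` is a hypothetical helper, not part of LeRobot. Note that the Camera Mapping section maps only camera1 and camera2, and how camera3 is populated is not stated on this card.

```python
# Feature shapes copied from the Inputs and Outputs lists (no batch dimension).
EXPECTED_INPUT_SHAPES = {
    "observation.images.camera1": (3, 256, 256),
    "observation.images.camera2": (3, 256, 256),
    "observation.images.camera3": (3, 256, 256),
    "observation.state": (6,),
}
EXPECTED_OUTPUT_SHAPES = {"action": (7,)}


def find_shape_mismatches(shapes: dict, expected: dict) -> list:
    """Return human-readable problems; an empty list means all shapes match."""
    problems = []
    for key, want in expected.items():
        got = shapes.get(key)
        if got is None:
            problems.append(f"missing {key}")
        elif tuple(got) != want:
            problems.append(f"{key}: expected {want}, got {tuple(got)}")
    return problems


# Example: camera3 is absent and the state vector has the wrong length.
example_shapes = {
    "observation.images.camera1": (3, 256, 256),
    "observation.images.camera2": (3, 256, 256),
    "observation.state": (7,),
}
problems = find_shape_mismatches(example_shapes, EXPECTED_INPUT_SHAPES)
for problem in problems:
    print(problem)
```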
Usage
Install and use the same LeRobot checkout/environment that contains the SmolVLA policy
implementation, then point policy.path to this Hub repo.
lerobot-record \
--robot.type=<your_robot> \
--dataset.repo_id=<your_eval_dataset_repo> \
--policy.path=CoRL2026-CSI/SmolVLA-UR7e-ArrangeBlock_50epoch \
--dataset.num_episodes=10
For local Python usage, load the policy with LeRobot's policy factory from the training checkout.
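A minimal loading sketch follows. It uses `SmolVLAPolicy.from_pretrained` directly rather than the policy factory mentioned above. The module paths are assumptions that depend on the LeRobot version in your training checkout, so adjust them to match it. The processor pipelines listed under Files (normalization and related steps) are not covered here.

```python
import importlib

REPO_ID = "CoRL2026-CSI/SmolVLA-UR7e-ArrangeBlock_50epoch"

# Assumed module paths; which one exists depends on the LeRobot version.
CANDIDATE_MODULES = (
    "lerobot.policies.smolvla.modeling_smolvla",
    "lerobot.common.policies.smolvla.modeling_smolvla",
)


def load_policy(repo_id: str = REPO_ID, device: str = "cuda"):
    """Load the SmolVLA checkpoint from the Hub and move it to `device`."""
    last_error = None
    for module_name in CANDIDATE_MODULES:
        try:
            module = importlib.import_module(module_name)
        except ImportError as error:
            last_error = error
            continue
        policy = module.SmolVLAPolicy.from_pretrained(repo_id)
        policy.to(device)
        policy.eval()  # inference mode: disables dropout and similar layers
        return policy
    raise ImportError("No SmolVLA implementation found in this environment") from last_error
```

Calling `load_policy()` requires an installed LeRobot checkout with SmolVLA support and network access to the Hub. Pass `device="cpu"` if no GPU is available.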
Evaluation
This upload reports offline training metrics only. No rollout success rate is claimed, and none should be inferred until a separate real-robot or simulation evaluation is added.
Final logged training metrics:
- loss: 0.010289
- grad norm: 0.101402
- learning rate: 2.5000801319183248e-06
- update time: 1.1543 s/step
- dataloading time: 1.0103 s/step
Limitations and Safety
This model is a robot control policy and can produce unsafe actions if deployed on hardware without appropriate validation, workspace limits, emergency stop handling, and task-specific safety checks. Test in simulation or a constrained setup before any physical deployment.
The model is specialized to the training dataset, camera mapping, calibration, action space, and embodiment configuration. It may not transfer reliably to different robots, camera placements, object layouts, or tasks without further validation or fine-tuning.
License and Terms
The training dataset is marked apache-2.0, and the SmolVLM2 component is marked
apache-2.0. The lerobot/smolvla_base model card does not currently declare a license
field, so this fine-tuned model is conservatively marked as other. Users are responsible
for checking the applicable base model, dataset, and deployment terms before use.
Files
- model.safetensors: fine-tuned policy weights
- config.json: LeRobot SmolVLA policy config
- train_config.json: training configuration
- policy_preprocessor.json and policy_postprocessor.json: LeRobot processor pipelines
- policy_*_processor.safetensors: normalization/statistics state used by processors