Upload folder using huggingface_hub

Files changed (2)

README.md +71 -0
model_cleaned.safetensors +3 -0

README.md ADDED

	@@ -0,0 +1,71 @@

+---
+library_name: custom
+tags:
+- robotics
+- diffusion
+- mixture-of-experts
+- multi-modal
+license: mit
+datasets:
+- CALVIN
+language:
+- en
+pipeline_tag: robotics
+---
+# MoDE (Mixture of Denoising Experts) Diffusion Policy
+## Model Description
+<div style="text-align: center">
+    <img src="MoDE_Figure_1.png" width="800px"/>
+</div>
+- [GitHub Link](https://github.com/intuitive-robots/MoDE_Diffusion_Policy)
+- [Project Page](https://mbreuss.github.io/MoDE_Diffusion_Policy/)
+This model implements a Mixture of Denoising Experts (MoDE) architecture for robotic manipulation, combining a transformer-based backbone with noise-only expert routing. Because the router is conditioned only on the noise level, the expert chosen at each sampling timestep can be precached to reduce inference time.
+The model was pretrained on a subset of Open X-Embodiment (OXE) for 300k steps and finetuned for downstream tasks on the CALVIN/LIBERO datasets.
+## Model Details
+### Architecture
+- **Base Architecture**: MoDE with custom Mixture of Experts Transformer
+- **Vision Encoder**: ResNet-50 with FiLM conditioning, finetuned from ImageNet-pretrained weights
+- **EMA**: Enabled
+- **Action Window Size**: 10
+- **Sampling Steps**: 5 (optimal for performance)
+- **Sampler Type**: DDIM
+### Input/Output Specifications
+#### Inputs
+- RGB Static Camera: `(B, T, 3, H, W)` tensor
+- RGB Gripper Camera: `(B, T, 3, H, W)` tensor
+- Language Instructions: Text strings
+#### Outputs
+- Action Space: `(B, T, 7)` tensor of delta end-effector (EEF) actions
+## Usage
+```python
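+obs = {
+    "rgb_obs": {
+        "rgb_static": static_image,    # static camera, (B, T, 3, H, W)
+        "rgb_gripper": gripper_image,  # gripper camera, (B, T, 3, H, W)
+    }
+}
+goal = {"lang_text": "pick up the blue cube"}  # language instruction
+action = model.step(obs, goal)  # `model`: an already-loaded MoDE policy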
+```
+## Training Details
+### Configuration
+- **Optimizer**: AdamW
+- **Learning Rate**: 0.0001
+- **Weight Decay**: 0.05
+## License
+This model is released under the MIT license.
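
The README's model description mentions precaching the chosen expert for each timestep. The idea can be sketched as below, under loud assumptions: this is not the repository's implementation, and `NoiseRouter`, `noise_schedule`, `precache_experts`, the expert count, the top-k value, and the sigma range are all hypothetical stand-ins. Only the 5 sampling steps come from the README. The point illustrated is that a router whose input is the noise level alone can be evaluated once per sampling step, ahead of time, rather than on every forward pass.

```python
import math
import random

NUM_EXPERTS = 4     # assumption: not stated in the README
TOP_K = 2           # assumption: not stated in the README
SAMPLING_STEPS = 5  # from the README ("Sampling Steps: 5")


class NoiseRouter:
    """Toy router whose expert logits depend only on the noise level."""

    def __init__(self, num_experts, seed=0):
        rng = random.Random(seed)
        # Random stand-ins for learned router parameters.
        self.weights = [rng.uniform(-1.0, 1.0) for _ in range(num_experts)]
        self.biases = [rng.uniform(-1.0, 1.0) for _ in range(num_experts)]

    def top_k(self, sigma, k):
        """Return the indices of the k highest-scoring experts for this sigma."""
        logits = [w * math.log(sigma) + b
                  for w, b in zip(self.weights, self.biases)]
        ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        return tuple(ranked[:k])


def noise_schedule(num_steps, sigma_max=80.0, sigma_min=0.01):
    """Hypothetical log-spaced noise levels, from high noise to low noise."""
    ratio = (sigma_min / sigma_max) ** (1.0 / (num_steps - 1))
    return [sigma_max * ratio ** i for i in range(num_steps)]


def precache_experts(router, sigmas, k):
    """Evaluate the router once per sampling step and store the result."""
    return {step: router.top_k(sigma, k) for step, sigma in enumerate(sigmas)}


router = NoiseRouter(NUM_EXPERTS)
sigmas = noise_schedule(SAMPLING_STEPS)
expert_cache = precache_experts(router, sigmas, TOP_K)

# At inference time the router is skipped: each step is a dictionary lookup.
for step, sigma in enumerate(sigmas):
    print(f"step {step}: sigma={sigma:.3f} -> experts {expert_cache[step]}")
```

Because the cache is keyed by sampling step, it stays valid only while the sampler and the number of steps are unchanged; switching either would require rebuilding it.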

model_cleaned.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bcf93362ced811101ff755dac7a9e85267cf76f933f4ad847edecac7be71d9a3
+size 3317019856
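
The three added lines above are a Git LFS pointer rather than the weights themselves: a spec version, a SHA-256 object ID, and the size of the real file in bytes. As a small sketch of how such a pointer can be read, the snippet below parses those fields with the leading `+` diff markers stripped. `parse_lfs_pointer` is a hypothetical helper written for illustration, not a function from git-lfs or `huggingface_hub`.

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer_text = """\
version https://git-lfs.github.com/spec/v1
oid sha256:bcf93362ced811101ff755dac7a9e85267cf76f933f4ad847edecac7be71d9a3
size 3317019856
"""

pointer = parse_lfs_pointer(pointer_text)
size_gib = pointer["size_bytes"] / 2**30
print(f"{pointer['hash_algo']} digest, {size_gib:.2f} GiB")
```

For this pointer the reported size works out to roughly 3.09 GiB, which is the size of the checkpoint that a client would download when resolving the pointer.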