Commit 423cc6c by skaramcheti (parent: 742ce85): Update README.md

README.md CHANGED
@@ -14,8 +14,7 @@ pipeline_tag: image-text-to-text
OpenVLA 7B (`openvla-7b`) is an open vision-language-action model trained on 970K robot manipulation episodes from the [Open X-Embodiment](https://robotics-transformer-x.github.io/) dataset.
The model takes language instructions and camera images as input and generates robot actions. It supports controlling multiple robots out-of-the-box, and can be quickly adapted for new robot domains via (parameter-efficient) fine-tuning.

- All OpenVLA checkpoints
- the same license.
+ All OpenVLA checkpoints, as well as our [training codebase](https://github.com/openvla/openvla) are released under an MIT License.

For full details of our model and pretraining procedure please read [our paper](https://openvla.github.io/) and see [our project page](https://openvla.github.io/).

@@ -41,7 +40,7 @@ per-dataset basis. See [our repository](https://github.com/openvla/openvla) for

OpenVLA models can be used zero-shot to control robots for specific combinations of embodiments and domains seen in the Open-X pretraining mixture (e.g., for
[BridgeV2 environments with a Widow-X robot](https://rail-berkeley.github.io/bridgedata/)). They can also be efficiently *fine-tuned* for new tasks and robot setups
- given minimal demonstration data; [
+ given minimal demonstration data; [see here](https://github.com/openvla/openvla/blob/main/scripts/finetune.py).

**Out-of-Scope:** OpenVLA models do not zero-shot generalize to new (unseen) robot embodiments, or setups that are not represented in the pretraining mix; in these cases,
we suggest collecting a dataset of demonstrations on the desired setup, and fine-tuning OpenVLA models instead.
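The paragraphs above describe the model's interface (a language instruction and a camera image in, a robot action out), but the diff itself carries no usage snippet. Below is a minimal inference sketch, assuming the Hugging Face `AutoModelForVision2Seq` / `AutoProcessor` remote-code interface with a `predict_action` method and an `unnorm_key` argument selecting the action de-normalization statistics (`bridge_orig` for a BridgeV2 / Widow-X setup here); the prompt template, image path, and instruction text are placeholders.

```python
# Hedged inference sketch for openvla-7b; not the official snippet from the model card.
# Assumes a CUDA GPU; drop the .to("cuda:0") calls to run (slowly) on CPU.
import torch
from PIL import Image
from transformers import AutoModelForVision2Seq, AutoProcessor

processor = AutoProcessor.from_pretrained("openvla/openvla-7b", trust_remote_code=True)
vla = AutoModelForVision2Seq.from_pretrained(
    "openvla/openvla-7b",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).to("cuda:0")

image = Image.open("frame.png")  # placeholder: one RGB frame from the robot's camera
prompt = "In: What action should the robot take to pick up the remote?\nOut:"  # assumed prompt template

inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)
action = vla.predict_action(**inputs, unnorm_key="bridge_orig", do_sample=False)
print(action)  # expected: a 7-element end-effector delta + gripper command for a BridgeV2 setup
```

The `unnorm_key` should name the pretraining dataset whose action statistics match your robot; for setups outside the pretraining mixture, fine-tune instead (see the sketch after the diff).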
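The commit's new link to `finetune.py` is the intended route for adapting OpenVLA to a new robot. As a rough illustration of the parameter-efficient idea only, here is a hedged LoRA sketch using Hugging Face PEFT; it is not the repository's recipe (the script owns data loading, action tokenization, and the training loop), and the rank, alpha, and target module names (`q_proj`, `v_proj` on the language backbone) are assumptions.

```python
# Hedged sketch: wrap openvla-7b with LoRA adapters via PEFT, then train on your own
# demonstrations with whatever data pipeline you use. Hyperparameters are illustrative.
import torch
from transformers import AutoModelForVision2Seq
from peft import LoraConfig, get_peft_model

vla = AutoModelForVision2Seq.from_pretrained(
    "openvla/openvla-7b", torch_dtype=torch.bfloat16, trust_remote_code=True
)
lora_cfg = LoraConfig(
    r=32,                                 # adapter rank (assumption)
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections in the Llama-2 backbone
)
vla = get_peft_model(vla, lora_cfg)
vla.print_trainable_parameters()  # only the LoRA adapters are trainable
# ... training loop over demonstration batches goes here; prefer the repo's finetune.py for the full recipe.
```

Keeping the 7B backbone frozen and training only the adapters is what makes adaptation practical with minimal demonstration data on a single node.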