Skip to content

Latest commit

 

History

History
21 lines (15 loc) · 1.11 KB

File metadata and controls

21 lines (15 loc) · 1.11 KB

This dataset accompanies the paper MaskViT: Masked Visual Pre-Training for Video Prediction.

The raw data before preprocessing can be found here: https://drive.google.com/file/d/1olPNo1p-XIcLcwbifC0Z71pQijP7eY6h/view?usp=drive_link.

At a high level, this is "RoboNet-style" data in that it is random interaction data collected by a robot arm and a bin of objects, which are swapped out periodically. The objects mostly consist of soft stuffed toys and plastic objects. The robot arm is controlled by a random policy with the autograsp primitive enabled. It is collected using the Visual Foresight codebase.

The dataset is collected using a single robot, and was collected in two parts of roughly equal size. These two parts have different camera configurations and slightly different object distributions.

The dataset contains 9109 train episodes and 91 validation episodes. Each episode contains 30 steps.

Example trajectories:

Example images Example images Example images