Generate depth maps from images
Summarize videos into shorter clips
Generate depth maps for video frames