None defined yet.
Inferring Compositional 4D Scenes without Ever Seeing One
TRAVL: A Recipe for Making Video-Language Models Better Judges of Physics Implausibility