Meta's text-to-video generation model, which creates videos from text descriptions by learning from both images and videos.
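Make-A-Video from Meta generates videos with natural motion from text prompts. It learns what things look like from paired text-image data and how they move from unlabeled video data, so it does not require paired text-video examples.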