
Five Open-Source Models for Video Generation

Leveraging open-source models for video generation

Mayur Jain
4 min read · Feb 15, 2025
Photo by Seth Doyle on Unsplash

Hunyuan Video

HunyuanVideo is a systematic framework for building large video generative models. Its performance is comparable to that of leading closed-source models. The authors attribute this to three factors: data curation, joint image-video model training, and an efficient infrastructure for large-scale model training and inference.

According to professional human evaluation results, HunyuanVideo outperforms previous state-of-the-art models, including Runway Gen-3, Luma 1.6, and three top-performing Chinese video generative models.

Hunyuan Video Architecture

HunyuanVideo is trained in a spatio-temporally compressed latent space, which is obtained through a Causal 3D VAE. Text prompts are encoded using a large language model and used as the conditions. Taking Gaussian noise and the conditions as input, the generative model produces an…
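To make the idea of a compressed latent space more concrete, here is a minimal sketch of the shape arithmetic involved. The compression ratios (4x in time, 8x in space, 16 latent channels) and the "first frame is encoded on its own" rule are assumptions for illustration and are not stated in this article; the function name `latent_shape` is hypothetical and not part of any HunyuanVideo API.

```python
def latent_shape(num_frames, height, width,
                 temporal_ratio=4, spatial_ratio=8, latent_channels=16):
    """Estimate the latent tensor shape a causal 3D VAE would produce.

    All default ratios are assumptions for illustration only.
    Returns (channels, latent_frames, latent_height, latent_width).
    """
    # Assumption: a causal VAE encodes the first frame by itself, so the
    # remaining (num_frames - 1) frames must divide evenly by the ratio.
    if num_frames < 1 or (num_frames - 1) % temporal_ratio != 0:
        raise ValueError("num_frames must be of the form temporal_ratio * k + 1")
    if height % spatial_ratio != 0 or width % spatial_ratio != 0:
        raise ValueError("height and width must be multiples of spatial_ratio")

    latent_frames = (num_frames - 1) // temporal_ratio + 1
    return (latent_channels,
            latent_frames,
            height // spatial_ratio,
            width // spatial_ratio)


# A 129-frame 720p clip shrinks to a much smaller grid for the generator.
print(latent_shape(129, 720, 1280))  # (16, 33, 90, 160)
```

Under these assumed ratios, the generative model denoises a 33 x 90 x 160 grid instead of 129 full-resolution frames, which is what makes training and inference on long clips tractable.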
