ComfyUI Integrates NVIDIA Cosmos for Efficient AI Video Generation

GigaNectar Team

AI-rendered 3D video of a busy street. Photo source: ComfyUI

ComfyUI, a popular tool for creating AI-generated content, now lets you turn text descriptions and images into videos using NVIDIA’s new Cosmos technology. Think of it like having a video production studio that creates videos based on your written descriptions or starting images.

This new feature works best with NVIDIA graphics cards that have 24GB of video memory (VRAM), but it can also run on smaller 12GB cards thanks to ComfyUI’s memory management. It’s like having a powerful car engine that can adapt to run efficiently on regular fuel when premium isn’t available.

NVIDIA Cosmos is remarkably efficient at handling video data. It can create a high-quality video (1280×704 resolution, 121 frames) even on computers with less powerful graphics cards. To put this in perspective, it needs roughly one-fiftieth of the memory that similar tools like Hunyuan video use.

The system works in two main ways. You can either write a detailed description of the video you want to create, or you can start with an existing image that the system will bring to life. When starting from images, it can treat an image as the first or the last frame of the video, or generate a transition between two images.

Content creators can use this tool to create videos without the need for traditional video production methods. The release also adds a new sampler called “res_multistep,” which is now available in ComfyUI for use with all supported models, including Hunyuan video.

There are some important things to know before using this tool. Creating a video takes time: even with a top-end NVIDIA RTX 4090 graphics card, a single video takes over 10 minutes to make. The system also works best with specific settings: videos need to be exactly 121 frames long, and the minimum resolution is 704×704 pixels.
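
Since a single run can take more than 10 minutes, it may help to check your settings against these limits before you start. The Python sketch below does only that. It is not part of ComfyUI; the function name and constants are made up for illustration, and the limits are simply the ones stated in this article (121 frames, at least 704 pixels on each side).

```python
# Hypothetical pre-flight check, not a ComfyUI API.
# The limits below are taken from the article text, not from ComfyUI's code.
REQUIRED_FRAMES = 121  # videos need to be exactly 121 frames long
MIN_SIDE = 704         # smallest supported width and height, in pixels


def check_cosmos_settings(width: int, height: int, frames: int) -> list[str]:
    """Return a list of problems; an empty list means the settings fit the limits."""
    problems = []
    if frames != REQUIRED_FRAMES:
        problems.append(f"frame count must be {REQUIRED_FRAMES}, got {frames}")
    if width < MIN_SIDE:
        problems.append(f"width must be at least {MIN_SIDE}, got {width}")
    if height < MIN_SIDE:
        problems.append(f"height must be at least {MIN_SIDE}, got {height}")
    return problems


if __name__ == "__main__":
    # The 1280x704, 121-frame case mentioned earlier passes.
    print(check_cosmos_settings(1280, 704, 121))  # prints []
    # A small, short clip fails on all three counts.
    for problem in check_cosmos_settings(512, 512, 64):
        print("problem:", problem)
```

Returning a list of problems rather than stopping at the first one means you see every setting that needs fixing in a single pass.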

You also need to be detailed when describing what you want. Brief prompts won’t work well. Instead, you need to write longer, more descriptive sentences to get better results.


This combination of ComfyUI and NVIDIA Cosmos shows how AI video creation is becoming more accessible and powerful. While it still has some limitations, it represents a significant step forward in making professional video creation tools available to more people.

For those interested in trying it out, ComfyUI provides example workflows that help you get started with both text-to-video and image-to-video creation. These workflows are available in JSON format and guide you through the technical setup process.
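
Because the workflows are plain JSON, you can peek inside one before loading it. The sketch below counts the node types in a workflow file. It assumes the file uses one of two layouts that ComfyUI exports commonly take: a top-level "nodes" list whose entries carry a "type" field, or a mapping from node IDs to entries carrying a "class_type" field. The helper names and the sample data are hypothetical and are not taken from the actual Cosmos example workflows.

```python
import json
from collections import Counter


def node_types(workflow: dict) -> Counter:
    """Count node types in a workflow dict (layout assumptions are described above)."""
    # Layout 1 (assumed): {"nodes": [{"id": 1, "type": "KSampler", ...}, ...]}
    if isinstance(workflow.get("nodes"), list):
        return Counter(node.get("type", "unknown") for node in workflow["nodes"])
    # Layout 2 (assumed): {"1": {"class_type": "KSampler", "inputs": {...}}, ...}
    return Counter(
        node.get("class_type", "unknown")
        for node in workflow.values()
        if isinstance(node, dict)
    )


def load_workflow(path: str) -> dict:
    """Read a workflow JSON file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Made-up sample in the second layout, for illustration only.
SAMPLE = json.loads("""
{
  "1": {"class_type": "CLIPTextEncode", "inputs": {"text": "a busy street at dusk"}},
  "2": {"class_type": "CLIPTextEncode", "inputs": {"text": ""}},
  "3": {"class_type": "KSampler", "inputs": {"steps": 20}}
}
""")

if __name__ == "__main__":
    for name, count in node_types(SAMPLE).most_common():
        print(f"{name}: {count}")
```

To inspect a downloaded example, pass the result of load_workflow to node_types, using whatever path you saved the file under.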
