Hello everyone! Today, I’m going to show you how to use Vid2Vid with ComfyUI to enhance your AI-generated content.
As more people seek to create diverse AI-driven content, I believe this method can help you achieve higher-quality results for your projects.
Creating AI Video with AnimateDiff & ControlNet in ComfyUI

1. Prerequisites (Custom Nodes & Base Model)
To get started, make sure you have the following:
- AnimateDiff
- ControlNet
- Base Model Installation
1-1. What is AnimateDiff?
AnimateDiff extends Stable Diffusion’s single-frame image generation into multi-frame video creation by injecting a trained motion module into the model. This keeps motion and transitions coherent across frames, allowing users to generate AI animations easily.
Using ComfyUI’s AnimateDiff node, you can create short animated clips based on text prompts or reference images. It also works well with ControlNet and other AI-driven tools to achieve precise and creative animations.
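If you’re curious what AnimateDiff does under the hood, here’s a minimal sketch outside ComfyUI using Hugging Face’s diffusers library; the model repo names are illustrative, so check them against your setup:

```python
import torch
from diffusers import AnimateDiffPipeline, DDIMScheduler, MotionAdapter
from diffusers.utils import export_to_gif

# The motion adapter is the module that turns SD 1.5 into a video model
adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16
)
pipe = AnimateDiffPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # illustrative; any SD 1.5 base works
    motion_adapter=adapter,
    torch_dtype=torch.float16,
).to("cuda")
# AnimateDiff's motion module was trained with a linear beta schedule
pipe.scheduler = DDIMScheduler.from_config(
    pipe.scheduler.config, beta_schedule="linear",
    timestep_spacing="linspace", clip_sample=False,
)

# All 16 frames are denoised jointly, which is what keeps motion coherent
output = pipe(
    prompt="a girl dancing on a rooftop, best quality",
    negative_prompt="low quality, blurry",
    num_frames=16,
    num_inference_steps=25,
    generator=torch.Generator("cpu").manual_seed(42),
)
export_to_gif(output.frames[0], "animation.gif")
```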
1-2. What is ControlNet?
ControlNet provides additional control over Stable Diffusion’s output, allowing users to preserve structure, shape, and pose while generating diverse visual styles. By utilizing depth maps, canny edges, and pose estimation, ControlNet helps maintain the original composition while transforming the image into new styles.
To put it simply, ControlNet acts as a visual guide that dictates how your AI-generated content should appear.
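To make “depth maps, canny edges, and pose estimation” concrete, this is roughly what the preprocessing step computes for each frame. The sketch uses the controlnet_aux package (names assumed from its public API, so verify against your installed version):

```python
from PIL import Image
from controlnet_aux import OpenposeDetector, ZoeDetector

frame = Image.open("frame_0001.png")  # one frame of the reference video

# Pose estimation: keeps the subject's skeleton, discards appearance
openpose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
pose_map = openpose(frame)

# Depth estimation: keeps the scene's 3D layout
zoe = ZoeDetector.from_pretrained("lllyasviel/Annotators")
depth_map = zoe(frame)

pose_map.save("pose_0001.png")
depth_map.save("depth_0001.png")
```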
1-3. Choosing a Base Model
Make sure to download and prepare the appropriate base model before proceeding.
For today’s tutorial, we’ll be using SD 1.5, so please follow along accordingly.
Now that we’re ready, let’s jump into the step-by-step process! 🚀
2. Installing AnimateDiff & ControlNet in ComfyUI
Before proceeding, launch ComfyUI and install both AnimateDiff and ControlNet via the built-in Manager.

Now you can start with setting up the Node Workflow.
3. Setting Up the Node Workflow
Today, I’ll guide you through a method that ensures smooth motion and detailed facial enhancements for AI-generated content.
If you’re already familiar with the setup and want to jump straight into the workflow, you can download the pre-configured node setup here:
3-1. Connecting Base Model to Output Nodes
For beginners, follow the node setup as shown in the image below.

- Add a Load Checkpoint node and select your SD 1.5 base model.
- Create two CLIP Text Encode nodes: one for the Positive Prompt and one for the Negative Prompt.
- Add the AnimateDiff Loader (from AnimateDiff-Evolved), connect its model input to the Checkpoint Loader, and set its beta schedule to sqrt_linear (AnimateDiff).
- Insert a KSampler node, link all necessary components, and match the image resolution to the reference video.
- Decode the KSampler’s latents with a VAE Decode node, then add a Constrain Image for Video node to keep the dimensions consistent.
Finally, place a Video Combine node at the end to assemble the frames.
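If you prefer to see this wiring as data, ComfyUI can also run workflows in its JSON “API format,” where every node is an id with a class_type and its inputs. Here’s a heavily trimmed sketch of the base chain above; the core class names (CheckpointLoaderSimple, CLIPTextEncode, KSampler, VAEDecode) are standard ComfyUI, while the AnimateDiff loader’s class_type and input names vary by AnimateDiff-Evolved version, so treat those as placeholders:

```python
# Minimal ComfyUI API-format sketch of the base chain (trimmed).
# Node ids are arbitrary strings; ["1", 0] means "output slot 0 of node 1".
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd15_base.safetensors"}},  # your SD 1.5 model
    "2": {"class_type": "CLIPTextEncode",                     # positive prompt
          "inputs": {"clip": ["1", 1], "text": "a girl dancing, best quality"}},
    "3": {"class_type": "CLIPTextEncode",                     # negative prompt
          "inputs": {"clip": ["1", 1], "text": "low quality, blurry"}},
    # Placeholder: AnimateDiff-Evolved loader; verify the class_type in your install
    "4": {"class_type": "ADE_AnimateDiffLoaderWithContext",
          "inputs": {"model": ["1", 0],
                     "model_name": "mm_sd_v15_v2.ckpt",
                     "beta_schedule": "sqrt_linear (AnimateDiff)"}},
    "5": {"class_type": "EmptyLatentImage",                   # one latent per frame
          "inputs": {"width": 512, "height": 768, "batch_size": 16}},
    "6": {"class_type": "KSampler",
          "inputs": {"model": ["4", 0], "positive": ["2", 0],
                     "negative": ["3", 0], "latent_image": ["5", 0],
                     "seed": 42, "steps": 25, "cfg": 7.5,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
    "7": {"class_type": "VAEDecode",
          "inputs": {"samples": ["6", 0], "vae": ["1", 2]}},
    # The decoded frames then flow on to Constrain Image for Video and Video Combine.
}
```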
3-2. Integrating ControlNet for Reference Video
Now that we’ve set up the foundation, let’s add ControlNet for better structure and movement guidance.
For this tutorial, we’ll use ZDepth and OpenPose ControlNet models.

- Load the reference video using the Load Video node.
- Use the Constrain Image for Video node to match its dimensions.
- Add ZDepth and OpenPose preprocessor nodes to extract depth and pose data from each frame.
- Connect Apply ControlNet nodes and set their Strength values:
  - Depth → 1.0 (highest priority)
  - Pose → 0.8 (secondary priority)
Ensure all components are properly linked before proceeding.
Note: When using multiple ControlNet nodes, set all secondary values below 1.0 to avoid conflicts.
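Continuing the API-format sketch from section 3-1, the two ControlNets chain on the conditioning: depth is applied first at full strength, and pose is stacked on top of the depth-conditioned result. The class names here are ComfyUI’s built-in ControlNetLoader/ControlNetApply nodes; the preprocessor node ids (“20” and “21”) are hypothetical stand-ins for the ZDepth and OpenPose outputs:

```python
# Chained ControlNet conditioning (trimmed; continues the ids from section 3-1).
controlnet_part = {
    "10": {"class_type": "ControlNetLoader",
           "inputs": {"control_net_name": "control_v11f1p_sd15_depth.pth"}},
    "11": {"class_type": "ControlNetLoader",
           "inputs": {"control_net_name": "control_v11p_sd15_openpose.pth"}},
    # Depth first, at full strength...
    "12": {"class_type": "ControlNetApply",
           "inputs": {"conditioning": ["2", 0], "control_net": ["10", 0],
                      "image": ["20", 0],   # hypothetical: depth-map node output
                      "strength": 1.0}},
    # ...then pose on top of the depth-conditioned result, kept below 1.0
    "13": {"class_type": "ControlNetApply",
           "inputs": {"conditioning": ["12", 0], "control_net": ["11", 0],
                      "image": ["21", 0],   # hypothetical: pose-map node output
                      "strength": 0.8}},
}
# Node 13's output now replaces ["2", 0] as the KSampler's positive conditioning.
```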

3-3. Improving Facial Details
By default, AI-generated videos often come out with blurry or distorted faces.
To solve this, we’ll use LoRA models to refine the facial details.

- Load another Checkpoint Loader and connect it to an Easy Apply LoraStack node.
- Mix two LoRA models to enhance facial accuracy.
- Use ToBasicPipe to bundle the model, CLIP, VAE, and prompts into a pipe for the face detailer.
- Add a SAMLoader and an UltralyticsDetectorProvider to detect and segment faces.
Lastly, apply SEGSDetailer to re-generate the detected face regions at higher quality before merging them back into the final video.
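In the same trimmed API-format style, the face pass looks roughly like this. These are ComfyUI-Impact-Pack class names, and the detector/SAM model filenames are common defaults rather than requirements; SEGSDetailer’s many sampler inputs are omitted for brevity:

```python
# Face-detail pass (trimmed sketch; Impact Pack nodes, most inputs omitted).
face_part = {
    "30": {"class_type": "UltralyticsDetectorProvider",  # face bounding-box detector
           "inputs": {"model_name": "bbox/face_yolov8m.pt"}},
    "31": {"class_type": "SAMLoader",                    # refines boxes into masks
           "inputs": {"model_name": "sam_vit_b_01ec64.pth",
                      "device_mode": "AUTO"}},
    "32": {"class_type": "ToBasicPipe",                  # bundles model/clip/vae/prompts
           "inputs": {"model": ["1", 0], "clip": ["1", 1], "vae": ["1", 2],
                      "positive": ["13", 0], "negative": ["3", 0]}},
    # SEGSDetailer consumes the detected face SEGS plus the basic_pipe and
    # re-samples just those regions at higher detail; SEGSPaste then merges
    # the refined faces back into the full frames.
}
```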
3-4. Merging the Enhanced Faces
Now that we’ve refined the facial details, let’s integrate them back into the final output.

Insert a SEGSPaste node between the Constrain Image for Video node and the Video Combine node.
This ensures that our improved AI-generated faces replace the default results.
4. Generating the AI Video
Once all nodes are connected, it’s time to generate the video!
- Upload your reference video in the Load Video node.
- Set the frame extraction rate and frame generation count.
- Enter your text prompt for the AI-generated transformation.

If you’d like to add more custom Embeddings or LoRA models, feel free to integrate them into the workflow.
I tested this using a dance video from the Konami_vn account on a popular video platform.
Original video link → @konami_vn
For this example, I’ll be generating 30 frames to showcase the results.
⚠️ Make sure to set the Seed to a fixed value for consistency!
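If you drive ComfyUI from a script rather than the browser UI, pinning the seed and queueing the job looks roughly like this; it assumes a local ComfyUI server on the default port 8188 and the workflow dict sketched in section 3-1:

```python
import json
import urllib.request

# Pin the KSampler seed so every run is reproducible
workflow["6"]["inputs"]["seed"] = 42

# POST the API-format workflow to the locally running ComfyUI server
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))  # response includes the queued prompt_id
```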
5. The Final AI Video
And there you have it! Isn’t it amazing?
This method produces results comparable to AI-generated videos you see on YouTube and social media! 🎬✨
At first glance, this workflow may seem complex, but the logic behind it is fairly simple:
1️⃣ Load a reference video
2️⃣ Extract depth & pose data using ControlNet
3️⃣ Generate AI frames with AnimateDiff
4️⃣ Refine facial details with LoRA & SEGS Detailer
5️⃣ Combine everything into a smooth video
💡 Key Takeaway:
ControlNet plays the most crucial role in achieving high-quality AI video generation!
6. Recommended GPU for AI Video
If you’re worried about needing expensive GPUs, don’t worry!
I successfully ran this process using an RTX 4070 EVO (12GB VRAM).
While higher-end GPUs like the RTX 4090 will improve speed, the 4070 still delivers excellent results.
🔗 Check out the GPU I use here:
[RTX 4070 12GB AI-Optimized Graphics Card]
🔗 If you want to learn about GPU optimization for ComfyUI, check it out here:
[Maximize ComfyUI performance with RTX 4090]
7. Final Words
I hope this tutorial helps you explore new possibilities in AI content creation!
If you found this guide useful, don’t forget to follow A2SET for more AI content updates.
Next time, I’ll be sharing more practical AI production techniques to help you level up your content creation game! 🚀
Until then, happy AI experimenting! 😃