Hello everyone~! Did you have a pleasant weekend?
Today, as we start a new week, I’ll explain how to create AI animations in ComfyUI using an interesting technique: Stable Video Diffusion (SVD).
What is SVD?
SVD (Stable Video Diffusion) is an image-to-video (img2vid) model in the Stable Diffusion family. As I explained in a previous post about animation, it generates a natural-looking video from a single starting image, which becomes the first frame. For example, if you start with a picture of a rocket lifting off, smoke billowing from the bottom, SVD can be used in the img2vid process to create a video of the rocket flying upward.
1. Preparation
- Install ComfyUI
- Install SVD model and reflect it in the folder
- Update ComfyUI
If you already have ComfyUI installed, you can check for the required extensions, download the SVD models below, and proceed right away. For those who have not yet installed ComfyUI, I will explain step by step in a simple manner, so please follow the instructions below!
1) Install ComfyUI
Unlike Stable Diffusion’s Automatic1111 WebUI, ComfyUI is a node-based interface. Today’s tutorial is built entirely in ComfyUI, so please make sure to install it first. You can install ComfyUI through the post in the link below. Once it is installed, follow the next steps.
Once you have ComfyUI installed, you need to install something called ComfyUI Manager.
Go to the link below and copy the repository URL from the green Code button, then navigate to the installed ComfyUI > custom_nodes folder.
Open a terminal there, type ‘git clone’, paste the copied URL, and press Enter.
2) Install Models
Once the ComfyUI Manager extension is installed, go to the links provided below and download both SVD models.
Then place them in the ComfyUI > models > checkpoints folder.
3) Update ComfyUI
Now all preparations are complete. Run run_nvidia_gpu.bat (for NVIDIA GPU users) to launch ComfyUI. To check that the downloaded extensions installed correctly, click the Manager button at the bottom right. A window with several buttons will appear; among them, press ‘Update All’ to update ComfyUI and its extensions to the latest versions.
2. Settings for Img2Vid
Now all that remains is to set up the image for video conversion. Since the image is converted into a video automatically, without any prompt, the subject should be clear and not too small in the frame. First, here is the specific image size; crop the image you want to animate to match it.
Image size: 1024 × 576 px
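If you’d rather do the crop in code than in an image editor, here is a minimal Pillow sketch; the file names are placeholders:

```python
from PIL import Image, ImageOps

# Center-crop and resize a source image to the 1024x576 frame SVD
# expects; "input.png" and "input_1024x576.png" are placeholder names.
img = Image.open("input.png")
img = ImageOps.fit(img, (1024, 576), method=Image.LANCZOS)
img.save("input_1024x576.png")
```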
After cropping and saving your image to the specified size, let’s create the necessary nodes in ComfyUI. We need a total of 10 nodes. I will list them below, and you can create each one by double-clicking on the ComfyUI screen and searching for them.
- Image Only Checkpoint Loader (Img2vid model)
- Load Image
- SVD_img2vid_Conditioning
- VideoLinearCFGGuidance
- FreeU_V2
- KSampler
- VAE Decode
- Seed (rgthree)
- RIFE VFI
- Video Combine
Once you have created all the nodes listed above, I will show you how they are connected. Please proceed as shown in the following image.
1) Image Only Checkpoint Loader (Img2vid model)
The Image Only Checkpoint Loader plays the same role as the usual base-model checkpoint loader in Stable Diffusion or ComfyUI, except that it loads an image-conditioned model. Here, select one of the SVD models you downloaded from the links shared earlier.
2) Load Image
The Load Image node takes the 1024×576 image we cropped and saved earlier. This image will serve as the first frame of the video we want to create.
3) SVD_img2vid_Conditioning
The SVD_img2vid_Conditioning node is the heart of this workflow. Here you can adjust the frame count, FPS, and the amount of motion for the video you want to create. A tip: the augmentation_level value adds noise to the conditioning image, which mostly shows up as movement in the background. If you only want the main subject to move, keep this value minimal, around 0.02.
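For reference, here is a sketch of the values I would start from in this node. The parameter names match the SVD_img2vid_Conditioning inputs, but treat the numbers as assumptions and starting points, not the exact settings in my workflow image:

```python
# Starting-point values for SVD_img2vid_Conditioning
# (assumed values, not copied from the workflow image).
svd_conditioning = {
    "width": 1024,
    "height": 576,
    "video_frames": 25,         # svd_xt targets 25 frames; base svd targets 14
    "motion_bucket_id": 127,    # higher = more overall motion
    "fps": 6,
    "augmentation_level": 0.02, # keep low to limit background wobble
}
```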
4) VideoLinearCFGGuidance
This node works with the familiar CFG value: it scales the guidance linearly across the frames, starting from min_cfg on the first frame and ramping up to the KSampler’s CFG on the last, which keeps the early frames close to your input image.
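To make that concrete, here is a minimal sketch of the per-frame guidance this node produces, assuming the standard linear ramp from the node’s min_cfg up to the KSampler’s CFG:

```python
# Assumed behavior of VideoLinearCFGGuidance: guidance ramps linearly
# from min_cfg at the first frame to the sampler's cfg at the last.
def frame_cfg(i: int, n_frames: int, min_cfg: float, cfg: float) -> float:
    t = i / (n_frames - 1) if n_frames > 1 else 1.0  # 0.0 first, 1.0 last
    return min_cfg + (cfg - min_cfg) * t
```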
5) FreeU_V2
FreeU_V2 re-weights the diffusion U-Net’s internal features (its backbone and skip connections), which can improve output quality at no extra cost. For a better understanding, check out the link below.
–> FreeU: Free Lunch in Diffusion U-Net
6) KSampler
KSampler is the node that performs the actual sampling in ComfyUI. Connect all the nodes mentioned above to the KSampler and set your desired values. If you have experience with Stable Diffusion or ComfyUI, these settings will be familiar. Newcomers can copy the values shown in the image above.
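If the values in the image are hard to read, here is a sketch of reasonable starting settings. Everything below is an assumption on my part, not a copy of the screenshot; the main point is that SVD likes a much lower CFG than text-to-image work:

```python
# Assumed KSampler starting values for SVD (not read from the image).
ksampler_settings = {
    "steps": 20,
    "cfg": 2.5,                # SVD wants far lower CFG than txt2img
    "sampler_name": "euler",
    "scheduler": "karras",
    "denoise": 1.0,
}
```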
7) VAE Decode
VAE Decode turns the KSampler’s output back into actual image frames. Connect the KSampler’s LATENT output to the samples input, and link the vae input to the VAE output of the checkpoint loader you set up earlier.
8) Seed (rgthree)
While you may be familiar with seed values from Automatic1111, the KSampler in ComfyUI does not accept -1 (random) as a seed. So, if you want a fresh random seed on every run, you need to supply it from a separate node, the rgthree Seed node, as shown in the image above.
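Conceptually, all the Seed node does on “randomize” is draw a fresh integer each run and feed it into the KSampler’s seed input. A toy illustration (this is not the rgthree node’s actual code):

```python
import random

# Toy stand-in for a "randomize each run" seed node: pick a fresh
# integer and hand it to the sampler instead of a hard-coded value.
seed = random.randint(0, 2**63 - 1)
print(f"seed for this run: {seed}")
```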
9) RIFE VFI
RIFE, or Real-time Intermediate Flow Estimation, is an algorithm for Video Frame Interpolation (VFI). Many flow-based VFI methods estimate bi-directional optical flows, then scale and reverse them to approximate intermediate flows, which can lead to artifacts on motion boundaries. –> Source: Paperswithcode
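In this workflow, RIFE inserts in-between frames, which smooths the motion and effectively multiplies the frame rate. As a toy illustration of what frame interpolation does to the frame count (real RIFE estimates intermediate optical flow with a neural network rather than blending pixels):

```python
import numpy as np

def naive_interpolate(frames: list[np.ndarray]) -> list[np.ndarray]:
    """Insert a linear blend between each pair of frames, roughly
    doubling the frame count. RIFE does this far better by estimating
    intermediate optical flow instead of blending pixels."""
    out = []
    for a, b in zip(frames[:-1], frames[1:]):
        out.append(a)
        blend = (a.astype(np.float32) + b.astype(np.float32)) / 2
        out.append(blend.astype(a.dtype))
    out.append(frames[-1])
    return out
```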
10) Video Combine
Finally, Video Combine is the node where you can preview the generated video and save it to a file.
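For reference, a sketch of the settings I would use here. The field names follow the Video Combine node from the VideoHelperSuite pack, and the frame rate assumes RIFE doubled the 6 fps set in the conditioning node; treat all of it as an assumption:

```python
# Assumed Video Combine settings (VideoHelperSuite field names).
video_combine_settings = {
    "frame_rate": 12,              # 6 fps from conditioning, doubled by RIFE
    "loop_count": 0,               # 0 = loop the preview indefinitely
    "filename_prefix": "svd_anim", # hypothetical output name
    "format": "video/h264-mp4",
    "pingpong": False,
    "save_output": True,
}
```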
For those who find this process too difficult, you can download the .json file below and simply use it by pressing the Load button in ComfyUI.
3. Generated Animation
After connecting these nodes as required and initiating the generation process, the image you selected will be turned into a short animated video. I tested this using a photo depicting two rockets flying through the sky. Shall we look at the result together?
Isn’t it amazing? The result is similar to what I posted before using Stable Diffusion’s AnimateDiff, but the fact that SVD interprets and animates the provided image without any prompt at all is astounding. Of course, in my case, perhaps because the subject didn’t occupy much of the frame, the background has a slightly wobbly feel, but this can be compensated for by adjusting a few of the settings above!
It might seem quite complicated, but try following the workflow I’ve provided and create your own to produce even more fantastic results! Stay tuned for my next post where I’ll explain more new features!