Hello everyone~! This is A2SET, where we share information about AI that you might be curious about.
Today, I’d like to explain how to train a LoRA, which I mentioned in a previous post. There are many ways to do this, and depending on your hardware, some will feel easy and others difficult. Today, I will explain the method we are testing ourselves.
What is LoRA?
First, let me give a brief explanation of LoRA! LoRA (Low-Rank Adaptation) is a training technique for fine-tuning the Stable Diffusion model.
–> Click here to read more about LoRA
For those who want to know more about LoRA, please refer to the LoRA post we shared previously. It will be very helpful in understanding today’s content.
1. The Need for a Trained Model
Many of you will want to create a LoRA to produce impressive results or to generate data that is uniquely yours. If that sounds like you, note that many platforms already generate AI images from your facial data:
- Media.io AI
- Unique real-time face generator
- Night Cafe AI
- This Person Does Not Exist
- BoredHumans.com
- Fotor
While such platforms can yield impressive results, you may have been slightly disappointed that they never hand over the model trained on your facial data; they only deliver already-processed images. Today, we will create a LoRA model trained on your own face and extract a variety of results tailored to your personal taste.
2. What We Need
To create a LoRA model, you will need the following materials:
- Kohya_SS
- At least 14 photos of the subject you want to train
- Model data that will serve as the base for training
- Regularization data (optional)
If these four items are ready, you can proceed with the training immediately. I will explain each one and then we can start the training right away.
1) Kohya_SS
Kohya_SS is a Python library for fine-tuning Stable Diffusion models. It is friendly to consumer-grade GPUs and compatible with AUTOMATIC1111’s WebUI. In other words, you can think of it as a dedicated training program capable of producing LoRA data.
This program has quite a few settings, and depending on your hardware, some training runs will be comfortable and others overwhelming, so it’s good to check your computer’s specs before proceeding. If the computer you can use is not very powerful, you can also train in the cloud through Google Colab, so you don’t need to worry too much.
2) Photos of the Data You Wish to Train (At Least 14)
You need photos of the subject you want to train, taken from various angles against a simple background so the results come out well. One point to note: if you are training a LoRA model for SDXL (Stable Diffusion XL), the images should be 1024×1024. If you are training a LoRA for SD1 or SD1.5 rather than SDXL, 512×512 is also acceptable.
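If you want to batch-prepare your photos to the right square size, a minimal sketch using Pillow looks like this. The function name and paths are my own examples, not part of Kohya; it simply center-crops each image to a square and resizes it:

```python
from PIL import Image

def prepare_image(src_path: str, dst_path: str, size: int = 1024) -> None:
    """Center-crop an image to a square and resize it to size x size.

    Use size=1024 for SDXL LoRA training, size=512 for SD1/SD1.5.
    """
    img = Image.open(src_path).convert("RGB")
    w, h = img.size
    side = min(w, h)
    left = (w - side) // 2
    top = (h - side) // 2
    img = img.crop((left, top, left + side, top + side))   # square center crop
    img = img.resize((size, size), Image.LANCZOS)          # high-quality downscale
    img.save(dst_path)
```

The same helper works for regularization images later, since they should be organized at the same sizes.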
3) Model Data to Serve as the Training Base
This is the base model matching the version you want to train against, and it is essential. For instance, if you wish to create SDXL LoRA data, you must prepare the SDXL base model recommended in the previous post for compatibility. Likewise, for SD1 prepare the SD1 base model, and for SD1.5 the SD1.5 base model, so that your model trains properly.
4) Regularization Data (Optional)
Regularization data is reference data that helps the model generalize, and it is especially useful when training on real faces. For example, when training the face of a man named A, it is possible to train with just A’s photos. But if you want the AI to represent A’s likeness accurately in a variety of situations, it helps to also show it the faces of many other men as a reference class. Collecting varied real-life reference photos, such as men’s faces under dim lighting, men’s faces against bright backgrounds, and faces of male models, will greatly aid the creation of your desired LoRA later on. It is recommended to collect copyright-free regularization data and to organize these images at either 512×512 or 1024×1024.
3. Installing Kohya (Windows Version)
First, press the Windows button.
Search for PowerShell, and when Windows PowerShell appears, run it as an administrator. Then type Set-ExecutionPolicy Unrestricted and press Enter. When asked whether you want to change the execution policy, press A to proceed.
After that, type Get-ExecutionPolicy and press Enter; if it returns Unrestricted, the execution policy has been set successfully.
Next, navigate to the directory where you want to install kohya_ss. You can do this by right-clicking in the folder where you want to install it and selecting “Open in Terminal,” or by opening a PowerShell window directly.
git clone https://github.com/bmaltais/kohya_ss.git
cd kohya_ss
python -m venv venv
.\venv\Scripts\activate
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
pip install --use-pep517 --upgrade -r requirements.txt
pip install -U -I --no-deps https://github.com/C43H66N12O12S2/stable-diffusion-webui/releases/download/f/xformers-0.0.14.dev0-cp310-cp310-win_amd64.whl
cp .\bitsandbytes_windows\*.dll .\venv\Lib\site-packages\bitsandbytes\
cp .\bitsandbytes_windows\cextension.py .\venv\Lib\site-packages\bitsandbytes\cextension.py
cp .\bitsandbytes_windows\main.py .\venv\Lib\site-packages\bitsandbytes\cuda_setup\main.py
After you copy and paste the commands above, the process will proceed automatically. You will need to wait quite a while, but eventually it should reach the state shown.
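Before configuring accelerate, you can optionally confirm that the virtual environment sees PyTorch and your GPU. This is just a sanity-check snippet of my own, not part of Kohya’s setup:

```python
# Run inside the activated venv to confirm PyTorch imported correctly
# and that it can see your CUDA device.
def check_torch() -> str:
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    if torch.cuda.is_available():
        return f"CUDA OK: torch {torch.__version__} on {torch.cuda.get_device_name(0)}"
    return f"torch {torch.__version__} installed, but no CUDA device was found"

print(check_torch())
```

If no CUDA device is found, recheck the pip install step for the +cu116 wheels above.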
Next, type accelerate config and press Enter, and you will see the following prompts.
In which compute environment are you running?
This Machine
Which type of machine are you using?
No distributed training
Do you want to run your training on CPU only (even if a GPU is available)? [yes/NO]:NO
Do you wish to optimize your script with torch dynamo?[yes/NO]:NO
Do you want to use DeepSpeed? [yes/NO]: NO
What GPU(s) (by id) should be used for training on this machine as a comma-separated list? [all]:all
Do you wish to use FP16 or BF16 (mixed precision)?
fp16
Once you have answered as above, the configuration is saved, and the kohya_ss folder in the directory you specified earlier will contain all the necessary subfolders.
** If you are using an NVIDIA RTX 30- or 40-series graphics card, please follow the steps below!
–> cuDNN Archive Download Link
Open the link, download the file, unzip it, rename the folder to cudnn_windows, and place it in the kohya_ss\kohya_ss directory. Next, open PowerShell and continue with the setup.
cd C:\kohya_ss\kohya_ss
./venv/Scripts/activate
python .\tools\cudann_1.8_install.py
If you follow the instructions, you will see output similar to the image provided. For 30- and 40-series graphics cards, this speeds up LoRA training.
Now that Kohya_ss is installed, run the gui.bat file in the kohya_ss folder and open the URL that appears in the familiar cmd window; you will see a UI similar to the WebUI for Stable Diffusion.
4. Specifying the Model File Path and Converting Training Data
Step 1 : Set Models
Now let’s take a look at the executed Kohya UI and guide you through the settings needed to create a LoRA model.
Enter the LoRA tab, which is the second tab from the top, then click on the Training tab below and specify the settings for the Source Model. Since we are going to train the LoRA based on the SDXL model base, set all the boxes marked in red identically as shown above.
Step 2 : Captioning Images
The next step is to enter the Utilities tab, which is in the same row as the LoRA tab, and then click on Captioning below, followed by clicking on the wd14 Captioning tab on the lower right.
To proceed as shown in the red box, you need to set the necessary paths. First, for the Image folder to caption, you simply need to set the folder path that contains at least 14 images you want to train.
The .txt extension on the right means that captioning will automatically create a text file containing a prompt describing each image; keep this in mind. Leave all other settings, including the Batch Size at the bottom, as they are, and then click the Caption Images button.
After a bit of waiting, if you go into the image folder, you will see a .txt file created next to each image. Here is the important part: you will now customize the text files containing the prompts for each image. You can leave them as they are, but since you’re going to the effort of training the model, it’s better to do it properly.
You will need to invest some labor time. Even though it may be tedious, please open each text file describing the images and write a more detailed description along with the Trigger Prompt you want.
For example, in my case, I would write BMwarrior (Barbarian-Man-warrior) as a Trigger Prompt at the very front and then slightly change the text content before saving it.
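If you have many caption files, prepending the trigger prompt by hand gets tedious. A small helper like the following can do the prepending automatically (the folder path and the BMwarrior trigger are the examples from this post; substitute your own, and still edit the descriptions by hand afterwards):

```python
from pathlib import Path

def add_trigger_word(caption_dir: str, trigger: str = "BMwarrior") -> int:
    """Prepend a trigger word to every .txt caption file in a folder.

    Skips files that already start with the trigger. Returns the number
    of files modified.
    """
    modified = 0
    for txt in Path(caption_dir).glob("*.txt"):
        text = txt.read_text(encoding="utf-8").strip()
        if not text.startswith(trigger):
            # Put the trigger prompt at the very front, as described above.
            txt.write_text(f"{trigger}, {text}", encoding="utf-8")
            modified += 1
    return modified
```

Run it once on your image folder, then open each file to refine the descriptions.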
Step 3 : Prepare for Data
Once you have roughly prepared everything, it’s time to organize the folder paths.
Go back to the LoRA tab and enter the Tools tab, where you will find a tab named Deprecated on the right. Click on this tab and fill in the content displayed. In the Instance Prompt, enter BMwarrior as mentioned above, and in the Class Prompt, enter man. Then, for the Training images path, select the folder that contains both the images and the captioned texts.
The Repeats option on the right indicates how many times each image is repeated during training. More repeats mean more training steps per image, which can improve the result, but too many repeats lengthen training considerably on an ordinary computer and risk overfitting, so we will leave it at 40 for now.
The Regularization images section below uses the variety of real-life photos of other men’s faces prepared earlier as a reference class for training a male face. In other words, the captioned training images above lean on this regularization data to learn more diverse representations, so the more such data you can secure, the better. The Repeats value for this part will be left at 1.
Finally, the Destination training section is where you specify the final file path so that all the contents set above are organized within a single folder. You can create and designate a folder wherever you prefer for the final organization.
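The Deprecated tool builds this folder layout for you, but if you prefer to set it up yourself, the Dreambooth-style structure Kohya expects can be sketched like this. The repeat count, trigger, and class names mirror the example in this post and are only illustrative:

```python
import os

def make_kohya_dirs(root: str, repeats: int = 40,
                    trigger: str = "BMwarrior", cls: str = "man") -> None:
    """Create the Dreambooth-style folder layout Kohya expects.

    Training images go in img/{repeats}_{trigger} {cls}, regularization
    images in reg/1_{cls}; log/ and model/ receive the training outputs.
    """
    for sub in (f"img/{repeats}_{trigger} {cls}", f"reg/1_{cls}", "log", "model"):
        os.makedirs(os.path.join(root, sub), exist_ok=True)
```

The leading number in each image folder name is how Kohya reads the repeat count, so `40_BMwarrior man` means 40 repeats of the BMwarrior instance within the man class.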
Let’s check if all the folders have been properly designated.
By re-entering the LoRA tab and going into the Training tab, then pressing Folders, you will see the paths all set up as shown below. In this window, you now need to name the model that you will train. I will name it BMwarrior_v1, consistent with what was mentioned earlier. All the folder and path preparations, as well as data readiness, are now complete. Finally, I will enter the Parameter tab to set the configuration values and then begin training.
Step 4 : Train Setting
Now we are at the very last step! Have you been able to follow along so far? We need to enter the Parameter tab and make a few settings. The UI window is quite long, so I will only summarize the essential elements to check along with the image in text form.
Firstly, you can proceed with all settings identical to those shown in the image above. Of course, it’s not that the above settings are 100% correct, but they are recommended settings that are suitable for training a decent LoRA SDXL model. I will explain each and every setting in the Parameter tab in detail in the future. Today, since we need to test, let’s proceed quickly~!
Please copy and paste the following text into the “Optimizer extra arguments” section, just like in the image:
scale_parameter=False relative_step=False warmup_init=False
These arguments configure the optimizer (Adafactor, in this setup) to use a fixed learning rate instead of its internal schedule, which helps keep training stable, so please just take them as a reference. Now change the Resolution value to 1024×1024, the native resolution for SDXL, and proceed with all other settings exactly as shown in the image.
Finally, expand the “Advanced Configuration” tab just below, select “Gradient checkpointing,” and then click the “Start Training” button at the bottom.
Step 5 : Time of patience
The training process will now commence, and it may take some time to complete. Please note that the training duration can vary significantly depending on the batch size and the number of epochs you have set at the beginning of the Training Setting. Keep this in mind as your training progresses.
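As a rough guide to how long you will wait, the total step count in a Kohya run is approximately the repeated training images plus the repeated regularization images, multiplied by epochs and divided by batch size. This little calculation is my own sketch of that estimate; the exact count can differ slightly depending on your settings:

```python
def total_steps(n_images: int, repeats: int, n_reg: int, reg_repeats: int,
                epochs: int, batch_size: int) -> int:
    """Rough estimate of total training steps for a Kohya LoRA run.

    Each epoch covers the repeated training images plus the repeated
    regularization images, processed batch_size at a time.
    """
    per_epoch = n_images * repeats + n_reg * reg_repeats
    return per_epoch * epochs // batch_size

# Example from this post: 14 photos x 40 repeats, with 560 regularization
# images x 1 repeat, 1 epoch, batch size 1.
print(total_steps(14, 40, 560, 1, epochs=1, batch_size=1))
```

Raising the batch size cuts the step count but needs more VRAM, which is why the defaults are conservative.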
After a long wait, the LoRA model has finally been created! Let’s give a round of applause to our computers for their hard work… haha. Now that we have this model, it’s time to put it to use, right? In the following section, I will guide you on how to utilize the LoRA model, so please follow along to proceed!
We’ve gone through the process of training a LoRA model. How was it?
It’s quite complex and difficult, right? However, as you can see, if you have everything prepared as above, anyone can take on this challenge quickly, so don’t be afraid and give it a try right away! Being able to create your own AI data is incredibly attractive. Next time, I’ll go into more detail about the Parameter tab settings that I couldn’t explain today. Finish your day well, and I’ll greet you again with fresh news!