Stable Diffusion ComfyUI Note 03 – How To Download SD Models

Last Updated on 2024-08-12 by Clay

When using ComfyUI to generate images, we need to leverage the capabilities of various models to ultimately form a complete workflow. In other words, these so-called various models together constitute what we call Stable Diffusion. Today, I will introduce where to download these models.


Introduction to Stable Diffusion Models

To further elaborate, Stable Diffusion can be seen as a combination of 3 + 1 models:

  1. Stable Diffusion Checkpoint (UNet Model)
    • The core denoising model that generates the image step by step in latent space
  2. VAE (Variational Autoencoder) Model
    • Decodes the latent representation produced by the UNet back into a pixel-space image
  3. CLIP Model
    • An embedding model that maps text and images into a shared vector space, so that matching text and images end up close together
    • Used to match text prompts with images
    • Relevant research can be found at: https://openai.com/index/clip/
  4. Other Supplementary Models and Modules (Optional)
    • LoRA Models: LoRA models can be added as needed to adjust the style of the generated images
    • Supplementary Modules: These resources can be obtained from various projects on GitHub, such as awesome-comfyui etc.

But for now, let’s just focus on the Stable Diffusion Checkpoint. The term “checkpoint” literally refers to a saved snapshot of model weights, but you can think of it as the core model itself.

The Checkpoint itself comes in many versions of Stable Diffusion, and each version has its own set of compatible LoRA models, because the architectures differ and a LoRA must be fine-tuned against a fixed model architecture.

For now, I recommend using the SD 1.5 version, because most of the LoRA models I use are also SD 1.5 versions. The relationship between the Checkpoint and LoRA is illustrated in the diagram below:

The X at the bottom can be thought of as our input. In the original process, the input X passes through the model’s weight W to produce the output h. With LoRA added, X not only passes through the model’s original weight W but also through LoRA’s low-rank weight matrices A and B; that result is added to the original model’s output to obtain a different h.

The inference (image generation) process remains the same, only additional information is added.
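The process above can be sketched in a few lines of NumPy. This is a minimal illustration of the LoRA idea (h = Wx + B·A·x, with the low-rank update scaled by alpha/r), not ComfyUI's actual implementation; the dimensions, names, and scaling value here are made up for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r = 8, 8, 2   # toy dimensions; r is the LoRA rank (r << d)
alpha = 4.0                # LoRA scaling factor (illustrative value)

W = rng.normal(size=(d_out, d_in))        # frozen original weight
A = rng.normal(size=(r, d_in)) * 0.01     # trainable down-projection
B = np.zeros((d_out, r))                  # trainable up-projection (starts at zero)

x = rng.normal(size=(d_in,))              # the input X

h_original = W @ x                                # original path: h = W x
h_lora = W @ x + (alpha / r) * (B @ (A @ x))      # LoRA path: h = W x + (alpha/r) B A x

# B is initialized to zero, so before any fine-tuning the LoRA branch
# contributes nothing and both outputs are identical.
assert np.allclose(h_original, h_lora)
```

Because only A and B are trained, a LoRA file is far smaller than the checkpoint it modifies, which is why one base model can be paired with many interchangeable LoRAs.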

Different LoRA models are fine-tuned for different purposes and styles, which we may introduce later. But for now, let’s return to the topic of model downloads!


Model Downloads

First, let’s head to the largest platform for downloading Stable Diffusion models, Civitai. Here, you’ll find a plethora of models to download, guaranteed to keep you occupied for months! The rate of model releases is faster than you can download them!

Next, we’ll download a different Checkpoint model. Of course, you can also download LoRA models later to change the style of the generated images. Here, I recommend a model called 2D Gyaru Mix. Its art style is very clean and generates anime characters with a nice effect, without the “oily feel” that many people criticize in AI-generated images online.

After clicking into the desired model page, we can browse the model information, check if it has the architecture we want (SD 1.5), look at the author’s notes, and see if the license has any restrictions or if specific prompts are needed, etc.

Once everything is confirmed, we can download the model into the ComfyUI/models/checkpoints/ directory.

Next, we return to the ComfyUI interface in the browser and remember to hit Refresh on the right-hand side of the workspace so the system can detect the latest models. Then, we load the workflow that we previously prepared:
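Conceptually, the Refresh button just rescans the model folders for checkpoint files. A rough sketch of what that scan amounts to (a simplification with a hypothetical helper name, not ComfyUI's actual code):

```python
import tempfile
from pathlib import Path

def list_checkpoints(models_dir: str) -> list[str]:
    """Return checkpoint filenames found under models_dir (hypothetical helper)."""
    exts = {".safetensors", ".ckpt"}
    return sorted(p.name for p in Path(models_dir).iterdir() if p.suffix in exts)

# Example with a temporary stand-in for ComfyUI/models/checkpoints/
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "2dGyaruMix.safetensors").touch()
    (Path(tmp) / "sd-v1-5.ckpt").touch()
    (Path(tmp) / "notes.txt").touch()   # non-model file, ignored by the scan
    print(list_checkpoints(tmp))        # → ['2dGyaruMix.safetensors', 'sd-v1-5.ckpt']
```

If a newly downloaded model does not show up in the dropdown, check that it landed in the right folder and has a recognized extension before hitting Refresh again.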

Then, you only need to change one thing: the ckpt_name in the Load Checkpoint card. Change it to the new model you downloaded that you want to try.
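If you ever drive ComfyUI through its API-format workflow JSON instead of the browser UI, swapping the checkpoint is the same one-field edit. This sketch assumes the API (prompt) format, where the Load Checkpoint node appears as a `CheckpointLoaderSimple` entry; the node id and filenames below are made up for illustration:

```python
import json

# A minimal API-format workflow fragment (node id "4" is arbitrary here)
workflow = {
    "4": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "sd-v1-5.safetensors"},
    }
}

def set_checkpoint(wf: dict, new_name: str) -> dict:
    """Point every Load Checkpoint node at a new model file."""
    for node in wf.values():
        if node.get("class_type") == "CheckpointLoaderSimple":
            node["inputs"]["ckpt_name"] = new_name
    return wf

set_checkpoint(workflow, "2dGyaruMix.safetensors")
print(json.dumps(workflow, indent=2))
```

The filename must match one of the entries ComfyUI lists in the checkpoint dropdown, i.e., a file present in `ComfyUI/models/checkpoints/`.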

Does the effect look very different?

You can try out different Checkpoints (base models) at will. Next time, we’ll document how to use LoRA to add style and how to integrate LoRA into an existing workflow!

