
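This article shows how to do ControlNet inpainting with the diffusers library and briefly compares diffusers with stable-diffusion-webui.

In Automatic1111's stable-diffusion-webui, we can combine the inpaint feature under img2img with the ControlNet extension to get ControlNet-guided inpainting. The same flow can also be driven through API parameters, which makes it possible to deploy the setup in the cloud with automatic scaling.

Recently I looked into another way to do ControlNet inpainting: the diffusers library. It is Hugging Face's official library for diffusion-based image generation, it integrates both Stable Diffusion and ControlNet, and it can be called with much less ceremony.

The sections below walk through ControlNet inpainting with diffusers step by step and close with a short comparison against stable-diffusion-webui.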

Install dependencies

Installing the following packages with pip is all that is needed:

torch==2.0.1
diffusers
transformers
omegaconf
accelerate
opencv-python
xformers
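
After installation, it can help to confirm that every package is actually present before loading any model. The sketch below is not from the original setup; it relies only on the standard library's importlib.metadata, and the helper name installed_versions is hypothetical.

```python
from importlib import metadata

# Distribution names as listed in the requirements above.
REQUIRED = [
    "torch", "diffusers", "transformers", "omegaconf",
    "accelerate", "opencv-python", "xformers",
]

def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Note that the lookup is by distribution name, so a variant such as opencv-python-headless would be reported as missing even though cv2 imports fine.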

Load the ControlNet model and the Stable Diffusion model

First load the ControlNet model, using the canny model as the example:

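import torch  # only needed if torch_dtype is enabled below
from diffusers import ControlNetModel
from diffusers import UniPCMultistepScheduler

controlnet_canny = ControlNetModel.from_pretrained(
    # Canny ControlNet, matching the canny edge map used as control_image later
    "lllyasviel/control_v11p_sd15_canny",
    # torch_dtype=torch.float16,  # leave commented out on a Mac
    cache_dir="./cache",  # where downloaded weights are stored
    local_files_only=True,  # set to False on the first run so the model can download
    # resume_download=True,
)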

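Although diffusers also uses safetensors weights, the files it needs at load time differ from what the webui expects: diffusers loads a repository with one folder per component rather than a single checkpoint file. We therefore need a model that has already been converted, which we can find by searching huggingface.co/models. Most commonly used models already have a converted version there.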

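In the code above, the ControlNet checkpoint is loaded with ControlNetModel.from_pretrained. A few notes on the arguments: on a Mac, the torch_dtype argument has to stay commented out; cache_dir sets the local directory where downloaded models are saved; resume_download controls whether an interrupted download is resumed; and local_files_only is for the case where the model has already been downloaded, so nothing is fetched again. For more parameters, see: huggingface.co/docs/diffus…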
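
Because local_files_only=True fails when the cache is still empty, one option is to try the cache first and download only on a miss. The following is a minimal sketch, not part of the original code: load_local_first is a hypothetical helper, and it assumes a cache miss surfaces as OSError or one of its subclasses.

```python
def load_local_first(loader, model_id, **kwargs):
    """Try the local cache first; fall back to downloading if files are missing.

    `loader` is any from_pretrained-style callable that accepts
    `local_files_only`, for example ControlNetModel.from_pretrained.
    Assumption: a cache miss is raised as OSError (or a subclass).
    """
    try:
        return loader(model_id, local_files_only=True, **kwargs)
    except OSError:
        # Nothing usable in the cache, so allow network access this time.
        return loader(model_id, local_files_only=False, **kwargs)
```

It would be called as load_local_first(ControlNetModel.from_pretrained, model_id, cache_dir="./cache"), where model_id is whichever repository you are loading.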

Next, load the Stable Diffusion model and put the ControlNet model loaded above into a pipeline:

from diffusers import StableDiffusionControlNetInpaintPipeline, UniPCMultistepScheduler

canny_pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "Uminosachi/realisticVisionV51_v51VAE-inpainting",
    controlnet=controlnet_canny,  # the ControlNet loaded above
    # torch_dtype=torch.float16,
    cache_dir="./cache",
    local_files_only=True,
    safety_checker=None,  # disable the built-in safety checker
    use_safetensors=True,
    # resume_download=True,
).to("cpu")  # use "cuda" if a CUDA GPU is available
# Swap in UniPC, reusing the configuration of the default scheduler
canny_pipe.scheduler = UniPCMultistepScheduler.from_config(canny_pipe.scheduler.config)

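Since the task is ControlNet inpainting, we use StableDiffusionControlNetInpaintPipeline from diffusers. The base model can again be a ready-made one found on Hugging Face. The controlnet argument specifies which ControlNet model goes into the pipeline. The final to('cpu') can be changed to to('cuda') if CUDA is installed on the machine. Lastly we set a scheduler, which plays roughly the same role as the sampler in the webui; UniPCMultistepScheduler appears to be the commonly recommended choice at the moment.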
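
The device and dtype choices described above can be folded into one small helper. This is only a sketch under stated assumptions: choose_device_and_dtype and detect are hypothetical names, and the "mps" branch for Apple silicon is an addition that the walkthrough above does not cover (it simply runs on the CPU).

```python
def choose_device_and_dtype(has_cuda, has_mps=False):
    """Return (device, dtype_name) for loading the pipeline.

    float16 is chosen only for CUDA. CPU and Apple-silicon ("mps", an
    assumption) fall back to float32, in line with the advice to leave
    torch_dtype unset on a Mac.
    """
    if has_cuda:
        return "cuda", "float16"
    if has_mps:
        return "mps", "float32"
    return "cpu", "float32"

def detect():
    """Probe the local machine; requires torch to be installed."""
    import torch
    mps = getattr(torch.backends, "mps", None)
    has_mps = mps is not None and mps.is_available()
    device, dtype_name = choose_device_and_dtype(torch.cuda.is_available(), has_mps)
    return device, getattr(torch, dtype_name)
```

The returned pair can then be passed as torch_dtype=... to from_pretrained and as the argument of .to(...).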

Image preprocessing

For canny-guided generation, we first preprocess the input image in code to extract its edge outline. The opencv-python library is enough for this:

import numpy as np
from PIL import Image
import cv2

def process_image(image, structure, low_threshold=100, high_threshold=200):
    # Dispatch to the preprocessor that matches the ControlNet type
    if structure == "canny":
        return canny_preprocessor(image, low_threshold, high_threshold)
    raise ValueError(f"Unsupported structure: {structure}")

def canny_preprocessor(image, low_threshold, high_threshold):
    # Convert the PIL image to a numpy array
    image = np.array(image)
    # Detect edges; the result is a single-channel uint8 map
    image = cv2.Canny(image, low_threshold, high_threshold)
    # Repeat the edge map across three channels to get an RGB-shaped image
    image = image[:, :, None]
    image = np.concatenate([image, image, image], axis=2)
    canny_image = Image.fromarray(image)
    return canny_image

input_image = Image.open("111.jpeg").convert("RGB")
canny_image = process_image(input_image, "canny", low_threshold=100, high_threshold=200)

For preprocessors other than canny, see: github.com/anotherjess…

Run diffusers

Now we can call the model to run the inpainting:

import os
import torch

@torch.inference_mode()
def predict():
    pipe = canny_pipe
    # If xformers is installed, uncomment the next line; otherwise leave it commented out
    # pipe.enable_xformers_memory_efficient_attention()

    seed = int.from_bytes(os.urandom(2), "big")
    print(f"Using seed: {seed}")
    # Seed a generator so the run can be reproduced from the printed value
    generator = torch.Generator("cpu").manual_seed(seed)

    mask_image = Image.open("mask.png")
    # Note: diffusers passes webui-style weights such as (text:1.4) through as plain text
    prompt = "RAW photo, 1 girl, 8k uhd, dslr, soft lighting, high quality, film grain, Fujifilm XT3, on the snow"
    negative_prompt = "(deformed iris, deformed pupils, semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime, mutated hands and fingers:1.4), (deformed, distorted, disfigured:1.3), poorly drawn, bad anatomy, wrong anatomy, extra limb, missing limb, floating limbs, disconnected limbs, mutation, mutated, ugly, disgusting, amputation"

    outputs = pipe(
        prompt,
        image=input_image,
        mask_image=mask_image,
        control_image=canny_image,
        height=512,
        width=512,
        num_inference_steps=20,
        guidance_scale=8.0,
        strength=0.75,
        negative_prompt=negative_prompt,
        num_images_per_prompt=1,
        generator=generator,
    )
    # Save every generated image as out-0.png, out-1.png, ...
    for i, sample in enumerate(outputs.images):
        output_path = f"out-{i}.png"
        sample.save(output_path)

predict()

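We first generate a random seed, after which the pipeline can be called more or less directly. The generated images are available as outputs.images, which is a list[PIL.Image], so each one can be saved by calling its .save() method.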
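
Two bytes from os.urandom give only 65,536 possible seeds, and a random seed is hard to reuse when you want to reproduce a good result. The helper below is a small sketch of one way to handle both cases; make_seed and its num_bytes parameter are hypothetical names, not part of diffusers.

```python
import os

def make_seed(seed=None, num_bytes=2):
    """Return `seed` unchanged, or draw a random one from the OS.

    num_bytes=2 mirrors os.urandom(2) in the code above and yields a
    value in 0..65535; a larger num_bytes widens the range.
    """
    if seed is not None:
        return int(seed)
    return int.from_bytes(os.urandom(num_bytes), "big")
```

To make a run reproducible, the resulting value can be used with torch.Generator("cpu").manual_seed(seed), and that generator handed to the pipeline through its generator argument.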

Because this is a ControlNet inpainting task, three images are involved: input_image is the original input image, mask_image is the mask, and control_image is the image produced by the canny preprocessing.
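
The mask itself is not constructed anywhere in this walkthrough. In diffusers inpainting pipelines, white mask pixels are repainted and black pixels are kept, so a simple rectangular mask can be sketched with Pillow as follows. The helper name make_box_mask and the box coordinates are hypothetical and only for illustration.

```python
from PIL import Image, ImageDraw

def make_box_mask(size, box):
    """Build an inpainting mask: white inside `box` (repainted), black elsewhere (kept).

    size: (width, height), which should match the input image.
    box:  (left, top, right, bottom) in pixels, the region to repaint.
    """
    mask = Image.new("L", size, 0)  # start fully black: keep everything
    ImageDraw.Draw(mask).rectangle(box, fill=255)  # white region gets regenerated
    return mask
```

For a 512x512 input, make_box_mask((512, 512), (128, 128, 384, 384)).save("mask.png") would mark the central square for repainting.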

The remaining parameters are mostly self-explanatory. For the full list of options, see: huggingface.co/docs/diffus…
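
One detail worth knowing about height and width: Stable Diffusion pipelines in diffusers expect both to be divisible by 8, and forcing a non-square photo to 512x512 distorts it. The sketch below, with the hypothetical helper fit_size, computes a size that keeps the aspect ratio, fits within a maximum side length, and rounds down to a multiple of 8.

```python
def fit_size(width, height, max_side=512, multiple=8):
    """Shrink (width, height) to fit within max_side, rounded down to a multiple of 8."""
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic avoids floating-point rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    new_w = max(multiple, width // multiple * multiple)
    new_h = max(multiple, height // multiple * multiple)
    return new_w, new_h
```

For a 1024x768 input this gives (512, 384), which could be passed as width and height in place of the fixed 512 values.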

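Summary

Advantages of diffusers over the webui:

Simple to install and simple to call
Smaller footprint, so a Docker image deployed in the cloud should cold-start somewhat faster
Somewhat faster generation

Drawbacks:

The supported parameters differ somewhat from the webui's. For example, at the time of writing there was no good diffusers substitute for the webui's DPM++ SDE Karras sampler, so the output quality may be only average.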
