@averad
Forked from harishanand95/Stable_Diffusion.md
Last active March 6, 2024 17:05
Stable Diffusion on AMD GPUs on Windows using DirectML

🤗 Stable Diffusion for AMD GPUs on Windows using DirectML

Requirements

Installation

Create a Folder to Store Stable Diffusion Related Files

  • Open File Explorer and navigate to your preferred storage location.
  • Create a new folder named "Stable Diffusion" and open it.
  • Click the address bar in File Explorer, type cmd, and press Enter to open a command prompt in that folder.

Install 🤗 diffusers

The following steps create a virtual environment (using venv) named sd_env in the folder where the cmd window is open. They then install diffusers and transformers (latest from their main branches), onnxruntime, onnx, onnxruntime-directml, and protobuf, along with torch, ftfy, spacy, and scipy:

pip install virtualenv
python -m venv sd_env
.\sd_env\Scripts\activate
python -m pip install --upgrade pip
pip install git+https://github.com/huggingface/diffusers.git
pip install git+https://github.com/huggingface/transformers.git
pip install onnxruntime onnx torch ftfy spacy scipy
pip install onnxruntime-directml --force-reinstall
pip install protobuf==3.20.1

To exit the virtual environment, close the command prompt. To start it again, open a command prompt in the sd_env\Scripts folder and type activate.
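
Once the environment is active, a quick check can confirm that the packages are importable and that onnxruntime exposes the DirectML provider. The following is a minimal sketch: the helper names `missing_modules` and `directml_available` are illustrative and not part of any library, while `onnxruntime.get_available_providers()` is the call onnxruntime provides for listing providers.

```python
import importlib.util

# Top-level modules the install commands above are expected to provide.
REQUIRED_MODULES = ["diffusers", "transformers", "onnx", "onnxruntime"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def directml_available():
    """Return True if onnxruntime lists the DirectML execution provider."""
    if importlib.util.find_spec("onnxruntime") is None:
        return False
    import onnxruntime as ort  # imported lazily so the check runs without it

    return "DmlExecutionProvider" in ort.get_available_providers()


print("Missing modules:", missing_modules() or "none")
print("DirectML provider available:", directml_available())
```

If the second line prints False while onnxruntime-directml is installed, reinstalling it with --force-reinstall as shown above is a reasonable first thing to try.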

Download the Stable Diffusion ONNX model

You will need to go to: https://huggingface.co/runwayml/stable-diffusion-v1-5 and https://huggingface.co/runwayml/stable-diffusion-inpainting. Review and accept the usage/download agreements before completing the following steps.

  • stable-diffusion-v1-5 uses 5.10 GB
  • stable-diffusion-inpainting uses 5.10 GB

If your model folders are larger, open stable_diffusion_onnx and stable_diffusion_onnx_inpainting and delete the .git folder in each.

git clone https://huggingface.co/runwayml/stable-diffusion-v1-5 --branch onnx --single-branch stable_diffusion_onnx
git clone https://huggingface.co/runwayml/stable-diffusion-inpainting --branch onnx --single-branch stable_diffusion_onnx_inpainting

Enter your Hugging Face credentials when prompted and the download will start. Once it completes, you are ready to start using Stable Diffusion.
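
To compare the downloaded folders against the sizes listed above and to see whether a .git folder is still present, a short script can walk each folder. This is a sketch under the assumption that the folders sit in the current directory under the names used in the clone commands; the function names are illustrative.

```python
import os


def folder_size_bytes(path):
    """Sum the sizes of all regular files below `path` (0 if it is missing)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.isfile(file_path):
                total += os.path.getsize(file_path)
    return total


def report(path):
    """Print and return the folder size in GB and whether .git is present."""
    size_gb = folder_size_bytes(path) / 1024**3
    has_git = os.path.isdir(os.path.join(path, ".git"))
    print(f"{path}: {size_gb:.2f} GB, .git folder present: {has_git}")
    return size_gb, has_git


for model_dir in ("stable_diffusion_onnx", "stable_diffusion_onnx_inpainting"):
    if os.path.isdir(model_dir):
        report(model_dir)
```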

Scripts / Examples

Copy one of the examples below and save it as a .py file. Then run it by typing "python name_of_the_file.py" in a cmd window.

Stable Diffusion Txt 2 Img on AMD GPUs

Here is example Python code for the ONNX Stable Diffusion pipeline using Hugging Face diffusers.

from diffusers import OnnxStableDiffusionPipeline
height=512
width=512
num_inference_steps=50
guidance_scale=7.5
prompt = "a photo of an astronaut riding a horse on mars"
negative_prompt="bad hands, blurry"
pipe = OnnxStableDiffusionPipeline.from_pretrained("./stable_diffusion_onnx", provider="DmlExecutionProvider", safety_checker=None)
image = pipe(prompt, height=height, width=width, num_inference_steps=num_inference_steps, guidance_scale=guidance_scale, negative_prompt=negative_prompt).images[0]
image.save("astronaut_rides_horse.png")

Stable Diffusion Img 2 Img on AMD GPUs

Here is example Python code for the ONNX Stable Diffusion Img2Img pipeline using Hugging Face diffusers.

from PIL import Image
from diffusers import OnnxStableDiffusionImg2ImgPipeline

# Load the source image as RGB so an alpha channel in the PNG does not reach the pipeline.
init_image = Image.open("test.png").convert("RGB")
prompt = "A fantasy landscape, trending on artstation"

pipe = OnnxStableDiffusionImg2ImgPipeline.from_pretrained("./stable_diffusion_onnx", provider="DmlExecutionProvider", revision="onnx", safety_checker=None)
# `image` is the source picture; `strength` sets how far the result may move away from it.
image = pipe(prompt=prompt, image=init_image, strength=0.75, guidance_scale=7.5).images[0]
image.save("test-output.png")

Stable Diffusion Inpainting on AMD GPUs

Here is example Python code for the ONNX Stable Diffusion Inpaint pipeline using Hugging Face diffusers.

from PIL import Image
from diffusers import OnnxStableDiffusionInpaintPipeline

pipe = OnnxStableDiffusionInpaintPipeline.from_pretrained("./stable_diffusion_onnx_inpainting", provider="DmlExecutionProvider", revision="onnx", safety_checker=None)

init_image = Image.open("test.png")
init_image = init_image.resize((512, 512))
mask_image = Image.open("mask.png")
mask_image = mask_image.resize((512, 512))
prompt = "Face of a yellow cat, high resolution"

image = pipe(prompt=prompt, image=init_image, mask_image=mask_image, guidance_scale=7.5).images[0]
image.save("test-output.png")

Inpainting images must be 512 pixels wide and 512 pixels high.

You can make an image mask using Photopea.

Example Txt2Img Script With More Features

  • The user is prompted in the console for image parameters.
  • The date/time, image parameters, and completion time are logged to a text file, "prompts.txt".
  • The image is saved as date-time.png (date-time = the time image generation was started).
  • The user is asked for another prompt, or q to quit.
import os
import gc
import sys
import time
import traceback
import numpy as np
from diffusers import OnnxStableDiffusionPipeline
from diffusers import (
    DDPMScheduler,
    DDIMScheduler,
    PNDMScheduler,
    LMSDiscreteScheduler,
    EulerDiscreteScheduler,
    EulerAncestralDiscreteScheduler,
    DPMSolverMultistepScheduler,
)

output_folder = "complete"
log_folder = "complete"
models_folder = "model"

def choose_model():
    model=None
    while model == None:
        os.system('cls')
        print('Stable Diffusion Onnx DirectML\nText to Img\n')
        model_root = os.path.realpath(os.path.dirname(__file__))+"\\"+models_folder+"\\"
        model_list = [ item for item in os.listdir(model_root) if os.path.isdir(os.path.join(model_root, item)) ]
        if len(model_list) <= 0 or model_list == None:
            call_quit(
                "No models found.\nPlease place your model folders in "+str(model_root)+" or update the "
                "'models_folder' variable in this script")
        model_choices = "Available Models\n"
        x = 1
        for i in model_list:
            model_choices += str(x) + " (" + str(i) + ")\n"
            x += 1
        model_choices += "Please Choose a Model#: (or q to quit): "
        user_input_model = input(model_choices)
        if user_input_model == "q":
            call_quit("Quit Called, Script Ended")
        if user_input_model.isnumeric():
            # Choices are numbered from 1; accepting 0 would silently select the last model.
            if 1 <= int(user_input_model) <= len(model_list):
                model = str(model_root)+str(model_list[int(user_input_model)-1])
        else:
            model = None
    return model

def choose_scheduler(model):
    sched=None
    scheduler_list = [
        [1,DDPMScheduler.from_pretrained(model, subfolder="scheduler"),"DDPMScheduler"],
        [2,DDIMScheduler.from_pretrained(model, subfolder="scheduler"),"DDIMScheduler"],
        [3,PNDMScheduler.from_pretrained(model, subfolder="scheduler"),"PNDMScheduler"],
        [4,LMSDiscreteScheduler.from_pretrained(model, subfolder="scheduler"),"LMSDiscreteScheduler"],
        [5,EulerAncestralDiscreteScheduler.from_pretrained(model, subfolder="scheduler"),
        "EulerAncestralDiscreteScheduler"],
        [6,EulerDiscreteScheduler.from_pretrained(model, subfolder="scheduler"),"EulerDiscreteScheduler"],
        [7,DPMSolverMultistepScheduler.from_pretrained(model, subfolder="scheduler"),
        "DPMSolverMultistepScheduler"],
    ]
    os.system('cls')
    scheduler_choices = "Available Schedulers\n"
    for i in scheduler_list:
        scheduler_choices += str(i[0]) + " (" + str(i[2]) + ")\n"
    scheduler_choices += "Please Choose a Scheduler#: (or q to quit): "
    while sched == None:
        os.system('cls')
        print('Stable Diffusion Onnx DirectML\nText to Img\n')
        user_input_sched = input(scheduler_choices)
        if user_input_sched == "q":
            call_quit("Quit Called, Script Ended")
        for i in scheduler_list:
            if user_input_sched == str(i[0]):
                sched = i[1]
                sched_txt = str(i[2])
    return sched_txt, sched

def loadPipe(model=None, sched=None, sched_txt=None, provider="DmlExecutionProvider"):
    pipe = None
    if model == None:
        model = choose_model()
    if sched_txt == None and sched == None:
        sched_txt, sched = choose_scheduler(model)
    os.system('cls')
    pipe = OnnxStableDiffusionPipeline.from_pretrained(
        model,
        revision="onnx",
        provider=provider, 
        safety_checker=None,
        scheduler=sched,
    )
    return model, pipe, sched_txt, sched

def txt_to_img(prompt, negative_prompt, num_inference_steps, guidance_scale, width, height, seed):
    gen_time = time.strftime("%m%d%Y-%H%M%S")
    rng = np.random.RandomState(seed)
    start_time = time.time()
    image = pipe(
        prompt,
        height,
        width,
        num_inference_steps,
        guidance_scale,
        negative_prompt,
        generator = rng,
        ).images[0]
    # Save into the configured output folder instead of a hard-coded path.
    image.save(os.path.join(output_folder, gen_time + ".png"))
    del image
    del rng
    gc.collect()
    log_info = "\n" + gen_time + " - Seed: " + str(seed) + " - Gen Time: "+ str(time.time() - start_time) + "s"
    with open('./'+log_folder+'/prompts.txt', 'a+', encoding="utf-16") as f:
        f.write(log_info)

def check_folders(output_folder, log_folder, models_folder):
    output_folder_check = os.path.isdir(output_folder)
    if not output_folder_check: 
        os.makedirs(output_folder)
    log_folder_check = os.path.isdir(log_folder)
    if not log_folder_check:
        os.makedirs(log_folder)
    models_folder_check = os.path.isdir(models_folder)
    if not models_folder_check:
        os.makedirs(models_folder)

def call_quit(msg):
    # Release the module-level pipeline, if one has been loaded, before exiting.
    global pipe
    try:
        del pipe
    except NameError:
        pass
    gc.collect()
    os.system('cls')
    sys.exit(str(msg))

check_folders(output_folder, log_folder, models_folder)
user_input = None
prev_height = None
prev_width = None
reload = False
error = ["", False]
model, pipe, sched_txt, sched = loadPipe()
while user_input == None:
    os.system('cls')
    print('Stable Diffusion Onnx DirectML (' + model + ' - ' + sched_txt + ')\nText to Img\n')
    prompt=None
    while prompt == "" or prompt == None:
        prompt = input('Please Enter Prompt (or q to quit): ')
    if prompt != "q":
        negative_prompt = input('Please Enter Negative Prompt (Optional): ')
        variations=None
        while variations == None:
            variations = input('How Many Images? (Optional): ')
            if variations.isnumeric() == False:
                variations = None
            if variations == 0 or variations == "" or variations == None :
                variations = "1"
        num_inference_steps = input('Please Enter # of Inference Steps (Optional): ')
        if num_inference_steps.isnumeric() == False:
            num_inference_steps = 50
        guidance_scale = input('Please Enter Guidance Scale (Optional): ')
        # Accept decimal values such as 7.5; fall back to the default otherwise.
        try:
            guidance_scale = float(guidance_scale)
        except ValueError:
            guidance_scale = 7.5
        width = input('Please Enter Width 512 576 640 704 768 832 896 960 (Optional): ')
        if width.isnumeric() == False:
            width = 512
        if prev_width != None:
            if prev_width != width:
                prev_width = width
                reload = True
        else:
            prev_width = width
        height = input('Please Enter Height 512 576 640 704 768 832 896 960 (Optional): ')
        if height.isnumeric() == False:
            height = 512
        if prev_height != None:
            if prev_height != height:
                prev_height = height
                reload = True
        else:
            prev_height = height
        seed = input('Please Enter Seed (Optional): ')
        if seed.isnumeric() == False:
            seed = None
        gen_time = time.strftime("%m%d%Y-%H%M%S")
        log_info = "\n" + gen_time + " - Model: " + model + " Scheduler: " + sched_txt
        log_info += "\n" + gen_time + " - Prompt: " + prompt
        log_info += "\n" + gen_time + " - Neg_Prompt: " + negative_prompt
        log_info += "\n" + gen_time + " - Inference Steps: " + str(num_inference_steps) + " Guidance Scale: " \
+ str(guidance_scale) + " Width: " + str(width) + " Height: " + str(height)
        with open('./'+log_folder+'/prompts.txt', 'a+', encoding="utf-16") as f:
            f.write(log_info)
        if seed == "" or seed == None:
            rng = np.random.default_rng()
            seed = rng.integers(np.iinfo(np.uint32).max)
        else:
            try:
                seed = int(seed) & np.iinfo(np.uint32).max
            except ValueError:
                seed = hash(seed) & np.iinfo(np.uint32).max
        seeds = np.array([seed], dtype=np.uint32)
        if int(variations) > 1:
            seed_seq = np.random.SeedSequence(seed)
            seeds = np.concatenate((seeds, seed_seq.generate_state(int(variations) - 1)))
        if reload == True:
            del pipe
            gc.collect()
            model, pipe, sched_txt, sched = loadPipe(model, sched, sched_txt)
            reload = False
        os.system('cls')
        print('Stable Diffusion Onnx DirectML (' + model + ' - ' + sched_txt + ')\nText to Img\n')
        for i in range(int(variations)):
            print(str(i+1) + "/" + str(variations))
            try:
                txt_to_img(str(prompt), str(negative_prompt), int(num_inference_steps), float(guidance_scale), int(width), int(height), int(seeds[i]))
            except KeyboardInterrupt:
                gen_time = time.strftime("%m%d%Y-%H%M%S")
                log_info = "\n" + gen_time + " - Error: Keyboard Interrupt"
                log_info += "\n--------------------------------------------------"
                with open('./'+log_folder+'/prompts.txt', 'a+', encoding="utf-16") as f:
                    f.write(log_info)
                call_quit("CTRL+C Pressed, Script Ended")
            except Exception as e:
                error = [str(e),True]
                gen_time = time.strftime("%m%d%Y-%H%M%S")
                log_info = "\n" + gen_time + " - Error: " + str(e)
                log_info += "\n" + gen_time + " - " + traceback.format_exc()
                with open('./'+log_folder+'/prompts.txt', 'a+', encoding="utf-16") as f:
                    f.write(log_info)
                break
        log_info = "\n--------------------------------------------------"
        with open('./'+log_folder+'/prompts.txt', 'a+', encoding="utf-16") as f:
            f.write(log_info)
        prompt = None
        variations = None
        os.system('cls')
        print('Stable Diffusion Onnx DirectML (' + model + ' - ' + sched_txt + ')\nText to Img\n')
        if error[1] == True:
            print("Image Generation Failed\nError: " + error[0] + "\nSee './" + log_folder + "/prompts.txt' for more info\n")
            error = ["", False]
        change_model = ""
        while change_model == "":
            change_model = input('Change Model? (y/n) or (q to quit): ')
            if change_model == "y" or change_model == "Y":
                del pipe
                model, pipe, sched_txt, sched = loadPipe()
            elif change_model == "q":
                call_quit("Quit Called, Script Ended")
            elif change_model == "n" or change_model == "N":
                change_sched = ""
                while change_sched == "":
                    change_sched = input('Change Scheduler? (y/n) or (q to quit): ')
                    if change_sched == "y" or change_sched == "Y":
                        sched_txt, sched = choose_scheduler(model)
                        if type(pipe.scheduler) is not type(sched):
                            pipe.scheduler = sched
                    elif change_sched == "q":
                        call_quit("Quit Called, Script Ended")
            else:
                change_model = ""
    else:
        call_quit("Quit Called, Script Ended")
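
The console prompts above each fall back to a default when the input is not usable. One way to keep that logic in one place is a small set of parsing helpers, sketched below. The helper names are hypothetical and not part of the script. Two facts motivate them: `str.isnumeric()` returns False for a string such as "7.5", and the diffusers Stable Diffusion pipelines reject a height or width that is not divisible by 8.

```python
def parse_int(text, default):
    """Return `text` as an int, or `default` when it is empty or invalid."""
    try:
        return int(text.strip())
    except ValueError:
        return default


def parse_float(text, default):
    """Return `text` as a float, or `default` when it is empty or invalid."""
    try:
        return float(text.strip())
    except ValueError:
        return default


def parse_dimension(text, default=512, multiple=8):
    """Parse a width or height and round it down to a multiple of `multiple`."""
    value = parse_int(text, default)
    if value < multiple:
        return default
    return value - (value % multiple)


print(parse_int("", 50))          # empty input falls back to 50
print(parse_float("7.5", 7.5))    # decimal input is kept as 7.5
print(parse_dimension("700"))     # rounded down to 696
```

With helpers like these, a prompt becomes a single line, for example `width = parse_dimension(input('Please Enter Width: '))`.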

Output

prompts.txt

10232022-233730 - Model: ./stable_diffusion_onnx
10232022-233730 - Prompt: cat
10232022-233730 - Neg_Prompt: dog
10232022-233730 - Inference Steps: 50 Guidance Scale: 7.5 Width: 512 Height: 512
10232022-233730 - Seed: 22220167420300 - Gen Time: 250.15623688697815s
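
The log can also be read back, for example to collect seeds and generation times. The sketch below assumes the line layout shown above and the UTF-16 encoding the script uses when writing prompts.txt; the function names and the default path are illustrative.

```python
import re
from datetime import datetime

# Matches lines such as:
# 10232022-233730 - Seed: 22220167420300 - Gen Time: 250.15623688697815s
SEED_LINE = re.compile(r"^(\d{8}-\d{6}) - Seed: (\d+) - Gen Time: ([\d.]+)s$")


def parse_seed_lines(lines):
    """Yield (timestamp, seed, seconds) for each 'Seed / Gen Time' line."""
    for line in lines:
        match = SEED_LINE.match(line.strip())
        if match:
            stamp, seed, seconds = match.groups()
            yield datetime.strptime(stamp, "%m%d%Y-%H%M%S"), int(seed), float(seconds)


def read_log(path="complete/prompts.txt"):
    """Read the log file written by the script (UTF-16 encoded)."""
    with open(path, encoding="utf-16") as f:
        return list(parse_seed_lines(f))


sample = [
    "10232022-233730 - Prompt: cat",
    "10232022-233730 - Seed: 22220167420300 - Gen Time: 250.15623688697815s",
]
for stamp, seed, seconds in parse_seed_lines(sample):
    print(stamp.isoformat(), seed, round(seconds, 1))
```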

Convert Stable Diffusion model to ONNX format

Some models are not available in ONNX format and will need to be converted.

Install wget for Windows

  1. Download wget for Windows and install the package.
  2. Copy the wget.exe file into your C:\Windows\System32 folder.

Convert Original Stable Diffusion to Diffusers (Ckpt File)

Notes:

  • Change --checkpoint_path="./model.ckpt" to match the ckpt file to convert
  • Change --dump_path="./model_diffusers" to the output folder location to use
  • You will need to run Convert Stable Diffusion Checkpoint to Onnx (see below) to use the model
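
The two options named in the notes can be passed to the conversion script from Python as well. This is a sketch that assumes the script is `convert_original_stable_diffusion_to_diffusers.py` from the scripts folder of the huggingface/diffusers repository and that it has already been downloaded into the current folder; the helper name is hypothetical.

```python
import sys

# Assumption: this file was downloaded from the diffusers repository's
# scripts folder into the current working directory.
CKPT_SCRIPT = "convert_original_stable_diffusion_to_diffusers.py"


def ckpt_to_diffusers_command(checkpoint_path="./model.ckpt",
                              dump_path="./model_diffusers"):
    """Build the argument list for the ckpt -> diffusers conversion."""
    return [
        sys.executable,
        CKPT_SCRIPT,
        f"--checkpoint_path={checkpoint_path}",
        f"--dump_path={dump_path}",
    ]


command = ckpt_to_diffusers_command()
print(" ".join(command))
# To run the conversion: import subprocess; subprocess.run(command, check=True)
```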

Convert Stable Diffusion Checkpoint to Onnx
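
A minimal sketch of this step, assuming the script in question is `convert_stable_diffusion_checkpoint_to_onnx.py` from the scripts folder of the huggingface/diffusers repository, called with its `--model_path` and `--output_path` options. The output folder name `./model_onnx` and the helper name are placeholders chosen for illustration.

```python
import sys

# Assumption: this file was downloaded from the diffusers repository's
# scripts folder into the current working directory.
ONNX_SCRIPT = "convert_stable_diffusion_checkpoint_to_onnx.py"


def diffusers_to_onnx_command(model_path="./model_diffusers",
                              output_path="./model_onnx"):
    """Build the argument list for the diffusers -> ONNX conversion."""
    return [
        sys.executable,
        ONNX_SCRIPT,
        f"--model_path={model_path}",
        f"--output_path={output_path}",
    ]


onnx_command = diffusers_to_onnx_command()
print(" ".join(onnx_command))
# To run the conversion: import subprocess; subprocess.run(onnx_command, check=True)
```

The resulting folder can then be passed to `OnnxStableDiffusionPipeline.from_pretrained` in place of ./stable_diffusion_onnx.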

Additional Tools (Optional)

Upscaling

Real-ESRGAN

https://github.com/xinntao/Real-ESRGAN#portable-executable-files-ncnn

Usage: realesrgan-ncnn-vulkan.exe -i infile -o outfile [options]...

  -h                   show this help
  -i input-path        input image path (jpg/png/webp) or directory
  -o output-path       output image path (jpg/png/webp) or directory
  -s scale             upscale ratio (can be 2, 3, 4. default=4)
  -t tile-size         tile size (>=32/0=auto, default=0) can be 0,0,0 for multi-gpu
  -m model-path        folder path to the pre-trained models. default=models
  -n model-name        model name (default=realesr-animevideov3, can be realesr-animevideov3 | realesrgan-x4plus | realesrgan-x4plus-anime | realesrnet-x4plus)
  -g gpu-id            gpu device to use (default=auto) can be 0,1,2 for multi-gpu
  -j load:proc:save    thread count for load/proc/save (default=1:2:2) can be 1:2,2,2:2 for multi-gpu
  -x                   enable tta mode
  -f format            output image format (jpg/png/webp, default=ext/png)
  -v                   verbose output
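
Upscaling can be chained after image generation by calling the executable from Python. The sketch below only uses the -i, -o, -s, and -n options from the usage text above; the helper name and the example file names are hypothetical, and the executable is assumed to be on the PATH or in the current folder.

```python
def realesrgan_command(infile, outfile, scale=4,
                       model_name="realesrgan-x4plus",
                       exe="realesrgan-ncnn-vulkan.exe"):
    """Build the argument list for one Real-ESRGAN upscaling run."""
    if scale not in (2, 3, 4):
        raise ValueError("scale must be 2, 3 or 4")
    return [exe, "-i", infile, "-o", outfile, "-s", str(scale), "-n", model_name]


upscale = realesrgan_command("astronaut_rides_horse.png",
                             "astronaut_rides_horse_x4.png")
print(" ".join(upscale))
# To run it: import subprocess; subprocess.run(upscale, check=True)
```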

RealSR ncnn Vulkan

https://github.com/nihui/realsr-ncnn-vulkan

Usage: realsr-ncnn-vulkan -i infile -o outfile [options]...

  -h                   show this help
  -v                   verbose output
  -i input-path        input image path (jpg/png/webp) or directory
  -o output-path       output image path (jpg/png/webp) or directory
  -s scale             upscale ratio (4, default=4)
  -t tile-size         tile size (>=32/0=auto, default=0) can be 0,0,0 for multi-gpu
  -m model-path        realsr model path (default=models-DF2K_JPEG)
  -g gpu-id            gpu device to use (-1=cpu, default=0) can be 0,1,2 for multi-gpu
  -j load:proc:save    thread count for load/proc/save (default=1:2:2) can be 1:2,2,2:2 for multi-gpu
  -x                   enable tta mode
  -f format            output image format (jpg/png/webp, default=ext/png)

SRMD ncnn Vulkan

https://github.com/nihui/srmd-ncnn-vulkan

Usage: srmd-ncnn-vulkan -i infile -o outfile [options]...

  -h                   show this help
  -v                   verbose output
  -i input-path        input image path (jpg/png/webp) or directory
  -o output-path       output image path (jpg/png/webp) or directory
  -n noise-level       denoise level (-1/0/1/2/3/4/5/6/7/8/9/10, default=3)
  -s scale             upscale ratio (2/3/4, default=2)
  -t tile-size         tile size (>=32/0=auto, default=0) can be 0,0,0 for multi-gpu
  -m model-path        srmd model path (default=models-srmd)
  -g gpu-id            gpu device to use (default=0) can be 0,1,2 for multi-gpu
  -j load:proc:save    thread count for load/proc/save (default=1:2:2) can be 1:2,2,2:2 for multi-gpu
  -x                   enable tta mode
  -f format            output image format (jpg/png/webp, default=ext/png)

Image Editing

ImageMagick

Use ImageMagick® to create, edit, compose, or convert digital images. It can read and write images in a variety of formats (over 200) including PNG, JPEG, GIF, WebP, HEIC, SVG, PDF, DPX, EXR and TIFF. ImageMagick can resize, flip, mirror, rotate, distort, shear and transform images, adjust image colors, apply various special effects, or draw text, lines, polygons, ellipses and Bézier curves.

https://imagemagick.org/script/index.php

Photopea

Photopea is a web-based photo and graphics editor by Ivan Kuckir. It is used for image editing, making illustrations, web design or converting between different image formats. Photopea is advertising-supported software. It is compatible with all modern web browsers, including Opera, Edge, Chrome, and Firefox. The app is compatible with raster and vector graphics, such as Photoshop’s PSD as well as JPEG, PNG, DNG, GIF, SVG, PDF and other image file formats. While browser-based, Photopea stores all files locally, and does not upload any data to a server.

https://www.photopea.com/

FAQs

How can I clear cached models?

huggingface-cli scan-cache --dir ~/.cache/huggingface/diffusers
huggingface-cli delete-cache --dir ~/.cache/huggingface/diffusers

Can I download and install ort-nightly-directml instead of onnxruntime-directml?

Yes, and it can reduce image generation times.

You can download the nightly onnxruntime-directml release from the link below:

https://aiinfra.visualstudio.com/PublicPackages/_artifacts/feed/ORT-Nightly/PyPI/ort-nightly-directml/versions/

Run python --version to find out which whl file to download.

Which file should I download?

  • If you are on Python 3.7, download the file that ends with **-cp37-cp37m-win_amd64.whl.
  • If you are on Python 3.8, download the file that ends with **-cp38-cp38-win_amd64.whl.
  • If you are on Python 3.9, download the file that ends with **-cp39-cp39-win_amd64.whl.
  • And so on for later versions.
pip install replace_with_the_file_you_downloaded.whl --force-reinstall
pip install protobuf==3.20.1
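
The expected file name ending can also be derived from the running interpreter. This sketch relies on the standard wheel naming scheme, in which CPython 3.7 and earlier carry an "m" suffix in the ABI tag and 3.8 and later do not; the function name is illustrative.

```python
import sys


def wheel_suffix(major=sys.version_info.major, minor=sys.version_info.minor):
    """Return the expected ending of the wheel file name for a CPython version."""
    tag = f"cp{major}{minor}"
    # The "m" ABI suffix was dropped starting with CPython 3.8.
    abi = tag + "m" if (major, minor) <= (3, 7) else tag
    return f"-{tag}-{abi}-win_amd64.whl"


print("Look for a file ending with:", wheel_suffix())
```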

How do you install or use different models?

Instructions for converting models to the Onnx format are available at: https://gist.github.com/averad/256c507baa3dcc9464203dc14610d674#convert-stable-diffusion-model-to-onnx-format

If the model you want to use is already in the Onnx format, you need to adjust the pipe to call the model you want to use:

Example:

pipe = OnnxStableDiffusionPipeline.from_pretrained("./stable_diffusion_onnx", provider="DmlExecutionProvider", safety_checker=None)

In the above pipe example, you would change ./stable_diffusion_onnx to match the model folder you want to use.

If you want to load an Onnx Model directly from the Huggingface website and cache it in your virtual environment (sd_env), adjust the pipe as follows:

Example:

pipe = OnnxStableDiffusionPipeline.from_pretrained("lambdalabs/sd-pokemon-diffusers", revision="onnx", provider="DmlExecutionProvider", safety_checker=None)

Note: Loading models directly from the Hugging Face website requires running huggingface-cli login and entering the requested token information.

fdwr commented Jul 11, 2023

PyTorch 2 did not work for me. So for anyone else seeing a bewildering torch.onnx.errors.UnsupportedOperatorError: Exporting the operator 'aten::scaled_dot_product_attention' to ONNX opset version 14 is not supported, use PyTorch 1.13, like OLive does in its requirements.txt.

Failed for me:

torch==2.0.1
diffusers==0.18.1
onnx==1.14.0
onnxruntime==1.15.0
onnxruntime-directml==1.15.0
numpy==1.24.3

Worked for me:

torch==1.13.1
diffusers==0.17.1   # 0.18.1 would probably work too, as the PyTorch version is the bigger factor
onnx==1.14.0
onnxruntime==1.15.0
onnxruntime-directml==1.15.0
numpy==1.21.6
