@geocine
Last active October 7, 2023 14:35
Dreambooth Training on Windows (no WSL), 12GB VRAM

Credits

kohya_ss, ShivamShrirao, bmaltais, C43H66N12O12S2, DeXtmL, nitrosocke

Based on my testing, this setup consumes around 11425MiB of the 12282MiB of VRAM available, provided all other GPU-consuming applications are closed.
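
If you want to verify that headroom on your own machine before training, nvidia-smi (installed with the NVIDIA driver) shows current VRAM usage per process; the -l 1 flag simply refreshes the readout every second:

rem refresh GPU memory usage every second; press Ctrl+C to stop
nvidia-smi -l 1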

Installation

You must have git installed to continue. Clone the kohya_ss repo

git clone https://github.com/bmaltais/kohya_ss.git

We need the bitsandbytes_windows folder from inside this repo, which is why we cloned it into the current directory. Leave it there for now and clone the ShivamShrirao repo as well

git clone https://github.com/ShivamShrirao/diffusers.git 

You should now have two folders in your directory: kohya_ss and diffusers

Copy the bitsandbytes_windows folder from the kohya_ss folder into the diffusers/examples/dreambooth folder. You should now have a folder structure like this

└── diffusers
    └── examples
        └── dreambooth
            ├── concepts_list.json
            ├── DreamBooth_Stable_Diffusion.ipynb
            ├── launch.sh
            ├── launch_inpaint.sh
            ├── README.md
            ├── requirements.txt
            ├── requirements_flax.txt
            ├── train_dreambooth.py
            ├── train_dreambooth_flax.py
            ├── train_dreambooth_inpainting.py
            └── bitsandbytes_windows
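
If you prefer to do that copy from the command line, here is a minimal sketch, assuming both repos were cloned side by side into the current directory:

rem /E copies all subfolders and files, /I treats the destination as a folder
xcopy /E /I kohya_ss\bitsandbytes_windows diffusers\examples\dreambooth\bitsandbytes_windows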

Replace the contents of requirements.txt with the following

accelerate
transformers>=4.21.0
ftfy
albumentations
tensorboard
modelcards

You must have Python 3.10.6 installed on your system. Open Command Prompt, navigate to the diffusers/examples/dreambooth folder, and run the following commands

python -m venv --system-site-packages venv
.\venv\Scripts\activate
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
pip install git+https://github.com/ShivamShrirao/diffusers.git
pip install -U -r requirements.txt
pip install OmegaConf
pip install pytorch_lightning
pip install -U -I --no-deps https://github.com/C43H66N12O12S2/stable-diffusion-webui/releases/download/f/xformers-0.0.14.dev0-cp310-cp310-win_amd64.whl
pip install bitsandbytes==0.35.0
copy .\bitsandbytes_windows\*.dll .\venv\Lib\site-packages\bitsandbytes\
copy .\bitsandbytes_windows\cextension.py .\venv\Lib\site-packages\bitsandbytes\cextension.py
copy .\bitsandbytes_windows\main.py .\venv\Lib\site-packages\bitsandbytes\cuda_setup\main.py
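
Before continuing, it is worth checking that the CUDA build of PyTorch actually loads inside the venv. A quick sanity check, run with the venv still activated (it should print the torch version followed by True):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"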

Type accelerate config and answer the following prompts in this order:

accelerate config

NOTE: Redirects are currently not supported in Windows or MacOs.
In which compute environment are you running? ([0] This machine, [1] AWS (Amazon SageMaker)): 0
Which type of machine are you using? ([0] No distributed training, [1] multi-CPU, [2] multi-GPU, [3] TPU [4] MPS): 0
Do you want to run your training on CPU only (even if a GPU is available)? [yes/NO]: NO
Do you want to use DeepSpeed? [yes/NO]: NO
What GPU(s) (by id) should be used for training on this machine as a comma-seperated list? [all]: all
Do you wish to use FP16 or BF16 (mixed precision)? [NO/fp16/bf16]: fp16

Note that all is case sensitive. Type it exactly as shown.
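
If you want to review your answers later, accelerate saves them to a default_config.yaml under your user profile. The exact location can vary by accelerate version, but on a typical install this shows it:

rem print the saved accelerate configuration (path may differ by version)
type %USERPROFILE%\.cache\huggingface\accelerate\default_config.yaml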

Create a file called launch.bat with the following contents

accelerate launch --num_cpu_threads_per_process 6 train_dreambooth.py ^
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" ^
  --pretrained_vae_name_or_path="stabilityai/sd-vae-ft-mse" ^
  --output_dir="./output" ^
  --with_prior_preservation --prior_loss_weight=1.0 ^
  --seed=494481440 ^
  --resolution=512 ^
  --train_batch_size=1 ^
  --train_text_encoder ^
  --mixed_precision="fp16" ^
  --use_8bit_adam ^
  --gradient_checkpointing ^
  --gradient_accumulation_steps=1 ^
  --learning_rate=1e-6 ^
  --lr_scheduler="constant" ^
  --lr_warmup_steps=0 ^
  --num_class_images=200 ^
  --sample_batch_size=4 ^
  --max_train_steps=800 ^
  --save_interval=200 ^
  --save_sample_prompt="photograph zwx person" ^
  --concepts_list="concepts_list.json" ^
  --n_save_sample=5 ^
  --save_min_steps=800
pause

Training

Edit concepts_list.json and add your own concepts.

[
    {
        "instance_prompt":      "photo of zwx person",
        "class_prompt":         "photo of person",
        "instance_data_dir":    "../../../data/zwx",
        "class_data_dir":       "../../../data/person"
    }
]

Note: paths should NOT use backslashes (\); use forward slashes (/) instead.

Place your training images in instance_data_dir. The class_data_dir images will be generated automatically based on the class you are training. For a more detailed explanation, see the original notebook.
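
For the example concepts_list.json above, the paths resolve three levels up from diffusers/examples/dreambooth, i.e. next to the two cloned repos. A minimal sketch of preparing the instance folder (the directory names are just the ones used in the example; the class folder is created and filled automatically when class images are generated):

rem run from diffusers\examples\dreambooth
mkdir ..\..\..\data\zwx
rem copy your training photos into ..\..\..\data\zwx before launching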

Open Command Prompt (not PowerShell), activate the venv, and run the script

.\venv\Scripts\activate
launch.bat

Converting Diffusers to CKPT

In the diffusers repo, go to the diffusers/scripts folder. Open a terminal there and run the following command

python convert_diffusers_to_original_stable_diffusion.py --model_path <folder input path> --checkpoint_path <model output path> --half
Name             Description
model_path       The folder containing the diffusers model output. In this case it would be ./output, as defined by the --output_dir parameter.
checkpoint_path  The path where you want to save the .ckpt file. This path should include the filename, e.g. /models/filename.ckpt
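
As a concrete example, assuming the launch.bat defaults above (output written under diffusers/examples/dreambooth/output) and running from the scripts folder with the same venv activated, the call might look like this. The paths are illustrative only, and depending on the training script version the final weights may sit in a step-numbered subfolder such as output/800:

rem run from diffusers\scripts with the venv activated; adjust paths to your setup
python convert_diffusers_to_original_stable_diffusion.py --model_path ..\examples\dreambooth\output --checkpoint_path ..\examples\dreambooth\model.ckpt --half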

Where to go from here?

https://github.com/nitrosocke/dreambooth-training-guide

@kalkal11 commented Nov 6, 2022

hey, I have kohya running no probs already and wanted to try this but ran into problems. After some head scratching I realised the terminal wasn't liking the copy command used for bitsandbytes.

Changing to the following fixed the problem and it's all running now -

pip install bitsandbytes==0.35.0
cp .\bitsandbytes_windows\*.dll .\venv\Lib\site-packages\bitsandbytes\
cp .\bitsandbytes_windows\cextension.py .\venv\Lib\site-packages\bitsandbytes\cextension.py
cp .\bitsandbytes_windows\main.py .\venv\Lib\site-packages\bitsandbytes\cuda_setup\main.py

Thanks for the guide :)

@if-ai commented Nov 6, 2022

Needs this from the bmaltais repo:
  1. Open an administrator PowerShell window
  2. Type Set-ExecutionPolicy Unrestricted and answer A
  3. Close the admin PowerShell window

@kalkal11 commented Nov 6, 2022

true @jvoxel - I'd already done this due to the kohya install in my case

@geocine (author) commented Nov 6, 2022

@caacoe @jvoxel, thanks both. This guide does not mention PowerShell anywhere and just uses plain old Command Prompt, to avoid making it too complicated for others, since they would need to make the ExecutionPolicy change in PowerShell.

@ReflectionRip commented Nov 7, 2022

It is not putting my bitsandbytes in venv. Instead it is putting it in %AppData%
Never mind... Don't use bash.
It also looks like if you already have stuff installed in %AppData% it will not install it in venv. Not sure if that is a good thing or not.

@geocine (author) commented Nov 7, 2022

Yes @ReflectionRip, as stated above, this guide assumes the user only uses Command Prompt. Also, you are free to use more powerful environments like conda; this guide is meant for people who haven't touched Python.

@kalkal11 commented Nov 7, 2022

Yes @ReflectionRip, as stated above, this guide assumes the user only uses Command Prompt.

@geocine I ran into further issues later; I'd strongly recommend noting this at the start of the guide. I won't pretend to be an expert, I had no idea I was running into issues because of using PowerShell!

Note there is a line that says "Open the terminal" - the default CLI for the terminal in Windows 11 is PowerShell.

Thanks for your effort in putting this together again.

@BraganKimaru commented

Hello, I tried following this guide, however in the training step when executing the launch.bat I'm getting this error, do you have any idea on how to solve it? Thanks in advance

Traceback (most recent call last):
  File "C:\Users\braga\miniconda3\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\braga\miniconda3\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\braga\diffusers\examples\dreambooth\venv\Scripts\accelerate.exe\__main__.py", line 4, in <module>
  File "C:\Users\braga\diffusers\examples\dreambooth\venv\lib\site-packages\accelerate\__init__.py", line 7, in <module>
    from .accelerator import Accelerator
  File "C:\Users\braga\diffusers\examples\dreambooth\venv\lib\site-packages\accelerate\accelerator.py", line 25, in <module>
    import torch
  File "C:\Users\braga\diffusers\examples\dreambooth\venv\lib\site-packages\torch\__init__.py", line 140, in <module>
    raise err
OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\braga\diffusers\examples\dreambooth\venv\lib\site-packages\torch\lib\torch_python.dll" or one of its dependencies.

@geocine (author) commented Dec 2, 2022

Seems like you did not install PyTorch in the venv; maybe you only installed it in conda.

@wcarletsdrive commented

Hey Geocine, if we reinstall this and since kohya is updated for 2.0 support, will we not have cuda errors if we use 512 base 2.0 model? Because I am getting cuda errors on the changes kohya made before his recent ones for 2.0

@NeoAnthropocene commented Dec 7, 2022

Hey Geocine, if we reinstall this and since kohya is updated for 2.0 support, will we not have cuda errors if we use 512 base 2.0 model? Because I am getting cuda errors on the changes kohya made before his recent ones for 2.0

With ShivamShrirao's Dreambooth repo I'm getting the below CUDA error with the "stable-diffusion-2-1-base" (it's 512x512 version)
CUDA_LAUNCH_BLOCKING=1

@shadowbroker2000 commented

a1111/dreambooth was broken for me, producing garbage results; this finally has me getting working models.

A few comments

Before running
python -m venv --system-site-packages venv
make sure to go into the diffusers/examples/dreambooth directory.

I also found I needed to run
.\venv\Scripts\activate.bat

Now make sure you have (venv) at the start of the command line, e.g. "(venv) D:\ai\diffusers\examples\dreambooth>", otherwise you will be installing with pip globally.

You may want to copy convert_diffusers_to_original_stable_diffusion.py and convert_original_stable_diffusion_to_diffusers.py, as they are useful for going from diffusers to ckpt or from ckpt back to diffusers.

If you need to convert an existing ckpt file back to diffusers so you can train with it you can use the following command
python convert_original_stable_diffusion_to_diffusers.py --checkpoint_path "d:/ai/sd_models/yourmodel.ckpt" --dump_path "./models/yourmodel_dir"

@geocine (author) commented Jan 22, 2023

I don't have much time to update this, but if you also want the optimizations from cuDNN, follow these steps:

  1. Download https://developer.nvidia.com/compute/cudnn/secure/8.6.0/local_installers/11.8/cudnn-windows-x86_64-8.6.0.163_cuda11-archive.zip
  2. Extract the archive and copy the files inside bin to .\venv\Lib\site-packages\torch\lib, paste and replace (see the sketch below).
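
A minimal sketch of that copy, assuming the archive was extracted into the dreambooth folder and kept its default folder name:

rem /Y overwrites the existing cuDNN DLLs shipped with torch without prompting
copy /Y cudnn-windows-x86_64-8.6.0.163_cuda11-archive\bin\*.dll .\venv\Lib\site-packages\torch\lib\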

That's it, you should see a boost in it/s
