sftblw/fix_automatic111_merged_ckpt.md

## fix_automatic111_merged_ckpt.md

      
## Patching convert_original_stable_diffusion_to_diffusers.py to fix merged models

To fix a broken model (mainly weights merged with AUTOMATIC1111's UI, which has no license) so that it works in Invoke AI, one can take the CKPT -> HF/diffusers -> CKPT route.

### Environment setup

OmegaConf, PyTorch Lightning, and the usual Hugging Face diffusers environment.
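
Before running the conversion scripts, it may help to confirm that these packages are importable. This is only a minimal sketch: the import names (`torch`, `omegaconf`, `pytorch_lightning`, `diffusers`) are assumed to be the usual ones, and the helper `missing_packages` is hypothetical.

```python
from importlib.util import find_spec

# Import names assumed for the packages listed above.
REQUIRED = ("torch", "omegaconf", "pytorch_lightning", "diffusers")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only means the packages can be located; it says nothing about whether their versions match the scripts.
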

### Get scripts

I used the scripts at commit cd91fc06fe9513864fca6a57953ca85a7ae7836e.

- CKPT -> HF/diffusers: convert_original_stable_diffusion_to_diffusers.py
- HF/diffusers -> CKPT: convert_diffusers_to_original_stable_diffusion.py

### why it fails

The conversion fails on the first attempt.

#### why it fails: main checkpoint

While debugging, I found that a loaded checkpoint (from `torch.load(CKPT_PATH)`) is essentially a Python dictionary of `torch.Tensor`s. (I guess it's a Python pickle file, i.e. a serialized Python object.)

So the main reason is that the merged checkpoint has no `'state_dict'` key.

    # My merged weight looks like this:
    merged_weight = {
      'model.blahblah.key': torch.tensor(...)
    }

    # Another normal working file is like this:
    normal_weight = {
      'state_dict': {
        'model.blahblah.key': torch.tensor(...)
      }
    }

Essentially, the HF script discards everything other than `'state_dict'` (after reading `'global_step'`), so it can easily be adjusted to use the dictionary it is given.

    # HF's script
    global_step = checkpoint["global_step"]
    checkpoint = checkpoint["state_dict"]
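
To see which of the two layouts a given file has, the loaded dictionary can be summarized before conversion. The sketch below works on any plain dict, so it can be tried without real weights; `describe_checkpoint` is a hypothetical helper, and the assumption is that weight keys are dotted strings such as `'model.blahblah.key'`.

```python
from collections import Counter


def describe_checkpoint(checkpoint):
    """Summarize the layout of a dict such as the one returned by torch.load(CKPT_PATH)."""
    wrapped = "state_dict" in checkpoint
    weights = checkpoint["state_dict"] if wrapped else checkpoint
    # Count weight keys by their first dotted component,
    # e.g. "model" or "first_stage_model".
    prefixes = Counter(key.split(".", 1)[0] for key in weights)
    return {
        "layout": "wrapped" if wrapped else "bare",
        "has_global_step": "global_step" in checkpoint,
        "prefixes": dict(prefixes),
    }
```

A `"bare"` layout or a missing `global_step` is exactly the case in which the two lines of HF's script raise a `KeyError`.
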

#### why it fails: VAE

The VAE lives under the `"first_stage_model."` prefix of the main ckpt, according to `def convert_ldm_vae_checkpoint(checkpoint, config):` in HF's script.

Some models provide external VAE checkpoints, but those ckpts' keys do not start with `"first_stage_model."`.

If you do not provide a VAE, the script uses whatever VAE already exists in the main ckpt. This is critical for some models, especially ones similar to NovelAI's.

An external VAE's ckpt is also loadable with `torch.load(VAE_CKPT_PATH)`.
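
Before merging, it can be useful to compare the external VAE's keys with the VAE already embedded in the main ckpt. The following is a sketch on plain dicts; `compare_vae_keys` is a hypothetical helper, and the key names in the usage note are illustrative only.

```python
VAE_PREFIX = "first_stage_model."


def compare_vae_keys(main_state_dict, vae_state_dict):
    """Compare an external VAE's keys with the VAE embedded in the main checkpoint."""
    # Strip the prefix from the embedded VAE keys so both sides use the same names.
    embedded = {
        key[len(VAE_PREFIX):] for key in main_state_dict if key.startswith(VAE_PREFIX)
    }
    external = set(vae_state_dict)
    return {
        "replaced": sorted(embedded & external),       # would be overwritten by the external VAE
        "only_embedded": sorted(embedded - external),  # would keep the main ckpt's weights
        "only_external": sorted(external - embedded),  # would be added as new keys
    }
```

For example, with a main state dict containing `first_stage_model.encoder.conv_in.weight` and an external VAE containing `encoder.conv_in.weight`, that key shows up under `"replaced"`.
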

### modifying convert_original_stable_diffusion_to_diffusers.py

So, inserting two things fixes the script.

#### add an argument

        parser.add_argument(
            "--vae_checkpoint_path", default=None, type=str, required=False, help="Path to an external VAE checkpoint to merge in."
        )

#### after loading checkpoint

        checkpoint = torch.load(args.checkpoint_path) # add below this

        # +++++++++++++++++++++
        # dummy wrapper: a merged checkpoint is a bare state dict without 'state_dict'
        if 'state_dict' not in checkpoint.keys():
            checkpoint = {
                'state_dict': checkpoint
            }

        # the script looks for the `global_step` for discriminating between SD v1.x and v2.x
        if 'global_step' not in checkpoint.keys():
            checkpoint['global_step'] = 678999

        # (these two lines already exist in HF's script)
        global_step = checkpoint["global_step"]
        checkpoint = checkpoint["state_dict"]

        if args.vae_checkpoint_path is not None:
            vae_checkpoint = torch.load(args.vae_checkpoint_path)

            # an external VAE ckpt may or may not be wrapped in 'state_dict'
            if 'state_dict' in vae_checkpoint.keys():
                vae_checkpoint = vae_checkpoint['state_dict']

            # (updated)
            # overwrite the VAE weights in the main ckpt with the external ones
            for k, v in vae_checkpoint.items():
                vae_key = "first_stage_model." + k
                checkpoint[vae_key] = v
            vae_checkpoint = None
        # +++++++++++++++++++++
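
The same logic can be sketched as a standalone function that works on plain dictionaries, which makes it easy to check without loading real weights. `normalize_checkpoint` is a hypothetical helper and not part of HF's script; unlike the patch, it returns a new dict instead of modifying its input.

```python
VAE_PREFIX = "first_stage_model."


def normalize_checkpoint(checkpoint, vae_checkpoint=None, default_global_step=678999):
    """Return (global_step, state_dict) for a wrapped or bare checkpoint dict.

    If `vae_checkpoint` is given, its weights overwrite the embedded VAE
    under the "first_stage_model." prefix.
    """
    # A merged checkpoint may be a bare state dict; wrap it first.
    if "state_dict" not in checkpoint:
        checkpoint = {"state_dict": checkpoint}

    global_step = checkpoint.get("global_step", default_global_step)
    # Shallow copy: the values (tensors) are shared, not duplicated.
    state_dict = dict(checkpoint["state_dict"])

    if vae_checkpoint is not None:
        # An external VAE ckpt may or may not be wrapped in 'state_dict'.
        vae_state_dict = vae_checkpoint.get("state_dict", vae_checkpoint)
        for key, value in vae_state_dict.items():
            state_dict[VAE_PREFIX + key] = value

    return global_step, state_dict
```

With real files, the inputs would be the results of `torch.load(...)` for the main and VAE checkpoints.
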

### HF diffusers -> original SD checkpoint

No changes were needed, at least for me.

### executing

For reference, my execution looked like this:

    # CKPT -> HF
    python scripts/convert_original_stable_diffusion_to_diffusers.py \
      --checkpoint_path ./models_ckpt/mymodel.ckpt \
      --vae_checkpoint_path ./models_ckpt/mymodel_vae.ckpt \
      --dump_path ./models_hf/hf_model_mymodel \
      --scheduler_type euler-ancestral

    # HF -> CKPT
    python scripts/convert_diffusers_to_original_stable_diffusion.py \
      --model_path ./models_hf/hf_model_mymodel \
      --checkpoint_path ./models_ckpt/mymodel_modified.ckpt
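
If you prefer to drive both steps from Python, the two commands can be assembled as argument lists and passed to `subprocess.run`. This is a sketch that assumes the script paths and flags shown above (including the `--vae_checkpoint_path` flag added by the patch); all function names are hypothetical.

```python
import subprocess
import sys


def ckpt_to_hf_command(checkpoint_path, dump_path, vae_checkpoint_path=None,
                       scheduler_type="euler-ancestral"):
    """Build the CKPT -> HF command; the VAE flag is added only when a path is given."""
    cmd = [
        sys.executable, "scripts/convert_original_stable_diffusion_to_diffusers.py",
        "--checkpoint_path", checkpoint_path,
        "--dump_path", dump_path,
        "--scheduler_type", scheduler_type,
    ]
    if vae_checkpoint_path is not None:
        cmd += ["--vae_checkpoint_path", vae_checkpoint_path]
    return cmd


def hf_to_ckpt_command(model_path, checkpoint_path):
    """Build the HF -> CKPT command."""
    return [
        sys.executable, "scripts/convert_diffusers_to_original_stable_diffusion.py",
        "--model_path", model_path,
        "--checkpoint_path", checkpoint_path,
    ]


def round_trip(ckpt_in, hf_dir, ckpt_out, vae_checkpoint_path=None):
    """Run both conversions in order; raises CalledProcessError if either step fails."""
    subprocess.run(ckpt_to_hf_command(ckpt_in, hf_dir, vae_checkpoint_path), check=True)
    subprocess.run(hf_to_ckpt_command(hf_dir, ckpt_out), check=True)
```

Calling `round_trip("./models_ckpt/mymodel.ckpt", "./models_hf/hf_model_mymodel", "./models_ckpt/mymodel_modified.ckpt", "./models_ckpt/mymodel_vae.ckpt")` from the repository root would mirror the two shell commands.
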

### etc

#### why no PR

=> I'm too lazy

#### investigation social thread (ko-KR)

# #


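#### windows fail on wget on CKPT -> HF

On Windows, the CKPT -> HF step fails at the point where the script calls `wget`. You'll spot it easily from the error. Downloading the config file by hand and passing it with `--original_config_file` should avoid that call.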


https://gist.github.com/sftblw/ba0e463a8a8ff5d22bb871279ab7dd04