Guide to getting the most out of The Ally's Stable Diffusion + Waifu Diffusion + Leaked NAI merged Model - Last updated 11/10/2022

The Ally's SD+WD+NAI Model Guide

Please note - this guide will no longer be updated here, and is now maintained at https://www.notion.so/theally/The-Ally-s-SD-WD-NAI-Model-Guide-de10c88e81e7456c82245663e2b06f10

1. Downloading the .ckpt Model

The .ckpt model is available for download here;

1.1. Updates!

Why haven't I remixed this model with SD 1.5? Two reasons. One, I never knew the original merge ratio of SD 1.4 and Waifu Diffusion, and two, Voldy/Auto1111's UI no longer supports Sigmoid interpolation merges. It could all be worked out, but... no time.

Note that Danbooru tags DO NOT need underscores in place of spaces, even though I have written them that way below. We found that it makes zero difference; skip 'em.

2. What is this Model?

This model is a checkpoint merge of Stable Diffusion 1.4, Waifu Diffusion 1.2 (ratio unknown), and a Sigmoid interpolation (0.5 strength) of NovelAI's model.
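For anyone curious what a "Sigmoid interpolation" merge means numerically, here is a minimal sketch. It assumes the Sigmoid option was the smoothstep-style remapping of the multiplier that older builds of the web UI offered; the function names and the toy weight lists are made up for illustration and are not the actual merge code or the real checkpoints.

```python
def weighted_sum(theta0, theta1, alpha):
    """Plain linear interpolation between two weights."""
    return (1 - alpha) * theta0 + alpha * theta1


def sigmoid_merge(theta0, theta1, alpha):
    """Assumed 'Sigmoid' mode: remap alpha with a smoothstep curve, then interpolate."""
    alpha = alpha * alpha * (3 - 2 * alpha)
    return theta0 + (theta1 - theta0) * alpha


def merge_weights(model_a, model_b, alpha, interp=sigmoid_merge):
    """Merge two toy 'state dicts' (name -> list of floats), key by key."""
    return {
        key: [interp(a, b, alpha) for a, b in zip(model_a[key], model_b[key])]
        for key in model_a
        if key in model_b
    }


# Hypothetical stand-ins for the SD+WD merge and the NAI checkpoint.
sd_wd = {"layer.weight": [0.0, 1.0, 2.0]}
nai = {"layer.weight": [1.0, 1.0, 4.0]}

merged = merge_weights(sd_wd, nai, alpha=0.5)
print(merged)
```

Under that assumption, a strength of exactly 0.5 remaps to 0.5, so the result matches a plain 50/50 weighted sum; the two modes only diverge at strengths other than 0, 0.5, and 1.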

3. What's so cool about this Model? And what does it do that other models don't?

NovelAI's model is trained on a dataset of images from Danbooru, a 2D Hentai art site (NSFW!). This is significant because Danbooru images are categorized with Danbooru Tags, keywords describing every aspect of an image, including clothing, style, and pose. These Danbooru tags can be referenced in your prompt to great effect. The tags are highly specific and can be combined with the natural flow of a standard Stable Diffusion prompt to fine-tune your image.

Some things this merge does better than the standard SD model are;

  • Natural bare feet
  • Shoes, including high heels and boots
  • Complex poses, including (squatting), (looking_at_viewer), etc.
  • Realism (while trained on anime/hentai images, it is perfectly capable of outputting realistic human faces and bodies)

4. Using Danbooru Tags

There are over 20,000 Danbooru tags, and ~80% of those tested so far have a marked effect when added to a prompt.

Danbooru Tag Search

To use a tag in your prompt, spell it as shown in the tag search. The examples below keep the underscores from the tag search, but as noted in the update in section 1.1, writing spaces instead of underscores works just as well.

Additionally, Danbooru tags benefit from emphasis ( ) in the prompt; they are keywords to enhance specific elements of your image, and should stand out. () adds emphasis to a term and [] decreases it, each by a factor of 1.1. You can stack ()/[] to further increase/decrease the emphasis on a particular keyword.
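To get a feel for how stacked brackets add up, here is a small sketch of the arithmetic, assuming each layer of () multiplies the weight by 1.1 and each layer of [] divides it by 1.1. The `emphasis_weight` helper is hypothetical and much simpler than the web UI's real prompt parser; it only handles a single token wrapped in brackets.

```python
def emphasis_weight(token, factor=1.1):
    """Return (bare_tag, weight) for a token wrapped in stacked () or []."""
    weight = 1.0
    tag = token.strip()
    while len(tag) >= 2:
        if tag[0] == "(" and tag[-1] == ")":
            weight *= factor   # one layer of () -> more emphasis
        elif tag[0] == "[" and tag[-1] == "]":
            weight /= factor   # one layer of [] -> less emphasis
        else:
            break
        tag = tag[1:-1]        # peel off the outer pair and look again
    return tag, round(weight, 4)


for token in ["waterfall", "(light_rays)", "((waterfall))", "[blurry]"]:
    print(token, "->", emphasis_weight(token))
```

So ((waterfall)) ends up at roughly 1.21 times the normal weight, while [blurry] drops to about 0.91.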

5. Show me examples!

First, an image created purely using Danbooru tags;

(1girl), (hair_ribbon), (side_ponytail), (floral_print) (crop_top), (simple_background)

Now, we take a more standard SD prompt, and incorporate the same Danbooru tags to add those specific elements;

hyperrealistic (1girl) portrait of Shakira with a (hair_ribbon) and (side_ponytail) wearing a (floral_print) (crop_top), on a (simple_background), photo realistic, artstation, 4k, award winning, art by greg rutkowski
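If you reuse the same tags across many prompts, both styles above can be assembled from one tag list. This is only a sketch: `format_tag` and the template string are hypothetical helpers for illustration, and the `use_underscores` switch reflects the note in section 1.1 that spaces work as well as underscores.

```python
def format_tag(tag, emphasis=1, use_underscores=True):
    """Wrap a Danbooru tag in `emphasis` layers of parentheses."""
    name = tag if use_underscores else tag.replace("_", " ")
    return "(" * emphasis + name + ")" * emphasis


tags = ["1girl", "hair_ribbon", "side_ponytail",
        "floral_print", "crop_top", "simple_background"]

# Style 1: a prompt made purely of emphasized tags.
tag_prompt = ", ".join(format_tag(t) for t in tags)

# Style 2: the same tags dropped into a natural-language prompt.
template = ("hyperrealistic {0} portrait with a {1} and {2} "
            "wearing a {3} {4}, on a {5}, photo realistic")
mixed_prompt = template.format(
    *(format_tag(t, use_underscores=False) for t in tags)
)

print(tag_prompt)
print(mixed_prompt)
```

Passing `emphasis=2` for a tag that keeps getting ignored gives it the doubled-bracket form, as with ((waterfall)) in the landscape example.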

The tags also work for landscape images to good effect;

beautiful landscape with a (mountainous_horizon), (light_rays), ((waterfall)), magnificent, luxury, detailed, sharp focus, low angle, high detail, volumetric, illustration, cold lighting, by jordan grimmer and greg rutkowski, trending on artstation, pixiv, Canon EOS 5D1

5. Negative Prompts and "ideal" Image Generation Settings

The NovelAI model that powers their site uses a CFG scale of 10, the Euler a sampler, and 20-30 steps, and these settings carry over to my model to good effect. Lower CFG and step values can also produce impressive results.

NovelAI's default negative prompt is as follows, and you can add to it as required:

lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
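If you drive the web UI from a script, the settings and negative prompt above can be bundled into one reusable dictionary. This is a sketch only: the key names (`prompt`, `negative_prompt`, `steps`, `cfg_scale`, `sampler_name`) are my assumption of what a txt2img-style request looks like, so check them against your own install, and the default of 25 steps is just an arbitrary pick from the 20-30 range.

```python
import json

NAI_NEGATIVE_PROMPT = (
    "lowres, bad anatomy, bad hands, text, error, missing fingers, "
    "extra digit, fewer digits, cropped, worst quality, low quality, "
    "normal quality, jpeg artifacts, signature, watermark, username, blurry"
)


def build_settings(prompt, extra_negative=(), steps=25, cfg_scale=10):
    """Bundle a prompt with the NovelAI-style defaults described above."""
    if not 1 <= steps <= 150:
        raise ValueError("steps looks out of range")
    # Append any extra negative terms after the default negative prompt.
    negative = ", ".join([NAI_NEGATIVE_PROMPT, *extra_negative])
    return {
        "prompt": prompt,
        "negative_prompt": negative,
        "steps": steps,
        "cfg_scale": cfg_scale,
        "sampler_name": "Euler a",
    }


settings = build_settings(
    "(1girl), (hair_ribbon), (side_ponytail)",
    extra_negative=["extra limbs"],  # hypothetical addition
)
print(json.dumps(settings, indent=2))
```

Keeping the default negative prompt in one constant makes it easy to add to it as required without retyping the whole list.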

6. Resources

Danbooru have an AI Tag Search feature, which shows examples of some of the top tags, as images, generated by AI models (not this particular model). Beware, super NSFW.

Danbooru AI Tag Search
