
@crowsonkb
Last active May 21, 2023 00:51
import torch
from tqdm.auto import trange
import k_diffusion as K


@torch.no_grad()
def sample_dpmpp_2m_sde(model, x, sigmas, extra_args=None, callback=None, disable=None, eta=1.0, noise_sampler=None, solver_type='midpoint'):
    """DPM-Solver++(2M) SDE."""
    if solver_type not in {'heun', 'midpoint'}:
        raise ValueError('solver_type must be \'heun\' or \'midpoint\'')

    sigma_min, sigma_max = sigmas[sigmas > 0].min(), sigmas.max()
    # Default to a Brownian tree noise sampler so the noise is consistent across step counts
    noise_sampler = K.sampling.BrownianTreeNoiseSampler(x, sigma_min, sigma_max) if noise_sampler is None else noise_sampler
    extra_args = {} if extra_args is None else extra_args
    s_in = x.new_ones([x.shape[0]])

    old_denoised = None
    h_last = None

    for i in trange(len(sigmas) - 1, disable=disable):
        denoised = model(x, sigmas[i] * s_in, **extra_args)
        if callback is not None:
            callback({'x': x, 'i': i, 'sigma': sigmas[i], 'sigma_hat': sigmas[i], 'denoised': denoised})
        if sigmas[i + 1] == 0:
            # Denoising step
            x = denoised
        else:
            # DPM-Solver++(2M) SDE
            # Work in t = -log(sigma); h is the step size in t
            t, s = -sigmas[i].log(), -sigmas[i + 1].log()
            h = s - t
            eta_h = eta * h

            # First order (exponential integrator) update
            x = sigmas[i + 1] / sigmas[i] * (-eta_h).exp() * x + (-h - eta_h).expm1().neg() * denoised

            if old_denoised is not None:
                # Second order correction using the previous model output
                r = h_last / h
                if solver_type == 'heun':
                    x = x + ((-h - eta_h).expm1().neg() / (-h - eta_h) + 1) * (1 / r) * (denoised - old_denoised)
                elif solver_type == 'midpoint':
                    x = x + 0.5 * (-h - eta_h).expm1().neg() * (1 / r) * (denoised - old_denoised)

            # Add the noise injected by the SDE step
            x = x + noise_sampler(sigmas[i], sigmas[i + 1]) * sigmas[i + 1] * (-2 * eta_h).expm1().neg().sqrt()

        old_denoised = denoised
        h_last = h
    return x
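
A minimal usage sketch, assuming a k-diffusion style denoiser; the stub model, sigma range, and step count below are placeholders for illustration, not part of the gist:

import torch
import k_diffusion as K

# Placeholder denoiser: a real model maps (noised x, sigma) -> denoised x;
# this stub only illustrates the calling convention.
def denoiser(x, sigma, **extra_args):
    return torch.zeros_like(x)

sigmas = K.sampling.get_sigmas_karras(25, 0.03, 14.6)
x = torch.randn([1, 3, 64, 64]) * sigmas[0]
samples = sample_dpmpp_2m_sde(denoiser, x, sigmas, eta=1.0, solver_type='midpoint')
print(samples.shape)  # torch.Size([1, 3, 64, 64])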
@crowsonkb (Author)

DPM++ 2M solvers really don't like the normal VP noise schedule; its second derivative (in log sigma, as a function of step) is large at the end. The friendliest schedule for it is exponential, which is evenly spaced in log sigma (the second derivative is zero everywhere). The Karras schedule is closer to exponential, and in my opinion close enough to work well.
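
One way to see the difference is to print the log-sigma spacing of each schedule; a minimal sketch using k-diffusion's schedule helpers (the sigma range and step count are arbitrary example values):

import k_diffusion as K

n, sigma_min, sigma_max = 20, 0.03, 14.6  # arbitrary example values

sigmas_exp = K.sampling.get_sigmas_exponential(n, sigma_min, sigma_max)  # exactly even in log sigma
sigmas_kar = K.sampling.get_sigmas_karras(n, sigma_min, sigma_max)       # close to even in log sigma
sigmas_vp = K.sampling.get_sigmas_vp(n)                                  # VP schedule with default betas

# Step sizes in log sigma (drop the final appended 0 sigma before taking logs):
# exponential is constant, Karras varies mildly, VP changes sharply near the end.
print(sigmas_exp[:-1].log().diff())
print(sigmas_kar[:-1].log().diff())
print(sigmas_vp[:-1].log().diff())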

@crowsonkb (Author)

I didn't add second_order to 2M SDE since I'm not sure whether it expects twice the number of steps.

2M SDE calls the model once per step, like regular 2M.
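
To confirm the one-call-per-step behavior, one can wrap the model with a call counter; a minimal sketch (the counting stub and example schedule below are illustrative only):

import torch
import k_diffusion as K

calls = 0

def counting_model(x, sigma, **extra_args):
    # Dummy denoiser that only counts how many times it gets evaluated.
    global calls
    calls += 1
    return torch.zeros_like(x)

sigmas = K.sampling.get_sigmas_karras(20, 0.03, 14.6)
x = torch.randn([1, 3, 32, 32]) * sigmas[0]
sample_dpmpp_2m_sde(counting_model, x, sigmas)
print(calls)  # 20: one model evaluation per step, same as regular DPM++ 2M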

@Panchovix

Thanks for the complete explanation! And amazing, thanks for the commit!
