@Birch-san
Created April 9, 2023 10:50
Compute size of buffer required to fit q_proj @ k_proj.T attention scores
float_width=2  # bytes per element (float16)
cond_count=2   # uncond + cond for 1 sample (classifier-free guidance doubles the batch)
attn_heads=8   # SD1.5 isn't optimized for flash attn, so all layers have 8 heads, lol
vae_scale_factor=8
px_height=px_width=768
latent_height=px_height//vae_scale_factor  # integer division: 96 latents tall
latent_width=px_width//vae_scale_factor    # 96 latents wide
q_proj_tokens=k_proj_tokens=latent_height*latent_width  # 9216 tokens at the highest-res self-attn layer
qk_bytes = cond_count*attn_heads*float_width*q_proj_tokens*k_proj_tokens
qk_mb = qk_bytes/1024**2  # 2592 MiB at 768x768
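A minimal sketch wrapping the calculation above into a function (the name `attn_scores_mib` and its parameter defaults are my own, not part of the gist), to show how the scores buffer grows with the fourth power of image side length, i.e. quadratically in pixel count:

```python
def attn_scores_mib(px: int, float_width: int = 2, cond_count: int = 2,
                    attn_heads: int = 8, vae_scale_factor: int = 8) -> float:
    """MiB needed for q_proj @ k_proj.T scores at the highest-res self-attn layer,
    assuming a square image of side `px` pixels."""
    tokens = (px // vae_scale_factor) ** 2  # latent H * latent W
    return cond_count * attn_heads * float_width * tokens * tokens / 1024**2

# Doubling the side length (512 -> 1024) multiplies the buffer by 16:
print(attn_scores_mib(512))   # 512.0 MiB
print(attn_scores_mib(768))   # 2592.0 MiB
print(attn_scores_mib(1024))  # 8192.0 MiB
```

This quartic blow-up in side length is why memory-efficient attention (which never materializes the full scores matrix) matters so much for high-resolution sampling.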