h3r2tic/restir-meets-surfel-lighting-breakdown.md

## restir-meets-surfel-lighting-breakdown.md

      
A quick breakdown of lighting in the restir-meets-surfel branch of my renderer, where I revive some olde surfel experiments, and generously sprinkle ReSTIR on top.

### General remarks

Please note that this is all based on work-in-progress experimental software, and represents a single snapshot in development history. Things will certainly change 😛
Due to how I'm capturing this, there's frame-to-frame variability, e.g. different rays being shot, TAA shimmering slightly. Some of the images come from a dedicated visualization pass, and are anti-aliased, and some show internal buffers which are not anti-aliased.

### Final images

Path tracing, ~1800 paths/pixel:

Path tracing, ~9000 paths/pixel:

Real-time result with per-face normals (matching the path tracer):

Real-time result with smooth normals and normal maps (used in the rest of the breakdown):

### Lighting components

Direct lighting:

Indirect lighting:

Indirect diffuse + specular:

Indirect diffuse:

Indirect specular:

### Diffuse GI breakdown

There are multiple temporal feedback points in the diffuse GI, so there is no single clear entry point. Let's start with the multi-bounce solution.

#### Multi-bounce

This is "surfels", almost exactly like in PICA PICA, so I'm not going to go into too much detail here. They are allocated from the camera's gbuffer, have a position, normal, and irradiance, and look sort of like this:

The surfels (irradiance cache points, really) trace 4 cosine-distributed rays every frame, and linearly accumulate up to 32 samples, after which they start exponentially blending to be temporally reactive.
As a result, they are quite splotchy. I'll need to make them use fancier temporal integration or ReSTIR later. Note that they are not normally sampled by primary rays; this is just debug visualization.
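
To make the accumulation scheme concrete, here is a minimal Python sketch of "linear up to 32 samples, then exponential". It is not the renderer's actual code: the `Surfel` class and `accumulate` function are hypothetical names, irradiance is a scalar for brevity, and I am assuming the 32-sample cap counts individual rays rather than frames.

```python
from dataclasses import dataclass
from typing import Sequence

MAX_SAMPLES = 32    # from the text: linear accumulation up to 32 samples
RAYS_PER_FRAME = 4  # from the text: 4 cosine-distributed rays per frame


@dataclass
class Surfel:
    # Mean incoming radiance over cosine-distributed rays; multiply by pi
    # to get irradiance. Scalar here for brevity, RGB in practice.
    irradiance: float = 0.0
    sample_count: int = 0


def accumulate(surfel: Surfel, ray_radiances: Sequence[float]) -> None:
    """Fold this frame's ray results into the surfel's running estimate."""
    for radiance in ray_radiances:
        # Count the new sample, but never let the count exceed the cap.
        surfel.sample_count = min(surfel.sample_count + 1, MAX_SAMPLES)
        # Below the cap, blend = 1/n gives an exact running mean.
        # At the cap, blend stays at 1/32, so the update becomes an
        # exponential moving average that keeps reacting to changes.
        blend = 1.0 / surfel.sample_count
        surfel.irradiance += (radiance - surfel.irradiance) * blend
```

Because the blend factor never drops below 1/32, each surfel keeps a fair amount of per-frame noise, which is consistent with the splotchy look described above.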

The surfel rays are allowed to sample other surfels at hit points (with race conditions and all), thus making this a (crappy) radiosity solver.

#### Final gather

I shoot hemisphere-distributed rays (no cosine weighting, following the ReSTIR GI paper) from the g-buffer at half-res. Blue noise, of course. They sample surfel lighting for multi-bounce, and are also allowed to sample the last frame's screen-space diffuse lighting if the ray hit is on-screen.
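
As a reference for what "hemisphere-distributed, no cosine weighting" means, here is a small Python sketch of uniform hemisphere sampling in tangent space (+Z is the surface normal). The function names are hypothetical, and `u1`, `u2` stand in for whatever the blue noise sequence provides. Since the pdf carries no cosine, the cosine term has to be applied when the samples are integrated.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def sample_uniform_hemisphere(u1: float, u2: float) -> Tuple[Vec3, float]:
    """Map two numbers in [0, 1) to a direction on the +Z hemisphere.

    Returns (direction, pdf); the pdf is the constant 1 / (2 * pi).
    """
    z = u1  # cos(theta) is uniform in [0, 1) for uniform solid angle
    r = math.sqrt(max(0.0, 1.0 - z * z))
    phi = 2.0 * math.pi * u2
    direction = (r * math.cos(phi), r * math.sin(phi), z)
    return direction, 1.0 / (2.0 * math.pi)


def estimate_irradiance(samples: Sequence[Tuple[float, Vec3, float]]) -> float:
    """Monte Carlo irradiance from (radiance, direction, pdf) triples."""
    total = 0.0
    for radiance, direction, pdf in samples:
        cos_theta = direction[2]  # explicit cosine, since the pdf has none
        total += radiance * cos_theta / pdf
    return total / len(samples)
```

With constant unit radiance the estimate converges to pi, which is a handy sanity check for the pdf.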

This is then thrown at temporal ReSTIR. It turns out I lied on Twitter when I said I don't continuously use permutation sampling; the following image clearly shows that I do. There's one sample from the center pixel, and one from a neighbor in steady state. The incident radiance values chosen by reservoirs (1 spp, but temporally accumulated) look like this:

Note that when the neighbors are sampled, a tiny screen-space raymarch (2 depth taps) is done to minimize leaking.
I clamp the M of sampled reservoirs at 8, but due to the additional neighbor sampling, it can become twice that in the output.
When M is low, this pass is allowed to continue sampling neighbors, up to 5. This speeds up convergence of newly disoccluded stuff.
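
For readers who haven't seen the M clamp before, here is a minimal Python sketch of a weighted reservoir with clamped merging, following the standard streaming reservoir update from the ReSTIR literature. It is an illustration under assumptions, not this renderer's code: the `Reservoir` class, its field names, and the `target_at_*` parameters (the target function evaluated at the receiving pixel) are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Any, Optional

M_CLAMP = 8  # from the text: M of each sampled reservoir is clamped at 8


@dataclass
class Reservoir:
    sample: Optional[Any] = None  # payload of the chosen sample (hypothetical)
    w_sum: float = 0.0            # running sum of resampling weights
    m: int = 0                    # number of candidates represented
    W: float = 0.0                # contribution weight of the chosen sample

    def update(self, sample: Any, weight: float, m: int,
               rng: random.Random) -> None:
        """Stream one candidate in; keep it with probability weight / w_sum."""
        self.w_sum += weight
        self.m += m
        if weight > 0.0 and rng.random() * self.w_sum < weight:
            self.sample = sample

    def merge(self, other: "Reservoir", target_at_other: float,
              rng: random.Random) -> None:
        """Fold in a history or neighbor reservoir, with its M clamped."""
        m = min(other.m, M_CLAMP)
        self.update(other.sample, target_at_other * other.W * m, m, rng)

    def finalize(self, target_at_chosen: float) -> None:
        """Set W so that f(sample) * W is an estimate of the integral."""
        if self.m > 0 and target_at_chosen > 0.0:
            self.W = self.w_sum / (self.m * target_at_chosen)
        else:
            self.W = 0.0
```

In this sketch, merging a fresh candidate (m = 1) with two clamped reservoirs gives an output M of at most 17, which is roughly the "twice the clamp" growth mentioned above. Clamping M bounds how much weight stale history can carry relative to new samples.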
After 1 frame:

After 2 frames:

After those, I run two passes of spatial resampling, tuned differently wrt cutoff thresholds and kernel radius. The first one uses 8 spatial neighbors, and the second one uses 5. Once again, there's a screen-space raymarch to reduce bleeding (3 taps).
I don't have a good visualization of what happens in the spatial passes right now since they only output packed reservoir data.
Finally, the half-res reservoirs are thrown at a full-res pass which integrates all the diffuse lighting:

One potentially weird thing I do here is use both the ReSTIR input and the raw ray-trace results from the current frame. Not everywhere though -- that would be noisy for no good reason. I'm probably still doing things wrong, but I found that it's difficult to tweak the spatial resampling passes in a way which minimizes noise and keeps contact detail... But the raw raytrace input is not really that noisy with very short rays, so I do the near field via raw ray-tracing, and the far field via ReSTIR. Images in a sec.
This is what the single-frame resolved image looks like:

And this is the same thing with some temporal filtering on top:

This one uses TAA-style color bounding box clamping, and changes its parameters based on a moving variance estimate.
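
A common way to build a TAA-style color bounding box is from the mean and standard deviation of the local neighborhood. The Python sketch below shows that variant on scalar values. Everything in it is an assumption for illustration: the function names, the `box_scale` and `blend` parameters, and the choice of a variance-based box. The text does not say which parameters the moving variance estimate drives, so `box_scale` is simply left as an input.

```python
import math
from typing import Sequence


def clamp_history(history: float, neighborhood: Sequence[float],
                  box_scale: float) -> float:
    """Clamp a history value to mean +/- box_scale * stddev of the
    current frame's neighborhood."""
    n = len(neighborhood)
    mean = sum(neighborhood) / n
    variance = max(0.0, sum(x * x for x in neighborhood) / n - mean * mean)
    half_extent = math.sqrt(variance) * box_scale
    return min(max(history, mean - half_extent), mean + half_extent)


def temporal_filter(history: float, current: float,
                    neighborhood: Sequence[float],
                    box_scale: float, blend: float = 0.1) -> float:
    """Blend the current value into the clamped history."""
    clamped = clamp_history(history, neighborhood, box_scale)
    return clamped + (current - clamped) * blend
```

A wider box (larger `box_scale`) keeps more history and so removes more noise, at the cost of more ghosting when the lighting changes.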
Now the same thing, but without the near field split. It may be tricky to see what's going on here with the temporal flicker, but notice the darker corners, and some missing micro-bounce, e.g. on the chairs by the table to the left:

Here's just the near field to make this clearer. Note that the near-far split is based on screen-space distance, as it aims to reduce detail loss from screen-space filters.

An additional small spatial filter cleans up some of the noise:

This is basically it for the diffuse.

### Reflections

I start with ray-tracing from the g-buffer at half-res again. BRDF sampling with slight bias (I cut off 5% of the spec lobe to get GGX's tail under control). Blue noise, VNDF sampling. Those are also allowed to sample the screen, and pretend all surfaces are diffuse. It would be nice to do multi-bounce of spec, but I haven't gotten there yet.

Those are then thrown at temporal-only ReSTIR (haven't implemented spatial here yet), creating reservoirs with incident radiance sort of like this:

Then a full-resolution image is created by using the reservoir samples in a ratio estimator, as in my Frostbite and SEED talks. This one uses 8 spatial samples, with the spatial kernel fit to the spec lobe (following Dmitry Zhdan). Note that the image for this step includes BRDF FG terms, so I had to scale it up arbitrarily so it wouldn't be mostly black.
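
The general shape of a ratio estimator can be sketched in a few lines of Python. This is a generic form under my own assumptions, not the exact formulation used here: each spatial sample contributes a radiance and a weight (for example, the current pixel's BRDF evaluated for that sample's direction, divided by the sample's pdf), and `fg_term` stands for the pre-integrated BRDF FG factor. The function name and parameters are hypothetical.

```python
from typing import Iterable, Tuple


def ratio_estimate(samples: Iterable[Tuple[float, float]],
                   fg_term: float) -> float:
    """Weighted average of radiance, rescaled by the FG term.

    samples: (radiance, weight) pairs gathered from spatial neighbors.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for radiance, weight in samples:
        weighted_sum += radiance * weight
        weight_total += weight
    if weight_total <= 0.0:
        return 0.0  # no usable samples
    # The same weights appear in numerator and denominator, so their
    # noise largely cancels; the FG term restores the overall scale.
    return fg_term * weighted_sum / weight_total
```

Dividing by the weight sum instead of the sample count is what makes this a ratio estimator: it trades a small amount of bias for a large reduction in variance.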

A temporal filter using color bbox clamping cleans it up a bunch:

I also run a small spatial cleanup filter if sample count is low, but it's not visible here.
... and that is it for the reflections.

### Misc

Besides this, there's also a world radiance cache like in Lumen (though waaaaaaaaaay incomplete at this point). It's mostly an optimization for me right now, and currently introduces some bias, so wasn't used in the breakdown. It looks sort of like this:

It does use some temporal filtering, so should provide variance reduction, but I ruin that via stochastic interpolation of the volumetric cache points 😜
Oh, and one more funny thing I do is calculate screen-space-sized (not scaling with camera distance) GTAO like this:

... and then use it as a feature guide in various filters in addition to normal- and depth-based guides. Helps preserve contact details.
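
As an illustration of using AO as an extra guide, here is a Python sketch of an edge-stopping weight that combines normal, depth, and AO similarity for one filter tap. The `GuideSample` type, the function name, the falloff shapes, and all constants are hypothetical choices of mine, not values from the renderer.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class GuideSample:
    normal: Tuple[float, float, float]  # unit-length surface normal
    depth: float                        # linear view-space depth
    ao: float                           # GTAO value in [0, 1]


def guide_weight(center: GuideSample, tap: GuideSample,
                 normal_power: float = 8.0,
                 depth_sigma: float = 0.1,
                 ao_sigma: float = 0.2) -> float:
    """Weight of a filter tap relative to the center pixel, in [0, 1]."""
    n_dot = sum(a * b for a, b in zip(center.normal, tap.normal))
    normal_w = max(0.0, n_dot) ** normal_power
    # Relative depth difference, so the tolerance scales with distance.
    rel_depth = abs(center.depth - tap.depth) / max(center.depth, 1e-6)
    depth_w = math.exp(-rel_depth / depth_sigma)
    # Taps with very different occlusion are down-weighted, which keeps
    # the filter from blurring across contact shadows.
    ao_w = math.exp(-abs(center.ao - tap.ao) / ao_sigma)
    return normal_w * depth_w * ao_w
```

The AO term helps in places where normals and depth are nearly identical across a contact edge, such as where an object rests on a floor.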