Ellipsoid Frustum Intersection
Yesterday I posted a problem to math stack exchange that bothered me for a while now, and right after I've had a few exchanges on Twitter, I got inspired to attempt a solution.
Here it goes. It's 100% untested but I'm fairly certain that it will work.
The problem is about a form of refining raytracing where we render a big list of convex 3D brushes (and I decided to start with Ellipsoids, since they're so useful) to the screen or a shadow map, without any prebuilt accelleration structure. How does it work? Well, if we had a way to figure out for a portion of the frustum whether it contained a brush, we could
- Start with a very low resolution
- Collect a list of ellipsoids contained in each tile, culling the ones not contained or fully occluded
- Then subdivide each tile into smaller tiles and repeat
- Until we hit per-pixel resolution, then we are done.
And this brings us to the problem: we want to solve for a tile on screen
- Whether an ellipsoid arbitrarily parametrized in world space is overlapping this tile
- What the near and far depth range of the overlapping portion is (important for culling)
- Whether the ellipsoid covers the tile fully (occludes its background completely, also important for culling)
- What the near and far depth range of the portion is that fully covers the tile (so we can also cull overlapping ellipsoids)
You see that the problem is analog to raytracing, but we are frustum tracing: instead of a ray, we extrude and expand a rectangle which forms a 4-sided open-faced pyramid in space, or simpler yet, four planes. The depth values that we are interested in are represented by planes parallel to the screen plane. These planes intersect in each corner of the rectangle to form edge lines. So the participants in this problem are:
- An ellipsoid
- Four wall planes of our frustum
- Four edge lines of our frustum
- A screen plane - whose positions we want to find (outer near/far, inner near/far)
Now the screen plane and the wall plane and edge lines are not necessarily arranged symmetrically and orthogonally, but can be skewed due to the position of the screen tile we are interested in. In the beginning, this sounds like an added complication, but it will be helpful that we don't have an invariant here that we might feel must be protected (and instead get another one that we will profit from)
Let's drop from 3D space to 2D space and attempt to solve the problem there. Now our ellipsoid is just an ellipse, and our frustum is simply a wedge formed by two lines originating from an eye point (analog to our four planes or four edge lines - here they collapse to the same thing) and a screen line (analog to our screen plane) arranged semi-orthogonally to the eye point.
The first thing we can do to considerably simplify the problem is to non-uniformly scale and rotate our problem space to turn the ellipsoid into a unit circle (analog to the 3D raytracing solution which does the same thing). This will skew our wedge further, but since it was already skewed to begin with, nothing really changed for us. This was the invariant I talked about earlier. Now we're solving the problem for a unit circle, and our wedge still is a wedge, but in ellipse-space. We'll need to save our skew matrix for later so we can transform our solution lines back to world space.
We can simplify our problem further by making use of the fact that the circle is invariant to rotation, and rotate the space so that our screen line is axis-aligned (in 2D, I pick a vertical alignment). This saves us a line/line intersection test and allows us to drop the orientation component of our line vector, because we only need its size scalar.
We end up with four possible arrangements of circle and wedge:
A. A circle that crosses both boundary lines - contained, occluding.
B. A circle that crosses only one boundary line - contained, not occluding.
C. A circle that crosses no boundary line and is left of one line and right of the other - contained, not occluding.
D. A circle that crosses no boundary line and is either left or right of both boundary lines - not contained, not occluding.
To resolve these predicates we need several analytical equations, which are readily available:
- A line / circe intersection test that gives us an entry and exit point (usually parametrized as center and width) if there is an intersection, and for any case tells us if the circle's center is left or right of the line.
- Given a line and a circle, compute two lines parallel to the original line that bound the circle and the contact points where lines and circle touch.
With these equations we can deduce our near and far lines (which are parallel to the screen line).
- For case C, our near and far line are simply the ones that bound the circle. Done.
- For case B, the near or far line only bounds the circle if the contact point is inside the wedge. Otherwise it intersects with near or far intersection points of circle and wall line.
- Case A is analog to case B, except we now have two intersection points to consider per line if the contact point is outside the wedge. In this case we pick the nearest or farthest intersection point among each pair.
- For case A, we also need to compute the fully covered range, formed by two screen lines overlapping with the inner intersection points. Both intersections of each screen line with the wall lines must be inside the circle, or we don't have a fully covered range.
After this, we transform the lines back to world space, and that is the 2D case handled.
Let's add one more dimension and observe. Analog to the 2D case, we can rotate and scale the ellipsoid into a unit sphere. Our wall planes, edge lines and screen plane will be transformed to ellipsoid-space, and we will simply transform our solution back later on.
We also make use of the fact that the sphere is invariant to rotation as well, and rotate the space so that our screen plane is axis-aligned (aligning it to Z seems the obvious choice). This saves us a line/plane intersection test and allows us to drop the normal component of our plane, because we only need its
These are our possible cases in 3-space:
A. A sphere that intersects all wall edges - contained, occluding.
B. A sphere that intersects one or more wall planes and is inside the frustum - contained, not occluding.
C. A sphere that intersects no wall planes and is inside the frustum - contained, not occluding.
D. A sphere that intersects no wall planes and is outside the frustum - not contained, not occluding.
These are the analytical equations we need to solve all predicates, and again, this is all standard stuff:
- A sphere / plane intersection test that gives us the parameters of the resulting circle embedded in 3-space, and for any case tells us in which half-space the sphere's center is in.
- A ray / sphere intersection test that gives us entry and exit point if there is an intersection.
- Given a plane and a sphere, compute two planes parallel to the original plane that bound the sphere, and the two contact points where planes and sphere touch.
- Given a plane and a circle, compute two planes parallel to the original plane that bound the circle and the contact point where plane and circle touch. Our screen planes will never be parallel to the circle's plane, so this case does not have to be tested.
With these equations we can deduce our near and far planes (which are parallel to the screen plane).
- For case C, our near and far planes are simply the ones that bound the sphere. Done.
- For case A and B, the near or far plane only bounds the sphere if the contact point is inside the frustum. Otherwise it bounds the circle if the contact point is inside the wedge. Otherwise the plane intersects with near or far intersection points of sphere and edge ray. We need to collect all valid intersections inside the frustum and pick the nearest / farthest to get our final range.
- For case A, we also need to compute the fully covered range, formed by two screen planes overlapping with the inner edge intersection points. All four intersections of each screen plane with the edge rays must be inside the sphere, or we don't have a fully covered range.
After this, we transform the planes back to world space, and are all done.
Limitations and Caveats
The solution won't work for ellipsoids with a zero (disc) or infinite component (cylinder of infinite length) because we can't transform those back to unit spheres (discs will cause frustum walls to become parallel, cylinders will cause them to collapse to a single plane). I guess one could design more robust tests for those cases, or attempt a solution for quadrics in general.
I also didn't cover any tests for ellipsoids behind the screen plane at eye coordinate or for the case when the eye coordinate is inside the ellipsoid. Those need to be handled as well.
Once the algorithm is written down, there are a lot of opportunities to zero or fold duplicate/unused element computations, so I'm fairly certain that it can be crunched down to a small solution that is completely incomprehensible without this document. What I like about this solution is that we get a perfect analytical result without any iterations, and that should translate to potentially hundred thousands if not millions of ellipsoid brushes per frame.
This technique is also valid for orthogonal projection without any changes.
Nothing speaks against following this approach with convex non-uniformly scaled primitives other than ellipsoids, as long as ray and plane intersection/bounding tests are available for the primitive or a slice of it. Spheres are particularly convenient because they are invariant to rotation and their slices are always circles. Cubes produce triangular, quadliteral and hexagonal slices. Tetrahedra produce triangular and quadliteral slices. Cylinders and cones produce circles and capped hyperbolic slices.
It nevertheless helps to know that slices of a convex primitive are also always convex.
Could this be done with signed distance estimators? Since our primitive is convex, we can use faster newton-raphson to converge a ray to the surface, in my experience that's under 7 iterations on average. But one has to come up with new plane bounding estimators that also work with 2D slices of a 3D field, which is just a big question mark to me. On the upside, this technique would also work with non-uniformly scaled distance fields.