paniq/paralleldelaunay.rst

## paralleldelaunay.rst

      
    Raw
  

              paralleldelaunay.rst
            
          
    Parallelizable Gather-based Delaunay Transforms

by Leonard Ritter, Duangle GbR
Today I figured out how to do GPU-friendly Delaunay transforms, and I'd like to
describe how the algorithm works before I forget everything.
What the algorithm does is to flip edges in a half-edge triangle mesh where a flip
would improve the quality of the adjacent triangles (an equilateral triangle
has perfect quality). The triangle mesh I'm using is embedded in R2, that is,
it's on a 2D plane, has no holes, is not a manifold, and is typically a heavily
subdivided square; but the algorithm should very well work for 2-manifolds in 3D too.
The mesh is not guaranteed to be Delaunay conforming after a single transform,
and there is no halting condition. The algorithm is designed to be run in
realtime, continuously refining a mesh with moving vertices each frame. The
algorithm supports constrained and unconstrained Delaunay transforms.

Overview

There are three passes to this algorithm, inspired by the approach of
Navarro et al in [1]. Navarro et al's approach is scatter-based and requires atomics
to work, but was a good starting point for figuring out how a lo-fi version
might work. Each pass can be massively parallelized (as the iteration
is order-independent), requires no GPGPU implementation or compute shaders and
therefore is optimal for OpenGL 3.2 compatible hardware.
The first pass iterates all edges of the mesh and evaluates flip candidates based
on whether a flip would improve the quality. The second pass iterates all
triangles and does the flip, using a simple precedence heuristic to avoid
collisions. The third pass fixes broken adjacency information.

Data Structure

The data structure I use for the triangle mesh is a reduced version of the
structure described by Fabian Giesen in his Half-edges Redux article [2].
Each triangle has only three edges sorted in counter-clockwise order so that
edgeId(triangleId, corner) = triangleId * 3 + corner. That makes it possible
to store the vertex indices for each edge (which map to the vertex opposite
each edge inside the same triangle, not each edge's starting vertex) as a
linear element array, the kind of format that OpenGL prefers for indexed
triangles. There are no wedges in this structure, and the algorithm needs to be
modified if there are.
I also store an array of opposites for each vertex (which I will call twins for
brevity), so that vertex(twin(edgeId)) gives the vertex opposite each edge
in the adjacent triangle. That means each potential quad that an edge is the
diagonal of is described by the vertices vertex(edgeId),
vertex(next(edgeId)), vertex(twin(edgeId)) and vertex(prev(edgeId))
in counter-clockwise order. The algorithm needs the twin array to work,
as well as a temporary copy of the twin array.
Here is a breakdown of the data structure:
V vertices
T triangles
E=3*T half-edges

// position[vertexId] = position of each vertex in R2
// Read in pass 1.
vec2 position[V];

// vertex[edgeId] = vertexId of inner vertex opposite a half-edge
// Read in passes 1 & 2, written in pass 2.
uint vertex[E];

// twin[edgeId] = edgeId of the twin half-edge connecting the same vertices
// Read in passes 1, 2 & 3, written in pass 3.
uint twin[E];

// temporary flag array, three bits per triangle (one bit per half-edge).
// Written in pass 1, read in pass 2.
uint8 candidates[T];

// temporary twin array to avoid a race condition.
// Written in pass 2, read in pass 3.
uint tmptwin[E];


Algorithm

The algorithm needs three passes to perform, which are as follows:

For each triangle, for each of its half-edges, collect the four adjacent
vertices that form a quad with this half-edge as its diagonal and estimate the
quality of triangles with both possible diagonals. I use a quality estimator
described by Müller et al. in [3]. If a flip would improve the quality,
store a bit for this edge in candidates. By default, candidates[triangleId]
must be initialized to zero.
It is important that the quality computations use the exact vertex order from
both sides of each edge to compute the exact same floating point values and
arrive at the exact same predicate, or floating point imprecisions will cause
asymmetrical results, and subsequent steps will destroy the mesh. I got good
results with a fixed point implementation of the quality estimator that is
insensitive to vertex order.
If the transform is supposed to be constrained, then constrained edges should
simply never be tagged.

For each triangle, find the first half-edge that is tagged by candidates
as a flip candidate. If no half-edge is tagged, then skip. If more
than one half-edge is tagged, the tagged half-edge with the lowest lexicographic
index wins.
The lexicographic index is an uint64 built from the two half-edge ID's
that make up a single edge.
index(edgeId) = min(edgeId,twin[edgeId]) | (max(edgeId,twin[edgeId]) << 32).
The half-edge candidate can still collide with tagged half-edges in the
neighboring triangle. If its lexicographic index is not the lowest among its
tagged neighbors, then skip. Otherwise, do this triangle's part of the flip:
The flip is executed by mapping the half-edge's end vertex to the
outer opposing vertex so that vertex[prev(edgeId)] = vertex[twin[edgeId]].
Because this moves two half-edges, the local twin information must also be
corrected, so that tmptwin[next(edgeId)] = next(twin[edgeId]) and
tmptwin[edgeId] = twin[next(twin[edgeId])].
Because both triangles participating in this flip arrive at the same conclusions,
do the same operation independently, and the flip operation itself is symmetric,
the connectivity will be restored without collision or ambiguity.
If no operation is performed, tmptwin must still be initialized so that
tmptwin[edgeId] = twin[edgeId].

There's one last problem to correct, which is the invalid adjacency information
stored in half-edges bordering on triangles that have been flipped.
For each half-edge, determine if its twin has changed by comparing
tmptwin[edgeId] != twin[edgeId]. If the twin has changed, then we simply
update twin[edgeId] = tmptwin[edgeId].
We also check if this half-edge's information is still up to date by
verifying the bi-directionality of the link using the predicate
edgeId == tmptwin[tmptwin[edgeId]]. If the predicate fails, then we have
to search for the new, correct adjacent half-edge, whose location is
fortunately invariant.
We now need to discern whether edgeId does belong to a half-edge
bordering on a flipped triangle using the predicate
tmptwin[tmptwin[tmptwin[edgeId]]] == tmptwin[edgeId].
If the comparison holds true, we update the half-edge information to the
invariant location of its new half-edge so that
twin[edgeId] = prev(tmptwin[tmptwin[edgeId]]), and we're done.


References


[1] Navarro et al.
A parallel GPU-based algorithm for Delaunay edge-flips


[2] Fabian Giesen
Half-edges redux


[3] Müller et al.
Air Meshes for Robust Collision Handling