Skip to content

Instantly share code, notes, and snippets.

@ochafik
Last active January 31, 2024 00:27
Show Gist options
  • Save ochafik/6e95596a6e6188b2062ee056b55ce47d to your computer and use it in GitHub Desktop.
Save ochafik/6e95596a6e6188b2062ee056b55ce47d to your computer and use it in GitHub Desktop.
OpenSCAD 3D rendering just got an order of magnitude faster. Here's how and what may come next.

Original link: https://ochafik.com/jekyll/update/2022/02/09/openscad-fast-csg-contibution.html#cgal-nef-polyhedra--gmp--so-precise-it-hurts-performance

TL;DR: OpenSCAD users: download a nightly build & enable fast-csg in settings for 10x faster render (YMMV). Make your models more ambitious and report issues / successes here!

Note: opinions expressed here are my own.

  • TOC {:toc}

OpenSCAD = 3D CAD for developers

OpenSCAD is a popular open-source design tool for 3D printing afficionados (and others). It's essentially a CAD software for programmers with a minimalist UI. A simple declarative programming language defines Constructive Solid Geometry (CSG) operations like unions, intersections, differences, which can be parameterized with loops and variables.

image

Last year I grew fond of it to design ever more complex models, but started running into limitations: while interactive rendering was fine, the final rendering (creating the STL files to give to the 3D printing slicer) was horrendously slow. Slow enough to prevent me from doing what I wanted, which was to generate large, varying scalemail patterns:

a very slow model to render

But something was off: my slicer software (Cura) was able to handle 10x10 grids of my scalemail pattern without breaking a sweat, so why was OpenSCAD so slow? So after investigating alternatives (wasn't keen to get locked in a certain popular but proprietary CAD software with huge a learning curve and high prices after its 1 year hobbyist licenses), I dove into OpenSCAD's codebase and tried to understand what it was up to.

CGAL Nef Polyhedra + GMP = so precise it hurts (performance)

As it turns out, OpenSCAD is using the Computational Geometry Algorithms Library (CGAL) for its CSG operations. That amazing library allows for completely generic numeric types, and the type OpenSCAD is using is some voodoo magic "exact" numeric type based on the GNU Multiple Precision Arithmetic Library (GMP).

More specifically, OpenSCAD is using 3D boolean operations on Nef Polyhedra, which provides extrely robust and accurate results, two welcome properties in the 3D printing hobby, but also is extremely complex and slow to run (especially with those exact underlying number abstractions).

Just concatenating meshes, huh?

But surely I thought, if we just need to assemble a grid of my interlocking patterns without any intersection, it's just a matter of concatenating the meshes, no special operations needed.

Even for unions, many geometrical operations are required when overlaps / intersections require creating new points, new edges, so as to maintain the topological soundness (manifoldness) of solids so they can be 3D-printed later, etc.

So I started exploring the "fast-union" route (openscad#3636), in which I'd literally concatenated meshes if I could determine that they can't actually intersect with each other based on their bounding boxes. This worked great for some models, but ran into weird performance for others. Searching for good operands to union together is important for many models, e.g. in a grid of overlapping tiles you can skip a tile to do a fast-union but two consecutive tiles would need a proper union. That search neededs to be bounded (otherwise would be in quadratic time of # of operands; randomizing the operands would kind of work well enough), the bounding box operations need to be lighter than the geometry itself (e.g. not true when unioning a swarm of cubes), many other things needed to be taken into account.

I've paused this track for now but might explore it again.

CGAL's other hidden gem: corefinement functions

As some of the extremely friendly and helpful OpenSCAD project maintainers ❀️ pointed out on IRC, there was another option in stock: faster operations using the Polygon Mesh Processing CGAL package (authored by sloriot@) and its "corefinement" functions. For that I had to introduce Surface_mesh (after trying w/ less efficient Polyhedron_3) into OpenSCAD.

Corefinement works great in some cases, crashes (sometimes in unrecoverable ways) or politely fails in others (for instance it dislikes non-manifold inputs, or inputs that share edges / vertices). So I ended up creating a hybrid of Surface_mesh corefinement and Nef operations, reverting to the slow safety of Nefs when we detect it's likely unsafe to use corefinement (openscad#4087). My goals was to provide the maximum possible backwards compatibility with all the existing model files out there.

Great results already

The result: 10x faster or more for my models πŸŽ‰πŸŽ‰πŸŽ‰

And up to 100x if you remove the aforementioned safety checks with the fast-csg-trust-corefinement feature ([openscad#4101](openscad/openscad#4101, which may earn you some crashes πŸ€·β€β™‚οΈ).

You should test it yourself (grab a nightly build; also hopefully we get some official benchmark suite soon, see discussion in openscad#3931).

What about multithreading? Or skipping operations altogether? πŸ™ƒ

So I was getting there, performance-wise, but my models were still too slow to really scale up in complexity.

It was quite obvious from the start that OpenSCAD's rendering is mono-threaded, so I tried multithreading it (as others had done here and there, which I didn't know yet) but found it hard to get good speedups because of the tree structure of lots of models. Dependencies limit the amount of parallel operations that can be performed.

I then discovered that new experimental lazy-union feature (openscad#350, in nightly builds since a year) which essentially leaves the final top-level union up to the slicer to do. This can be a big deal if your model is just a top-level concatenation of very complex models, as it literally takes no time at all to (not) do that top-level union.

BUT, as soon as modules are involved, or for loops, or transforms, etc, lazy-union's benefits are out of reach and OpenSCAD does perform actual unions, which may be 100x faster with fast-csg, but still not fast enough to scale my scalemail to 100x100 grids :-D

Rewriting trees to increase laziness and parallelizability

So the other promising approach is rewriting the CSG tree to something that provides more lazy-unioniable top-level operands, and also increases the arity of each individual CGS operation, especially those that can be parallelized (e.g. union and intersection).

union() {
  a();
  translate([0, 0, 1])
    union() {
      color("red") square(1);
      color("green") sphere(1);
      translate([1, 0, 0]) sphere(1);
    }
}

The result of flattening this tree is to have a top-level union that can be skipped w/ lazy-unions:

a();
color("red") translate([0, 0, 1]) square(1);
color("green") translate([0, 0, 1]) sphere(1);
translate([1, 0, 1]) sphere(1);

So there you go: rewrite-tree (incubated in this branch) completely drops the amount of time it takes to render this from 1min to 400ms (rewrite-tree + lazy-union). It would have taken 2sec with fast-csg. You can even go to N=10 (100 spheres) in 4 seconds, which wouldn't complete in reasonable time without the new options (you'd then typically reduce $fn=50 or use other tricks).

$fn=50;
for (i=[0:N-1]) translate([i,0,0])
  for (j=[0:N-1]) translate([0,j,0])
    sphere(d=1.1);

What's next?

Please download a nightly build and (stress) test these features (enable in the settings, or in CLI with --enable=fast-csg), file bugs (or simply report any success on the umbrella PR). That's how open-source software gets better πŸ™.

Roadmap to making OpenSCAD lightning fast:

  • Introduce corefinement CSG operations instead of Nef polyhedra (fast-csg feature in Nightly builds, openscad#4087; initial discussions in openscad#3641)
  • Trust corefinement to handle more cases (fast-cst-trust-corefinement feature, openscad#4101). Likely to get more crashes here.
  • Investigate non-lazy CGAL kernels. Currently need fast-csg-exact feature to avoid some extreme cases.
  • Experiment with CGAL's upcoming simplification routines (cgal#5461). Corefinement currently produces more triangles than Nefs, which can make some models larger and even slower.
  • Enable fast-csg (and maybe fast-csg-trust-corefinement) by default in the next OpenSCAD release unless blockers are found.
  • Introduce rewrite-tree feature to vastly expand cases covered by lazy-union.
  • Ensure unions and intersections are parallelized to some level, maybe using one of the existing experiments.

Thanks!

Huge thanks to the dedicated team of OpenSCAD maintainers for their endless patience & support (special shoutouts to t-paul, thehans, redlizard, rcolyer and MichaelAtOz) and CGAL corefinement guru sloriot for providing the magic (and prompt support) to make the magic happen.

Oh and thank you for reading!

Please add any comments to this gist (there are also comment threads on reddit, hacker news and on the openscad mailing-list).

Follow ochafik@ on Twitter to follow my next experiments!

@johndroper
Copy link

You are my hero! I love OpenScad but the perf is sometimes hard to deal with. Thank you so much for all your hard work!

@sloriot
Copy link

sloriot commented Feb 10, 2022

Congrats for the update ! I'm glad corefinement is eventually used in OpenSCAD. About "crashes (sometimes in unrecoverable ways)", that should not happen. If you know of such cases, please open an issue and I'll have a look. Thanks

@ochafik
Copy link
Author

ochafik commented Feb 11, 2022

@sloriot thank you for your brilliant work!

I haven't run into the same crashes as last year tbh, starting to wonder how many pre-checks are still needed before calling corefinement functions (they seem fine with solids that have lots of vertices in common now?). I'll widen my tests and get back to you in asap (and will edit the post if there are no reproductible crashes πŸ˜…).

@sloriot
Copy link

sloriot commented Feb 14, 2022

To make it simple, precondition are that input are closed, triangulated and free from self-intersections. If the output is not manifold, false is returned.

@GekoPrime
Copy link

GekoPrime commented Sep 19, 2022

This is amazing, I saw render times drop from 2:45.xxx to 0:01.357 on a simple part that applied CSG to a couple screw thread polyhedra. Thank you!

@jberezin
Copy link

jberezin commented Oct 5, 2023

Thanks for this. It is way faster now.

@libelle
Copy link

libelle commented Oct 21, 2023

I'm able to do some very complex, silly renders now that would have been impossible before. Just did a very complex render with 15,062,304 facets in 0:30:08.946 on a Mac Book / Apple M1 Max.

@w3ace
Copy link

w3ace commented Nov 1, 2023

I have a lot to say about how amazing this is. Thank you so much for your contribution. I will post more about my project and how what was painstaking to build 'blocks' has become quick and manageable with fast-csg.

@nrnrnr
Copy link

nrnrnr commented Jan 31, 2024

Fantastic. My models are now rendering over 30x faster. From 150seconds down to 4 seconds. What a lifesaver!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment