@floooh
Last active February 9, 2022 12:26
How I update vertex/index buffers for Dear Imgui

My Dear Imgui render loop looks a bit unusual because I want to reduce calls to WebGL as much as possible, especially buffer update calls.

This means:

  • only one buffer each for all per-frame vertex- and index-data
  • only one update call each per frame for vertex- and index-data (with my own double-buffering, since buffer orphaning doesn't work on WebGL; this also makes me independent of any 'under-the-hood' magic a GL driver might or might not perform)
  • buffers and vertex attributes are 'bound' once before the draw loop
  • the draw loop only changes the texture (if necessary), sets the scissor rect and performs draw calls

ImGui gives me 'local' vertex- and index-data for each ImDrawList though, so I need to merge the vertex-data first, then merge and 'rebase' the index data (so that all indices are relative to the per-frame merged vertex data). This is necessary because WebGL/GLES2 cannot render from a base-vertex-index, and I don't want to redefine the vertex-attributes during the draw loop.

WebGL also doesn't have glMapBuffer...

So updating the vertex- and index-data involves 2 copies:

  1. copy all 'local' vertex and index data chunks from the ImDrawLists into two contiguous memory chunks, and while at it, 'rebase' the indices to the global vertex data chunk
  2. update the vertex- and index-buffer with 2 calls to glBufferSubData()
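
Step 1 can be sketched roughly as follows. This is a minimal illustration, not the code from the linked sample: `Vertex`, `DrawList`, `MergedFrame` and `merge_draw_lists()` are hypothetical stand-ins so that the snippet compiles without ImGui, and the 65536 check reflects the 16-bit index limit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for ImGui's per-draw-list data (not the real ImGui types).
struct Vertex { float x, y, u, v; std::uint32_t color; };
using Index = std::uint16_t;

struct DrawList {
    std::vector<Vertex> vertices;  // 'local' vertex data of this draw list
    std::vector<Index>  indices;   // indices relative to this list's own vertices
};

// The two contiguous per-frame chunks that are later uploaded with one call each.
struct MergedFrame {
    std::vector<Vertex> vertices;
    std::vector<Index>  indices;
};

// Merges all draw lists into 'out' and rebases the indices to the merged vertex data.
// Returns false (leaving 'out' partially filled) if the merged vertex count would
// exceed what 16-bit indices can address.
bool merge_draw_lists(const std::vector<DrawList>& lists, MergedFrame& out) {
    out.vertices.clear();
    out.indices.clear();
    for (const DrawList& dl : lists) {
        // Global position of this draw list's first vertex.
        const std::size_t base = out.vertices.size();
        if (base + dl.vertices.size() > 65536) {
            return false;
        }
        out.vertices.insert(out.vertices.end(), dl.vertices.begin(), dl.vertices.end());
        for (Index i : dl.indices) {
            assert(i < dl.vertices.size());
            // 'Rebase': local index -> index into the merged vertex chunk.
            out.indices.push_back(static_cast<Index>(base + i));
        }
    }
    return true;
}
```

After this, `out.vertices` and `out.indices` are the two chunks that go into the two buffer update calls of step 2.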

This also means only 64k vertices can be rendered per frame when using 16-bit indices, but I haven't hit that limit yet.

Here's what the code with sokol-gfx looks like (this sample doesn't have support for custom textures or custom fonts):

https://github.com/floooh/sokol-samples/blob/a762d0898a10a16fb61558b2224bc359263b26c7/glfw/imgui-glfw.cc#L238

  • sg_update_buffer() is just glBufferSubData() but does an internal buffer rotation / double buffering
  • sg_apply_draw_state() binds the buffers and calls glVertexAttribPointer (and also updates the other GL render states)
  • sg_draw() does the glDrawElements() (...or glDrawArrays, or the instanced variants)

If I could have a few feature wishes granted, they would be these (all features would have to be enabled during initialization):

  • (A) optional feature to have ImGui write all vertex- and index-data into 2 contiguous memory chunks instead of having local vertex- and index-data per ImDrawList (an ImDrawList would only have the offsets and sizes of its vertex- and index-range, in case this is needed for rendering)
  • (B) optionally enable a 'global indices' mode, where indices are relative not to the current ImDrawList but to the global vertex data
  • (C) don't hardwire the ImDrawIdx type to 16-bit, but allow configuring ImGui during init to write 32-bit indices (not critical yet for me though, since I haven't hit the 16-bit limit)

I don't know if it is trivial to write per-frame vertex- and index-data instead of per-ImDrawList though :)

But those requests are definitely not high-prio: even with all the CPU work happening before the draw loop, ImGui rendering performance is never a problem in asm.js/wasm, no matter what I throw at it.

Cheers! -Floh.

ocornut commented May 7, 2018

A) and B) would go together. The Begin() submission order is obviously decorrelated from the visible z-order, but we could submit vertices in a single buffer, and copy/re-order only the indices at the end of the frame (with the indices being one tenth the size of the vertices, it'd be a win over your approach; one fifth with 32-bit indices).
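
The one-tenth and one-fifth figures are per-element size ratios. A small sketch to check them, assuming a vertex made of a 2D position, a 2D UV and a packed 32-bit color, which matches ImGui's default vertex layout as far as I know (`Vert` and the constant names below are hypothetical, not ImGui identifiers):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the assumed vertex layout: position, UV, packed RGBA color.
struct Vert {
    float pos[2];
    float uv[2];
    std::uint32_t col;
};

// 2 * 4 + 2 * 4 + 4 = 20 bytes per vertex on platforms with 4-byte floats.
static_assert(sizeof(Vert) == 20, "unexpected vertex size");

// How many indices fit into the space of one vertex.
constexpr std::size_t vertex_to_index16_ratio = sizeof(Vert) / sizeof(std::uint16_t);
constexpr std::size_t vertex_to_index32_ratio = sizeof(Vert) / sizeof(std::uint32_t);

static_assert(vertex_to_index16_ratio == 10, "16-bit index is one tenth of a vertex");
static_assert(vertex_to_index32_ratio == 5, "32-bit index is one fifth of a vertex");
```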

C) ImDrawIdx type: not sure how to do that at runtime without a performance hit. The easiest approach would probably be to implement most of the inner ImDrawList code/loop twice (perhaps with a template, or just hard-coding for both types may be more reasonable). It's not really difficult, but overhead/maintenance-wise I'm not really eager to do it without a good reason. Could you just switch to 32-bit in your case?

floooh commented May 7, 2018

Yes, (C) isn't so important, and as long as I need to rebase the indices anyway (so not using (A) and (B)) I could widen the indices to 32 bits myself in that step (assuming that a single ImDrawList never exceeds 64k vertices).
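
A minimal sketch of that idea: the rebase loop reads 16-bit local indices and writes 32-bit global ones. The function name and parameters are hypothetical, and each draw list is assumed to stay within the 16-bit range on its own.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends one draw list's 16-bit local indices to a 32-bit global index array.
// 'base_vertex' is the number of vertices already written to the merged vertex
// data before this draw list (hypothetical parameter, tracked by the caller).
void append_rebased_u32(const std::vector<std::uint16_t>& local_indices,
                        std::uint32_t base_vertex,
                        std::vector<std::uint32_t>& global_indices) {
    global_indices.reserve(global_indices.size() + local_indices.size());
    for (std::uint16_t i : local_indices) {
        // Widen to 32 bits and rebase in one step.
        global_indices.push_back(base_vertex + i);
    }
}
```

With this, the merged vertex data can grow past 64k vertices per frame. Note that on WebGL 1 / GLES2, drawing with 32-bit indices requires the OES_element_index_uint extension.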
