Goals:
- support vsynced double and triple buffering
- zero buffer copies
User space execution flow:
- block on free_queue.size() > 0
- lock free_queue, remove the front element, unlock
- acquire buffer handle
- map buffer into virtual memory region
- draw into buffer
- unmap buffer
- lock ready_queue, enqueue buffer, unlock
- go back to the first step (block on free_queue)
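The user space steps above can be sketched as a small simulation. This is only a model under stated assumptions, not the real interface: `Buffer`, its `map`/`unmap` methods, and `produce_frame` are hypothetical names, and Python's `queue.Queue` stands in for the two queues because its `get()` blocks until an element is available and does the locking internally.

```python
import queue
from dataclasses import dataclass, field


@dataclass
class Buffer:
    """Hypothetical stand-in for a kernel buffer handle."""
    index: int
    pixels: bytearray = field(default_factory=lambda: bytearray(4))
    mapped: bool = False

    def map(self) -> bytearray:
        # Stand-in for mapping the buffer into a virtual memory region.
        self.mapped = True
        return self.pixels

    def unmap(self) -> None:
        # In the design above, the cache flush would happen at this point.
        self.mapped = False


def produce_frame(free_queue: queue.Queue, ready_queue: queue.Queue,
                  frame_no: int) -> Buffer:
    """Run one iteration of the user space loop and return the buffer used."""
    # Blocks until a free buffer exists, then removes the front element.
    buf = free_queue.get()
    mem = buf.map()
    # "Draw": fill the buffer with a value derived from the frame number.
    mem[:] = bytes([frame_no % 256]) * len(mem)
    buf.unmap()
    # Hand the finished frame over for scanout.
    ready_queue.put(buf)
    return buf


# Triple buffering in this model: three buffers circulate between the queues.
free_queue: queue.Queue = queue.Queue()
ready_queue: queue.Queue = queue.Queue()
for i in range(3):
    free_queue.put(Buffer(i))
produce_frame(free_queue, ready_queue, frame_no=1)
```

Only handles move between the queues in this sketch; the pixel data is never copied, which matches the zero-copy goal.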
Kernel execution flow:
- scan out the current foreground buffer
- at vsync:
  - if ready_queue.size() > 0, lock the queue, dequeue the first frame, and make it the new foreground buffer; then lock free_queue and add the previous foreground buffer to it
  - else keep scanning out the current foreground buffer
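The vsync decision above can be modelled as follows. This is a sketch under assumptions: the `Scanout` class and `on_vsync` method are hypothetical names, buffers are represented by arbitrary Python objects, and the emptiness check is done while holding the lock, which is slightly stricter than the flow as written.

```python
import threading
from collections import deque


class Scanout:
    """Hypothetical model of the kernel side; names are illustrative only."""

    def __init__(self, foreground, free_queue: deque, ready_queue: deque):
        self.foreground = foreground
        self.free_queue = free_queue
        self.ready_queue = ready_queue
        self.free_lock = threading.Lock()
        self.ready_lock = threading.Lock()

    def on_vsync(self):
        """Pick the buffer to scan out for the next refresh and return it."""
        with self.ready_lock:
            # Take the oldest ready frame, if there is one.
            next_buf = self.ready_queue.popleft() if self.ready_queue else None
        if next_buf is None:
            # Nothing new was drawn: keep showing the current buffer.
            return self.foreground
        previous = self.foreground
        self.foreground = next_buf
        with self.free_lock:
            # The buffer that was on screen can now be drawn into again.
            self.free_queue.append(previous)
        return self.foreground
```

A buffer is returned to free_queue only after it has stopped being the foreground buffer, so user space never draws into the frame that is being scanned out.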
Notes on buffer mapping:
- the buffer can be mapped into user space virtual memory with any cache setting
- the cache flush happens on unmap, so the drawn data is in memory before the kernel DMAs from the buffer
- the mapping could use huge pages, so it needs fewer TLB entries
- the buffer doesn't need to be mapped while it's owned by the kernel, since we only DMA from it
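To give a rough sense of the TLB saving, the arithmetic below counts the pages needed to map one buffer. The figures are assumptions chosen for illustration and do not come from the design: a 1920x1080 buffer at 4 bytes per pixel, 4 KiB base pages, and 2 MiB huge pages. `pages_needed` is a hypothetical helper.

```python
def pages_needed(size_bytes: int, page_size: int) -> int:
    """Number of pages of page_size needed to cover size_bytes (rounded up)."""
    return -(-size_bytes // page_size)


# Assumed example figures, not taken from the design notes.
buffer_size = 1920 * 1080 * 4          # 8,294,400 bytes
small_pages = pages_needed(buffer_size, 4 * 1024)          # 2025
huge_pages = pages_needed(buffer_size, 2 * 1024 * 1024)    # 4
```

Under these assumptions the mapping drops from 2025 page-sized translations to 4, at the cost of rounding the allocation up to a huge-page boundary.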