@mildsunrise
Last active April 9, 2025 12:35

Usage of virglrenderer

Introduction

virglrenderer is a library that gives emulators the necessary tools to implement a virtio-gpu device, in particular one with 3D support. See capability sets below for a summary of the APIs virglrenderer can implement for the guest. It directly implements the logic behind some 3D commands like GET_CAPSET, CTX_CREATE, SUBMIT_3D and RESOURCE_CREATE_BLOB. Though it closely follows the semantics of virtio-gpu in most cases, it is in theory independent of virtio or any other transport.

The main user is qemu, but there also appears to be a standalone virtio-gpu vhost-user implementation in qemu/contrib that uses virglrenderer.

virglrenderer's public header is at src/virglrenderer.h. This document attempts to outline the public API and contract, but you should treat the header as the authoritative source of truth: the semantics described here could be slightly wrong or have changed, as the API is still evolving. A small fraction of the APIs described below are labeled unstable and subject to breakage; you'll need to define VIRGL_RENDERER_UNSTABLE_APIS before including virglrenderer.h to access them (alternatively, enable unstable-apis when building to have that define added to the pkg-config cflags).

The public API is not thread safe and must be called from a single thread, and callbacks (with the exception of async fence callbacks, described below) will not be invoked from other threads.

Renderer lifecycle

virglrenderer uses global state, so only one instance can operate in a process. Before calling any other method, initialize the renderer by calling virgl_renderer_init with the following parameters:

  • A void * cookie that will be passed to callbacks.
  • Set of global flags, discussed throughout the document. Most allow enabling or disabling features exposed to the guest (see capability sets), others tweak the synchronization (see fencing), and others manage access to the underlying GPU resources (see GPU resource management).
  • Set of optional callbacks. Some of them allow virglrenderer to notify you of events (see fencing), and others allow you to supply resources to virglrenderer yourself, such as setting up the GL context or opening a DRM node (see GPU resource management). The struct is versioned to allow new fields to be added without breaking ABI. virglrenderer keeps a reference to this struct, so it must live until you call virgl_renderer_cleanup.

Like most functions, the return value is 0 on success and an errno code on failure. Example of use:

void *cookie = my_state;
int ret;

int virgl_flags = VIRGL_RENDERER_USE_EGL | VIRGL_RENDERER_USE_SURFACELESS;

// Static storage, so the struct outlives this function: virglrenderer
// keeps a pointer to it until virgl_renderer_cleanup is called.
static struct virgl_renderer_callbacks virgl_cbs;
memset(&virgl_cbs, 0, sizeof(virgl_cbs));
virgl_cbs.version = VIRGL_RENDERER_CALLBACKS_VERSION;
// TODO: set callbacks here

if ((ret = virgl_renderer_init(cookie, virgl_flags, &virgl_cbs)) != 0) {
    fprintf(stderr, "failed to initialize virgl renderer: %s\n", strerror(ret));
    return 1;
}

You can now call the rest of the methods as needed. It's important to call the non-blocking method virgl_renderer_poll() periodically, in order to carry out work such as checking for completed fences (see next section).

When the virtual GPU stops operating, call virgl_renderer_cleanup(NULL) to clean up resources (its argument is unused). To reset to a clean post-initialization state (for example, when the device is reset by the guest) use virgl_renderer_reset().

Fencing

As usual for 3D APIs, virglrenderer fulfills commands asynchronously. Methods such as virgl_renderer_submit_cmd() only perform some validation, queue the command for execution and return immediately. To facilitate synchronization, virglrenderer exposes an API to create fences, i.e. requests to be notified when all GPU operations (up to the point at which the fence was created) have been carried out.

When initializing virglrenderer, supply a write_fence callback. Then to request a fence, call virgl_renderer_create_fence(fence_id, ctx_id) where fence_id is a numeric identifier you can freely choose that will be passed back to the write_fence callback to allow discerning which of the pending fences is being notified. For historic / compatibility purposes, the second argument is unused and the first argument is cast to uint32_t, which is what the callback receives:

void write_fence(void *cookie, uint32_t fence);

Fences are notified in the same order they were created. Note however that if several fences have been fulfilled by the time virglrenderer checks for them, the callback will only be called once, with the identifier of the last of these fences in creation order.
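Because one notification can stand for several fences, the emulator has to retire every pending fence up to the notified one. The following is a minimal sketch of that bookkeeping, not code from virglrenderer or qemu: struct fence_tracker and tracker_add are hypothetical names, and the tracker is assumed to be what you passed as the cookie. Only the write_fence signature comes from the text above.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 64

/* Hypothetical emulator-side state, reached through the init cookie. */
struct fence_tracker {
    uint32_t pending[MAX_PENDING]; /* fence IDs in creation order */
    size_t count;                  /* number of fences still pending */
    size_t retired;                /* total number of fences completed */
};

/* Record a fence right before requesting it. Returns 0, or -1 if full. */
int tracker_add(struct fence_tracker *t, uint32_t fence_id)
{
    if (t->count == MAX_PENDING)
        return -1;
    t->pending[t->count++] = fence_id;
    return 0;
}

/* Same signature as the write_fence callback. */
void write_fence(void *cookie, uint32_t fence)
{
    struct fence_tracker *t = cookie;
    size_t n = 0;

    /* Locate the notified fence in creation order. */
    while (n < t->count && t->pending[n] != fence)
        n++;
    if (n == t->count)
        return; /* not a fence we know about: ignore it */
    n++;        /* the notified fence is retired as well */

    /* A real emulator would signal the guest for pending[0..n-1] here. */
    t->retired += n;
    for (size_t i = n; i < t->count; i++)
        t->pending[i - n] = t->pending[i];
    t->count -= n;
}
```

Matching by position in the queue rather than by numeric comparison avoids assuming anything about how fence IDs are chosen.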

Per-context fences

The previous API registers global fences, i.e. fences that cover all GPU commands in flight. The virgl_renderer_context_create_fence() method allows registering fences on a specific context (ctx_id), and on a specific timeline within that context (ring_idx). Fences created this way are notified through a different write_context_fence callback, which also gets passed these 2 parameters verbatim:

void write_context_fence(void *cookie, uint32_t ctx_id, uint32_t ring_idx, uint64_t fence_id);

Commands can then be submitted on a specific timeline (identified by a ctx_id / ring_idx tuple), and this allows creating fences that only wait for a subset of the in-flight commands. Fences will only be delivered in order relative to fences on the same timeline.

Being a newer API, virgl_renderer_context_create_fence() has an additional flags parameter, which currently only accepts VIRGL_RENDERER_FENCE_FLAG_MERGEABLE. This flag is implied in the API for global fences, and as explained above, it allows notifications for multiple fences (on the same timeline) to be coalesced into a single notification for the newer fence.
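As a rough illustration of per-timeline tracking, the sketch below keeps the newest signaled fence for each (ctx_id, ring_idx) pair. The table, timeline_find and fence_is_done are hypothetical helpers; only the callback signature is taken from the text. It also assumes the emulator hands out increasing fence_id values on each timeline, which virglrenderer itself does not require.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TIMELINES 16

/* Hypothetical bookkeeping for one (ctx_id, ring_idx) timeline. */
struct timeline {
    uint32_t ctx_id;
    uint32_t ring_idx;
    uint64_t last_signaled; /* newest fence_id reported on this timeline */
    int in_use;
};

struct timeline_table {
    struct timeline slots[MAX_TIMELINES];
};

/* Look up a timeline; optionally claim a free slot when it is missing. */
struct timeline *timeline_find(struct timeline_table *tab, uint32_t ctx_id,
                               uint32_t ring_idx, int create)
{
    struct timeline *free_slot = NULL;

    for (size_t i = 0; i < MAX_TIMELINES; i++) {
        struct timeline *tl = &tab->slots[i];
        if (tl->in_use && tl->ctx_id == ctx_id && tl->ring_idx == ring_idx)
            return tl;
        if (!tl->in_use && free_slot == NULL)
            free_slot = tl;
    }
    if (!create || free_slot == NULL)
        return NULL;
    free_slot->ctx_id = ctx_id;
    free_slot->ring_idx = ring_idx;
    free_slot->last_signaled = 0;
    free_slot->in_use = 1;
    return free_slot;
}

/* Same signature as the write_context_fence callback. */
void write_context_fence(void *cookie, uint32_t ctx_id, uint32_t ring_idx,
                         uint64_t fence_id)
{
    struct timeline *tl = timeline_find(cookie, ctx_id, ring_idx, 1);
    if (tl != NULL)
        tl->last_signaled = fence_id;
}

/* Assumes fence IDs increase on each timeline (our own convention). */
int fence_is_done(struct timeline_table *tab, uint32_t ctx_id,
                  uint32_t ring_idx, uint64_t fence_id)
{
    struct timeline *tl = timeline_find(tab, ctx_id, ring_idx, 0);
    return tl != NULL && tl->last_signaled >= fence_id;
}
```

A notification on one timeline says nothing about another, which is why the lookup keys on both values instead of ctx_id alone.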

Thread synchronization

As explained before, virglrenderer requires virgl_renderer_poll() to be called often to check for expired fences and retire them (which involves performing some context-specific work and invoking the user callback).

Passing the VIRGL_RENDERER_THREAD_SYNC flag causes some of the work of virgl_renderer_poll(), in particular checking for fulfilled fences, to be offloaded to a set of separate (persistent) threads that virglrenderer spins up when initializing. These are called "sync threads" and every enabled component (see capability sets) typically has one.

When this feature is in effect, an eventfd is also created which can be obtained by calling virgl_renderer_get_poll_fd(). Upon detecting completed fences, sync threads write to this FD to notify you that you should call virgl_renderer_poll() in the main thread to retire them. Therefore, when this feature is in effect you don't need to periodically call virgl_renderer_poll(); instead poll that FD for reading (POLLIN) and call virgl_renderer_poll() when POLLIN is asserted. virgl_renderer_poll() will deassert the event, so once it has finished you can keep polling for it.

This feature is only a hint so the emulator knows when to call virgl_renderer_poll(): passing the VIRGL_RENDERER_THREAD_SYNC flag does not guarantee it will be enabled (it could fail to enable, or the environment variable VIRGL_DISABLE_MT could've been used to force it off). If the feature isn't in effect, virgl_renderer_get_poll_fd() will return -1 and the emulator should fall back to calling virgl_renderer_poll() periodically.

Ownership of the FD returned by virgl_renderer_get_poll_fd() remains with virglrenderer, which will close it when cleaning up.
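One iteration of an event loop that handles both cases (FD available, or fallback to periodic polling) could look roughly like the sketch below. To keep it buildable without the library, the renderer's poll function is passed in as a pointer and demo_renderer_poll is a stand-in; in a real emulator you would pass virgl_renderer_poll and the value returned by virgl_renderer_get_poll_fd(). The function name renderer_wait_once is made up for this example.

```c
#include <poll.h>
#include <stddef.h>

/* Stand-in for virgl_renderer_poll(), only here to exercise the sketch. */
int demo_poll_calls;
void demo_renderer_poll(void) { demo_poll_calls++; }

/*
 * One iteration of the emulator's event loop.
 * poll_fd is the value obtained from virgl_renderer_get_poll_fd(), or -1.
 * Returns 1 if the renderer was polled, 0 if not, -1 if poll() failed.
 */
int renderer_wait_once(int poll_fd, int timeout_ms,
                       void (*renderer_poll)(void))
{
    if (poll_fd < 0) {
        /* Thread sync is not in effect: wait out the interval, then
         * poll the renderer unconditionally. */
        poll(NULL, 0, timeout_ms);
        renderer_poll();
        return 1;
    }

    struct pollfd pfd = { .fd = poll_fd, .events = POLLIN };
    int ret = poll(&pfd, 1, timeout_ms);
    if (ret < 0)
        return -1;
    if (ret > 0 && (pfd.revents & POLLIN)) {
        renderer_poll(); /* retires fences and deasserts the event */
        return 1;
    }
    return 0; /* timed out with nothing to do */
}
```

In practice the FD would usually be registered with the emulator's existing main loop rather than polled on its own; the standalone poll() call just keeps the example short.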

Warning: VIRGL_RENDERER_THREAD_SYNC is (at least when VirGL is enabled) currently broken when used together with virgl_renderer_reset(). Calling it when thread synchronization is active will leave virglrenderer in an inconsistent state, so if you want to use VIRGL_RENDERER_THREAD_SYNC, you can call virgl_renderer_cleanup() + virgl_renderer_init() instead of virgl_renderer_reset() as a workaround.

Async fence callbacks

Note that even when thread synchronization is in effect, virgl_renderer_poll() still has to check for notifications from sync threads and retire any expired fences. The VIRGL_RENDERER_ASYNC_FENCE_CB initialization flag (which only has effect when thread synchronization is active, as indicated by virgl_renderer_get_poll_fd()) allows callbacks to be dispatched directly from the sync threads rather than by virgl_renderer_poll() on the main thread, which in many cases allows notifications to be delivered without delay.

Note that invoking the user callback may only be part of the work involved in retiring a fence; there may be other work that needs to be carried out in the main thread, sometimes as a prerequisite to dispatching the user callback. Thus even when using VIRGL_RENDERER_ASYNC_FENCE_CB you must still call virgl_renderer_poll() when needed (as signalled by virgl_renderer_get_poll_fd()). So from the user's perspective VIRGL_RENDERER_THREAD_SYNC | VIRGL_RENDERER_ASYNC_FENCE_CB is used just like VIRGL_RENDERER_THREAD_SYNC but with the extra requirement that fence callbacks must be thread safe, as virglrenderer may invoke them from any thread (possibly concurrently).
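A callback that only publishes the newest completed fence can be made thread safe with C11 atomics, as in this sketch. struct fence_state is a hypothetical piece of emulator state passed as the cookie, and the "only move forward" logic assumes fence IDs increase and do not wrap, which is a convention of this example rather than something virglrenderer enforces.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared state, reached through the init cookie. */
struct fence_state {
    _Atomic uint32_t last_completed;
};

/* Same signature as write_fence; may run concurrently on sync threads. */
void write_fence(void *cookie, uint32_t fence)
{
    struct fence_state *st = cookie;
    uint32_t seen = atomic_load(&st->last_completed);

    /* Only move forward: a racing callback may already have stored a
     * newer ID. A failed exchange refreshes `seen`, so just retry. */
    while (fence > seen &&
           !atomic_compare_exchange_weak(&st->last_completed, &seen, fence))
        ;
}
```

Anything heavier than this, such as walking a queue of pending requests, would need a mutex or a hand-off to the main thread instead.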

Capability sets

A single virtio-gpu device can offer support for more than one GPU API (see VIRTIO_GPU_CMD_GET_CAPSET_INFO for officially allocated capset IDs), and through the VIRTIO_GPU_F_CONTEXT_INIT feature it allows contexts of different APIs to be created and used concurrently. Though virglrenderer was originally created to implement the VirGL interface, it now offers other APIs if support for them is compiled at build time and the appropriate flags are passed to virgl_renderer_init().

Supported capabilities to date are:

  • VirGL: Exposes OpenGL through encoding / semantics modelled around Gallium driver commands, making it easier to implement a Mesa driver for the guest (which is the VirGL Mesa driver). This is enabled by default, and can't be left out of the build, but can be disabled at runtime through the VIRGL_RENDERER_NO_VIRGL flag. It's exposed through the VIRTIO_GPU_CAPSET_VIRGL and VIRTIO_GPU_CAPSET_VIRGL2 virtio-gpu capability sets.

    • Video: virglrenderer can also expose accelerated video decoding / encoding through VA-API. This isn't a separate virtio-gpu capability set, instead it's carried as an optional part of VirGL (support for it is advertised inside the VIRGL capability blob, see below). This is currently an unstable API. Pass -Dvideo=enabled when building, and flag VIRGL_RENDERER_USE_VIDEO on initialization. See this comment for more info about the general architecture.
  • Venus: Exposes the Vulkan API. Pass -Dvenus=enabled (venus-validate is also relevant for development) when building, and the VIRGL_RENDERER_VENUS flag on initialization. It's exposed through the VIRTIO_GPU_CAPSET_VENUS virtio-gpu capability set.

  • DRM: Exposes low-level DRM operations directly. Pass -Ddrm=enabled when building, and the VIRGL_RENDERER_DRM flag on initialization. It's exposed through the VIRGL_RENDERER_CAPSET_DRM (not yet released in the virtio spec) virtio-gpu capability set. This is designed to host many platforms, but right now it only has experimental support for MSM chips (pass -Ddrm-msm-experimental=enabled).

To discover capabilities of a virtio-gpu device, the guest first asks for the capability sets it supports through a series of VIRTIO_GPU_CMD_GET_CAPSET_INFO commands. virglrenderer exposes this info to the emulator through the virgl_renderer_get_cap_set() method.

After enumerating the supported capability sets, the guest fetches each of them through the VIRTIO_GPU_CMD_GET_CAPSET command. The result is a blob of data (encoded in capset-specific ways) describing the precise capabilities supported by that API. virglrenderer exposes this info to the emulator through the virgl_renderer_fill_caps() method. For more details, see the virtio spec.
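The enumeration step can be sketched as follows. The query function is taken as a pointer whose shape is meant to match virgl_renderer_get_cap_set(set, &max_ver, &max_size), so that the example builds on its own; fake_get_cap_set and its values are invented for testing. Treating a reported size of 0 as "not supported" is an assumption you should check against the header.

```c
#include <stddef.h>
#include <stdint.h>

/* Intended to match the shape of virgl_renderer_get_cap_set(). */
typedef void (*get_cap_set_fn)(uint32_t set, uint32_t *max_ver,
                               uint32_t *max_size);

struct capset_info {
    uint32_t id;
    uint32_t max_version;
    uint32_t max_size;
};

/*
 * Probe candidate capset IDs and keep those reported with a nonzero
 * size. The capset index sent by the guest in GET_CAPSET_INFO can then
 * be used to index `out`. Returns the number of entries written.
 */
size_t enumerate_capsets(get_cap_set_fn get_cap_set,
                         const uint32_t *candidates, size_t n_candidates,
                         struct capset_info *out, size_t out_len)
{
    size_t n = 0;

    for (size_t i = 0; i < n_candidates && n < out_len; i++) {
        uint32_t ver = 0, size = 0;
        get_cap_set(candidates[i], &ver, &size);
        if (size == 0)
            continue; /* assumed to mean "not supported" */
        out[n++] = (struct capset_info){ candidates[i], ver, size };
    }
    return n;
}

/* Fake renderer for the sketch: pretends capsets 1 and 4 are enabled. */
void fake_get_cap_set(uint32_t set, uint32_t *max_ver, uint32_t *max_size)
{
    int supported = (set == 1 || set == 4);
    *max_ver = supported ? 1 : 0;
    *max_size = supported ? 64 : 0;
}
```

For each entry, the emulator would then allocate max_size bytes and have virgl_renderer_fill_caps() fill them in when the guest issues GET_CAPSET.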

The capability set also defines the encoding of the opaque command stream that flows from guest to host through VIRTIO_GPU_CMD_SUBMIT_3D.

GPU resource management

VirGL has several modes of operation, which govern how it manages resources (context setup, buffer allocation, etc.):

  • Fully virglrenderer managed: This uses one of several "winsys" backends. The backend choice and its operation can be customized by both buildsystem flags and initialization flags. -Dplatforms=foo,bar allows choosing the platforms (glx, egl or auto) to obtain OpenGL access, and a specific one can be selected at runtime through the VIRGL_RENDERER_USE_EGL or VIRGL_RENDERER_USE_GLX initialization flags. Independently of the backend, if GBM is available at build time, it will be used for buffer allocation (see also the minigbm_allocation option). If EGL is chosen, VIRGL_RENDERER_USE_SURFACELESS can be passed to choose a surfaceless platform.

    • Sandboxed operation: If the process is sandboxed, the user should select EGL using the initialization flag above (to bypass discovery) and supply a get_drm_fd callback that will be invoked when virglrenderer needs to get an open file to the DRM render device node.
  • Custom backend: To bypass the provided backends and take control of the logic to manage the underlying API contexts, the user may supply the create_gl_context, destroy_gl_context, make_current (and optionally get_egl_display) callbacks.

Venus doesn't offer that control, but has a "proxy mode" where it will forward the commands to another process called the render server for its execution, to improve security. It needs no special build options, and can be enabled at runtime through the VIRGL_RENDERER_RENDER_SERVER flag. By default this makes virglrenderer spawn a render server automatically, taking care to pass it an IPC socket to communicate over. With sandboxed operation, the user can supply a get_server_fd callback that returns the FD of a socket connected to a running server. Proxy mode is currently unsupported by VirGL.

The code for the render server lives under the server directory, and supports several degrees of isolation among different contexts (see the render-server-worker option).

When not using the proxy backend, Venus and the rest of capabilities support sandboxing through the get_drm_fd callback, which is why that callback may get called multiple times.

Resource export

To export a resource, there's a variety of interfaces. First of all, you should query its information with virgl_renderer_resource_get_info[_ext]. It must be called with a zeroed struct. If the passed resource is untyped (e.g. a blob resource) then only fd will be populated. Note that the semantics of this FD are essentially opaque at this point, and it may not even be an FD but e.g. a GEM handle, see the comment on the definition of struct virgl_resource for more info. If the resource is typed (which, at least right now, effectively means "OpenGL 2D texture") the other fields will be populated, and fd may or may not hold a significant value. Of particular utility is tex_id, which is the GLuint ID of the texture. If EGL is being used, you may wrap it into an EGL image using eglCreateImage with the current context and EGL_GL_TEXTURE_2D.

If you need to export the texture into a DMA-BUF FD, for EGL you could use EGL_MESA_image_dma_buf_export on the wrapped image, and virglrenderer provides utility functions virgl_renderer_get_fd_for_texture[2] that do just that. See eglExportDMABUFImageMESA in that extension for info on parameters fd, strides and offset; in particular, they should be NULL or point to user-allocated arrays of planes items. Note that these utility functions take tex_id, not a resource handle. For non-EGL contexts these functions do not work, but the virgl_renderer_export_query request (see Execute protocol) can be used instead (this request in turn only works for GBM-allocated textures, there is an explicit comment directing users to virgl_renderer_get_fd_for_texture for EGL implementations).

To export untyped resources, use virgl_renderer_resource_export_blob. Unlike the fd returned on resource info, the FD created by this function has normalized / known semantics (returned at fd_type) and is owned by the caller.

Note: Beware of the alpha! Many systems treat formats that leave the alpha bits unused (like XB24) like their alpha-carrying equivalents (here, AB24). Even if the guest is creating an XB24 resource, exporting it from virglrenderer may give you an AB24 buffer (meaning that if the guest leaves the unused bits as zero, these will get interpreted into a transparent exported image).
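If you end up with CPU-mapped pixels and the consumer insists on reading the alpha channel, one blunt workaround is to force every pixel opaque. The helper below is a sketch written for this document, not part of virglrenderer; it assumes 4 bytes per pixel stored as R, G, B, A/X in memory (the AB24 / XB24 layout) and a stride given in bytes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Set the alpha (or unused) byte of every pixel to 0xFF.
 * Pixels are 4 bytes each, stored as R, G, B, A/X in memory.
 * Bytes between the end of a row and the next stride are left untouched.
 */
void force_opaque_ab24(uint8_t *pixels, size_t width, size_t height,
                       size_t stride)
{
    for (size_t y = 0; y < height; y++) {
        uint8_t *row = pixels + y * stride;
        for (size_t x = 0; x < width; x++)
            row[x * 4 + 3] = 0xFF;
    }
}
```

When the buffer is imported elsewhere instead of mapped, declaring it as XB24 on import (where the consumer lets you choose the format) avoids touching the pixels at all.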

Other APIs

Execute protocol

There is a generic protocol where virgl_renderer_execute is passed a buffer, which must contain at least a header (virgl_renderer_hdr) specifying version (stype_version, currently must be 0), request type (stype, one of the nonzero values in virgl_renderer_structure_type_v0) and size (must currently match the size of the passed buffer).

Each request type is paired with a concrete struct type, with in_ fields that must be populated by the caller and out_ fields that are populated by virglrenderer if virgl_renderer_execute is successful. The basic request is virgl_renderer_supported_structures, which can be used to discover the implemented request types (out_supported_structures_mask, an OR of values in virgl_renderer_structure_type_v0) for a given type version (in_stype_version).
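To make the framing concrete, here is a sketch that builds a "supported structures" request and applies the header rules listed above. The demo_ structs are local mirrors written for this example so that it builds without the library; their field order is an assumption, and real code should use the types and stype constants from virglrenderer.h and pass the buffer to virgl_renderer_execute.

```c
#include <stdint.h>
#include <string.h>

/* Local mirror of the request header; the field order is an assumption. */
struct demo_hdr {
    uint32_t stype;
    uint32_t stype_version;
    uint32_t size;
};

/* Local mirror of the "supported structures" request. */
struct demo_supported_structures {
    struct demo_hdr hdr;
    uint32_t in_stype_version;              /* filled in by the caller */
    uint32_t out_supported_structures_mask; /* filled in by the renderer */
};

/* Prepare a request: version 0, size equal to the whole buffer. */
void demo_init_request(struct demo_supported_structures *req, uint32_t stype)
{
    memset(req, 0, sizeof(*req));
    req->hdr.stype = stype;
    req->hdr.stype_version = 0;
    req->hdr.size = sizeof(*req);
    req->in_stype_version = 0;
}

/* Caller-side check of the header rules. Returns 1 if they hold. */
int demo_hdr_is_valid(const void *buf, uint32_t buf_size)
{
    struct demo_hdr hdr;

    if (buf_size < sizeof(hdr))
        return 0;
    memcpy(&hdr, buf, sizeof(hdr));
    return hdr.stype_version == 0 && hdr.stype != 0 && hdr.size == buf_size;
}
```

After a successful call, the out_ field would hold the mask of implemented request types, which you can test before issuing any other request.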

Logging

virglrenderer used to have an API for receiving log messages through a callback, virgl_set_debug_callback, which was deprecated when a better logging system (with multiple priorities) was introduced. A callback may now be set with virgl_set_log_callback, which allows passing user data and a free function. The callback now receives a formatted message and its priority. If no callback is set, log calls are short-circuited to no-ops.

@cRaZy-bisCuiT

cRaZy-bisCuiT commented Jan 3, 2024

Hi mildsunrise,
thanks a lot for this information! Do we still need those build flags for the most recent builds of VirGL, qemu, virtio or is this expected to just work? Which would be the minimum required versions if that is the case?

I'm talking about Venus and vaapi support especially.

@mildsunrise
Author

hi, this information is aimed more at developers who want to use virglrenderer in their code than at actual users. I'm afraid I can't provide much input on that front; my understanding is that qemu supplies the appropriate flags to build virglrenderer, but I'm possibly wrong

@cRaZy-bisCuiT

cRaZy-bisCuiT commented Jan 19, 2024

@mildsunrise Allright thank you! Gonna find that out myself somehow.

@invokermoon

invokermoon commented Feb 17, 2025

This article is excellent, and I have learned a lot from it. How should I understand ring_idx with fences? Does it mean that each OpenGL context has its own dedicated fence ring_idx, and that fences with different ring_idx values will not block each other?
If there are 2 OpenGL ES apps in the guest VM, does it mean the fences from APP1 and APP2 are independent and don't block each other?

@mildsunrise
Author

@invokermoon it is even more granular; the timeline of a fence is defined by the combination of ctx_id and ring_idx (that is, every context may have multiple timelines). fences on different timelines (for example, CTX1+RING1 and CTX1+RING2) do not block each other
