Skip to content

Instantly share code, notes, and snippets.

@thubble
Last active February 18, 2024 21:28
Show Gist options
  • Star 17 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save thubble/235806c4c64b159653de879173d24d9f to your computer and use it in GitHub Desktop.
Save thubble/235806c4c64b159653de879173d24d9f to your computer and use it in GitHub Desktop.
Patch to allow Chromium zero-copy VaapiVideoDecoder to work when va-api exports dma-buf planes as disjoint, such as with AMD Mesa
--- a/gpu/command_buffer/service/shared_image/external_vk_image_backing_factory.cc
+++ b/gpu/command_buffer/service/shared_image/external_vk_image_backing_factory.cc
@@ -257,7 +257,7 @@
return false;
}
-#if BUILDFLAG(IS_LINUX)
+#if 0
if (format.IsLegacyMultiplanar()) {
// ExternalVkImageBacking doesn't work properly with external sampler
// multi-planar formats on Linux, see https://crbug.com/1394888.
diff -ur a/gpu/vulkan/vulkan_image.h b/gpu/vulkan/vulkan_image.h
--- a/gpu/vulkan/vulkan_image.h
+++ b/gpu/vulkan/vulkan_image.h
@@ -206,6 +206,22 @@
std::vector<uint64_t> modifiers,
VkImageUsageFlags usage,
VkImageCreateFlags flags);
+ bool InitializeFromNonDisjointGpuMemoryBufferHandle(VulkanDeviceQueue* device_queue,
+ gfx::NativePixmapHandle& native_pixmap_handle,
+ const gfx::Size& size,
+ VkFormat format,
+ VkImageUsageFlags usage,
+ VkImageCreateFlags flags,
+ VkImageTiling image_tiling,
+ uint32_t queue_family_index);
+ bool InitializeFromDisjointGpuMemoryBufferHandle(VulkanDeviceQueue* device_queue,
+ gfx::NativePixmapHandle& native_pixmap_handle,
+ const gfx::Size& size,
+ VkFormat format,
+ VkImageUsageFlags usage,
+ VkImageCreateFlags flags,
+ VkImageTiling image_tiling,
+ uint32_t queue_family_index);
#endif
raw_ptr<VulkanDeviceQueue> device_queue_ = nullptr;
diff -ur a/gpu/vulkan/vulkan_image_linux.cc b/gpu/vulkan/vulkan_image_linux.cc
--- a/gpu/vulkan/vulkan_image_linux.cc
+++ b/gpu/vulkan/vulkan_image_linux.cc
@@ -22,6 +22,32 @@
} // namespace
namespace gpu {
+
+namespace {
+
+// TODO - very hacky!! There is an identical function in vulkan_memory.cc but it isn't exported.
+// Not sure if it's better to make it global or duplicate it?
+absl::optional<uint32_t> FindMemoryTypeIndexLinux(
+ VkPhysicalDevice physical_device,
+ const VkMemoryRequirements* requirements,
+ VkMemoryPropertyFlags flags) {
+ VkPhysicalDeviceMemoryProperties properties;
+ vkGetPhysicalDeviceMemoryProperties(physical_device, &properties);
+ constexpr uint32_t kMaxIndex = 31;
+ for (uint32_t i = 0; i <= kMaxIndex; i++) {
+ if (((1u << i) & requirements->memoryTypeBits) == 0) {
+ continue;
+ }
+ if ((properties.memoryTypes[i].propertyFlags & flags) != flags) {
+ continue;
+ }
+ return i;
+ }
+ NOTREACHED();
+ return absl::nullopt;
+}
+
+} // namespace
// static
std::unique_ptr<VulkanImage> VulkanImage::CreateWithExternalMemoryAndModifiers(
@@ -73,6 +99,32 @@
VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT);
}
+ // an image is disjoint if it has multiple planes and they don't all have the same fd.
+ disjoint_planes_ = false;
+ if (native_pixmap_handle.planes.size() > 1) {
+ auto fd0 = native_pixmap_handle.planes[0].fd.get();
+ for (const auto& plane : native_pixmap_handle.planes) {
+ if (plane.fd.get() != fd0) {
+ disjoint_planes_ = true;
+ break;
+ }
+ }
+ }
+
+ return disjoint_planes_
+ ? InitializeFromDisjointGpuMemoryBufferHandle(device_queue, native_pixmap_handle, size, format, usage, flags, image_tiling, queue_family_index)
+ : InitializeFromNonDisjointGpuMemoryBufferHandle(device_queue, native_pixmap_handle, size, format, usage, flags, image_tiling, queue_family_index);
+}
+
+bool VulkanImage::InitializeFromNonDisjointGpuMemoryBufferHandle(
+ VulkanDeviceQueue* device_queue,
+ gfx::NativePixmapHandle& native_pixmap_handle,
+ const gfx::Size& size,
+ VkFormat format,
+ VkImageUsageFlags usage,
+ VkImageCreateFlags flags,
+ VkImageTiling image_tiling,
+ uint32_t queue_family_index) {
auto& scoped_fd = native_pixmap_handle.planes[0].fd;
if (!scoped_fd.is_valid()) {
DLOG(ERROR) << "GpuMemoryBufferHandle doesn't have a valid fd.";
@@ -149,6 +201,252 @@
return result;
}
+bool VulkanImage::InitializeFromDisjointGpuMemoryBufferHandle(
+ VulkanDeviceQueue* device_queue,
+ gfx::NativePixmapHandle& native_pixmap_handle,
+ const gfx::Size& size,
+ VkFormat format,
+ VkImageUsageFlags usage,
+ VkImageCreateFlags flags,
+ VkImageTiling image_tiling,
+ uint32_t queue_family_index) {
+ DCHECK(!device_queue_);
+ DCHECK(image_ == VK_NULL_HANDLE);
+ for (int i = 0; i < 3; ++i) {
+ DCHECK(!memories_[i]);
+ }
+
+ plane_count_ = native_pixmap_handle.planes.size();
+
+ // TODO - creating a disjoint image with VK_IMAGE_TILING_OPTIMAL fails on RADV, and sometimes causes a GPU reset.
+ if (image_tiling == VK_IMAGE_TILING_OPTIMAL)
+ image_tiling = VK_IMAGE_TILING_LINEAR;
+
+ // TODO - Verify format support for VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT. See DmaBufImageSiblingVkLinux::initWithFormat() in ANGLE
+ void *create_info_pnext = nullptr;
+ VkImageDrmFormatModifierExplicitCreateInfoEXT explicit_create_drm_info {};
+ uint32_t plane_index = 0;
+ if (image_tiling == VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT) {
+ explicit_create_drm_info.sType = VK_STRUCTURE_TYPE_IMAGE_DRM_FORMAT_MODIFIER_EXPLICIT_CREATE_INFO_EXT;
+ explicit_create_drm_info.pNext = nullptr;
+ explicit_create_drm_info.drmFormatModifier = native_pixmap_handle.modifier;
+
+ for (const auto& plane : native_pixmap_handle.planes) {
+ VkSubresourceLayout plane_layout = {
+ .offset = plane.offset,
+ .size = 0,
+ .rowPitch = plane.stride,
+ .arrayPitch = 0,
+ .depthPitch = 0,
+ };
+ layouts_[plane_index++] = plane_layout;
+ }
+
+ explicit_create_drm_info.drmFormatModifierPlaneCount = plane_count_;
+ explicit_create_drm_info.pPlaneLayouts = layouts_.data();
+
+ create_info_pnext = &explicit_create_drm_info;
+ }
+
+ device_queue_ = device_queue;
+ VkImageCreateFlags create_flags = flags | VK_IMAGE_CREATE_DISJOINT_BIT;
+
+ if (ycbcr_info_) {
+ ycbcr_info_->format_features |= (VK_FORMAT_FEATURE_DISJOINT_BIT | VK_FORMAT_FEATURE_SAMPLED_IMAGE_YCBCR_CONVERSION_LINEAR_FILTER_BIT);
+
+ // From ANGLE: ImageHelper::initExternal()
+ // The Vulkan spec states: If sampler is used and the VkFormat of the image is a
+ // multi-planar format, the image must have been created with
+ // VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT
+ create_flags |= VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT;
+ }
+
+ create_info_ = {
+ .sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
+ .pNext = create_info_pnext,
+ .flags = create_flags,
+ .imageType = VK_IMAGE_TYPE_2D,
+ .format = format,
+ .extent = {static_cast<uint32_t>(size.width()),
+ static_cast<uint32_t>(size.height()), 1},
+ .mipLevels = 1,
+ .arrayLayers = 1,
+ .samples = VK_SAMPLE_COUNT_1_BIT,
+ .tiling = image_tiling,
+ .usage = usage,
+ .sharingMode = VK_SHARING_MODE_EXCLUSIVE,
+ .queueFamilyIndexCount = 0,
+ .pQueueFamilyIndices = nullptr,
+ .initialLayout = VK_IMAGE_LAYOUT_UNDEFINED,
+ };
+
+ VkDevice vk_device = device_queue->GetVulkanDevice();
+
+ VkResult result = vkCreateImage(vk_device, &create_info_,
+ nullptr /* pAllocator */, &image_);
+ create_info_.pNext = nullptr;
+
+ if (result != VK_SUCCESS) {
+ DLOG(ERROR) << "vkCreateImage failed result:" << result;
+ device_queue_ = nullptr;
+ return false;
+ }
+
+ void *mem_fd_req_pnext = nullptr;
+ VkExportMemoryAllocateInfo export_memory_info = {
+ .sType = VK_STRUCTURE_TYPE_EXPORT_MEMORY_ALLOCATE_INFO_KHR,
+ };
+ VkExternalMemoryProperties external_memory_properties;
+ result = QueryVkExternalMemoryProperties(
+ device_queue->GetVulkanPhysicalDevice(), format, VK_IMAGE_TYPE_2D,
+ image_tiling, usage, create_flags,
+ VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT,
+ &external_memory_properties);
+ if (result == VK_SUCCESS) {
+ export_memory_info.handleTypes =
+ external_memory_properties.compatibleHandleTypes;
+
+ mem_fd_req_pnext = &export_memory_info;
+ }
+
+ // From ANGLE: ImageHelper::initExternalMemory(), but using aspect planes instead of aspect memory planes.
+ constexpr VkImageAspectFlagBits kImagePlaneAspects[3] = {
+ VK_IMAGE_ASPECT_PLANE_0_BIT,
+ VK_IMAGE_ASPECT_PLANE_1_BIT,
+ VK_IMAGE_ASPECT_PLANE_2_BIT,
+ };
+
+ VkBindImagePlaneMemoryInfo bind_plane_infos[3] {};
+ VkBindImageMemoryInfo bind_infos[3] {};
+ for (plane_index = 0; plane_index < plane_count_; ++plane_index) {
+ auto& plane = native_pixmap_handle.planes[plane_index];
+
+ auto& scoped_fd = plane.fd;
+ if (!scoped_fd.is_valid()) {
+ DLOG(ERROR) << "GpuMemoryBufferHandle doesn't have a valid fd.";
+ return false;
+ }
+
+ VkImagePlaneMemoryRequirementsInfo image_plane_info = {
+ .sType = VK_STRUCTURE_TYPE_IMAGE_PLANE_MEMORY_REQUIREMENTS_INFO,
+ .pNext = nullptr,
+ .planeAspect = kImagePlaneAspects[plane_index],
+ };
+ VkImageMemoryRequirementsInfo2 image_plane_req = {
+ .sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_REQUIREMENTS_INFO_2,
+ .pNext = &image_plane_info,
+ .image = image_,
+ };
+
+ VkMemoryDedicatedRequirements ded_req = {
+ .sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_REQUIREMENTS,
+ };
+ VkImagePlaneMemoryRequirementsInfo image_memory_plane_info = {
+ .sType = VK_STRUCTURE_TYPE_IMAGE_PLANE_MEMORY_REQUIREMENTS_INFO,
+ .pNext = &ded_req,
+ .planeAspect = kImagePlaneAspects[plane_index],
+ };
+ VkMemoryRequirements2 image_memory_plane_req = {
+ .sType = VK_STRUCTURE_TYPE_MEMORY_REQUIREMENTS_2,
+ .pNext = &image_memory_plane_info,
+ };
+
+ VkMemoryFdPropertiesKHR image_mem_fd_properties = {
+ .sType = VK_STRUCTURE_TYPE_MEMORY_FD_PROPERTIES_KHR,
+ };
+ // TODO - we re-take ownership of the fd from Vulkan if any subsequent calls in this loop fail,
+ // but do we need to keep track of all released fd's and reclaim them if any subsequent loop
+ // iteration (or the rest of the method) fails?
+ int memory_fd = scoped_fd.release();
+ VkImportMemoryFdInfoKHR mem_fd_req = {
+ .sType = VK_STRUCTURE_TYPE_IMPORT_MEMORY_FD_INFO_KHR,
+ .pNext = mem_fd_req_pnext,
+ .handleType = VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT,
+ .fd = memory_fd,
+ };
+ VkMemoryDedicatedAllocateInfo ded_alloc = {
+ .sType = VK_STRUCTURE_TYPE_MEMORY_DEDICATED_ALLOCATE_INFO,
+ .pNext = &mem_fd_req,
+ .image = image_plane_req.image,
+ };
+
+ result = vkGetMemoryFdPropertiesKHR(vk_device, VK_EXTERNAL_MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT, mem_fd_req.fd, &image_mem_fd_properties);
+ if (result != VK_SUCCESS) {
+ DLOG(ERROR) << "vkGetMemoryFdPropertiesKHR failed result:" << result;
+ scoped_fd.reset(memory_fd);
+ Destroy();
+ return false;
+ }
+
+ vkGetImageMemoryRequirements2(vk_device, &image_plane_req, &image_memory_plane_req);
+
+ /* Only a single bit must be set, not a range, and it must match */
+ image_memory_plane_req.memoryRequirements.memoryTypeBits = image_mem_fd_properties.memoryTypeBits;
+
+ auto index =
+ FindMemoryTypeIndexLinux(device_queue->GetVulkanPhysicalDevice(), &image_memory_plane_req.memoryRequirements,
+ VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
+
+ if (!index) {
+ DLOG(ERROR) << "Cannot find validate memory type index.";
+ scoped_fd.reset(memory_fd);
+ Destroy();
+ return false;
+ }
+
+ auto memory_type_index = index.value();
+ VkDeviceSize device_size = image_memory_plane_req.memoryRequirements.size;
+
+ VkMemoryAllocateInfo alloc_info = {
+ .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
+ .pNext = (ded_req.prefersDedicatedAllocation || ded_req.requiresDedicatedAllocation) ? &ded_alloc : ded_alloc.pNext,
+ .allocationSize = device_size,
+ .memoryTypeIndex = memory_type_index,
+ };
+
+ VkDeviceMemory device_memory = VK_NULL_HANDLE;
+
+ result = vkAllocateMemory(vk_device, &alloc_info, NULL, &device_memory);
+ if (result != VK_SUCCESS) {
+ DLOG(ERROR) << "vkAllocateMemory failed result:" << result;
+ scoped_fd.reset(memory_fd);
+ Destroy();
+ return false;
+ }
+
+ memories_[plane_index] = VulkanMemory::Create(device_queue, device_memory, device_size, memory_type_index);
+
+ VkBindImagePlaneMemoryInfo plane_memory_info = {
+ .sType = VK_STRUCTURE_TYPE_BIND_IMAGE_PLANE_MEMORY_INFO,
+ .pNext = nullptr,
+ .planeAspect = kImagePlaneAspects[plane_index],
+ };
+ bind_plane_infos[plane_index] = plane_memory_info;
+
+ VkBindImageMemoryInfo memory_info = {
+ .sType = VK_STRUCTURE_TYPE_BIND_IMAGE_MEMORY_INFO,
+ .pNext = &bind_plane_infos[plane_index],
+ .image = image_,
+ .memory = device_memory,
+ // For VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT, we specified the plane offsets when creating the image.
+ .memoryOffset = image_tiling == VK_IMAGE_TILING_DRM_FORMAT_MODIFIER_EXT ? 0 : plane.offset,
+ };
+ bind_infos[plane_index] = memory_info;
+ }
+
+ result = vkBindImageMemory2(vk_device, plane_count_, bind_infos);
+ if (result != VK_SUCCESS) {
+ DLOG(ERROR) << "Failed to bind memory to external VkImage: " << result;
+ Destroy();
+ return false;
+ }
+
+ // TODO - do we need to get the subresource layouts for VK_IMAGE_TILING_LINEAR,
+ // like at the end of VulkanImage::Initialize()?
+
+ return true;
+}
+
bool VulkanImage::InitializeWithExternalMemoryAndModifiers(
VulkanDeviceQueue* device_queue,
const gfx::Size& size,
diff -ur a/media/gpu/vaapi/vaapi_wrapper.cc b/media/gpu/vaapi/vaapi_wrapper.cc
--- a/media/gpu/vaapi/vaapi_wrapper.cc
+++ b/media/gpu/vaapi/vaapi_wrapper.cc
@@ -2615,14 +2615,19 @@
nullptr);
}
- // We only support one bo containing all the planes. The fd should be owned by
- // us: per va/va.h, "the exported handles are owned by the caller."
- //
- // TODO(crbug.com/974438): support multiple buffer objects so that this can
- // work in AMD.
- CHECK_EQ(descriptor.num_objects, 1u)
- << "Only surface descriptors with one bo are supported";
+ // an image is disjoint if it has multiple buffer objects and they don't all have the same fd.
+ auto fd0 = descriptor.objects[0].fd;
+ bool isDisjoint = false;
+ if (descriptor.num_objects > 1u) {
+ for (auto i = 1u; i < descriptor.num_objects; ++i) {
+ if (descriptor.objects[i].fd != fd0) {
+ isDisjoint = true;
+ break;
+ }
+ }
+ }
- base::ScopedFD bo_fd(descriptor.objects[0].fd);
+
+ base::ScopedFD bo_fd(fd0);
const uint64_t bo_modifier = descriptor.objects[0].drm_format_modifier;
// Translate the pixel format to a gfx::BufferFormat.
@@ -2656,9 +2661,22 @@
// specified, each layer should contain one plane.
DCHECK_EQ(1u, descriptor.layers[layer].num_planes);
- auto plane_fd = base::ScopedFD(
+ auto plane_raw_fd = fd0;
+ if (isDisjoint) {
+ // HACK - for disjoint (multi-object) descriptors, the first layer might not use the first buffer object.
+ // So we free it in the first loop iteration regardless.
+ // Could we maybe avoid taking the fd before the loop in the first place?
+ if (layer == 0)
+ bo_fd.release();
+ auto obj_index = descriptor.layers[layer].object_index[0];
+ plane_raw_fd = descriptor.objects[obj_index].fd;
+ }
+ else {
+ plane_raw_fd =
layer == 0 ? bo_fd.release()
- : HANDLE_EINTR(dup(handle.planes[0].fd.get())));
+ : HANDLE_EINTR(dup(handle.planes[0].fd.get()));
+ }
+ auto plane_fd = base::ScopedFD(plane_raw_fd);
PCHECK(plane_fd.is_valid());
constexpr uint64_t kZeroSizeToPreventMapping = 0u;
handle.planes.emplace_back(
@@ -2679,6 +2697,7 @@
exported_pixmap->va_surface_resolution =
gfx::Size(base::checked_cast<int>(descriptor.width),
base::checked_cast<int>(descriptor.height));
+ // TODO: This relies on there being only one bo, but it seems to be only used by image decode?
exported_pixmap->byte_size =
base::strict_cast<size_t>(descriptor.objects[0].size);
if (!gfx::Rect(exported_pixmap->va_surface_resolution)
diff --git a/third_party/skia/src/gpu/ganesh/vk/GrVkCaps.cpp b/third_party/skia/src/gpu/ganesh/vk/GrVkCaps.cpp
--- a/third_party/skia/src/gpu/ganesh/vk/GrVkCaps.cpp
+++ b/third_party/skia/src/gpu/ganesh/vk/GrVkCaps.cpp
@@ -1428,11 +1428,11 @@ static bool backend_format_is_external(const GrBackendFormat& format) {
SkASSERT(ycbcrInfo);
// All external formats have a valid ycbcrInfo used for sampling and a non zero external format.
- if (ycbcrInfo->isValid() && ycbcrInfo->fExternalFormat != 0) {
+ if (ycbcrInfo->isValid() /*&& ycbcrInfo->fExternalFormat != 0*/) {
#ifdef SK_DEBUG
VkFormat vkFormat;
SkAssertResult(format.asVkFormat(&vkFormat));
- SkASSERT(vkFormat == VK_FORMAT_UNDEFINED);
+ //SkASSERT(vkFormat == VK_FORMAT_UNDEFINED);
#endif
return true;
}
@@ -1654,12 +1654,16 @@ bool GrVkCaps::onAreColorTypeAndFormatCompatible(GrColorType ct,
if (ycbcrInfo->isValid() && !skgpu::VkFormatNeedsYcbcrSampler(vkFormat)) {
// Format may be undefined for external images, which are required to have YCbCr conversion.
- if (VK_FORMAT_UNDEFINED == vkFormat && ycbcrInfo->fExternalFormat != 0) {
+ if (VK_FORMAT_UNDEFINED == vkFormat /*&& ycbcrInfo->fExternalFormat != 0*/) {
return true;
}
return false;
}
+ // HACK!!! Skia supports kRGB_888x as long as YCbCr conversion is specified.
+ if (ycbcrInfo->isValid() && ct == GrColorType::kRGB_888x)
+ return true;
+
const auto& info = this->getFormatInfo(vkFormat);
for (int i = 0; i < info.fColorTypeInfoCount; ++i) {
if (info.fColorTypeInfos[i].fColorType == ct) {
@@ -1731,7 +1735,7 @@ skgpu::Swizzle GrVkCaps::onGetReadSwizzle(const GrBackendFormat& format,
SkAssertResult(format.asVkFormat(&vkFormat));
const auto* ycbcrInfo = format.getVkYcbcrConversionInfo();
SkASSERT(ycbcrInfo);
- if (ycbcrInfo->isValid() && ycbcrInfo->fExternalFormat != 0) {
+ if (ycbcrInfo->isValid() /*&& ycbcrInfo->fExternalFormat != 0*/) {
// We allow these to work with any color type and never swizzle. See
// onAreColorTypeAndFormatCompatible.
return skgpu::Swizzle{"rgba"};
diff --git a/third_party/skia/src/gpu/ganesh/vk/GrVkGpu.cpp b/third_party/skia/src/gpu/ganesh/vk/GrVkGpu.cpp
--- a/third_party/skia/src/gpu/ganesh/vk/GrVkGpu.cpp
+++ b/third_party/skia/src/gpu/ganesh/vk/GrVkGpu.cpp
@@ -1329,7 +1329,7 @@ static bool check_tex_image_info(const GrVkCaps& caps, const GrVkImageInfo& info
return false;
}
- if (info.fYcbcrConversionInfo.isValid() && info.fYcbcrConversionInfo.fExternalFormat != 0) {
+ if (info.fYcbcrConversionInfo.isValid() /*&& info.fYcbcrConversionInfo.fExternalFormat != 0*/) {
return true;
}
if (info.fImageTiling == VK_IMAGE_TILING_OPTIMAL) {
@thubble
Copy link
Author

thubble commented Sep 21, 2023

Updated for Chromium 117.

@darkbasic
Copy link

Despite using the patch with the following config

--ozone-platform-hint=auto
--ignore-gpu-blocklist
--enable-zero-copy
--disable-gpu-driver-bug-workarounds
--enable-features=CanvasOopRasterization,RawDraw,VaapiVideoDecodeLinuxGL

I get this error:

Cannot select DecryptingVideoDecoder for video decoding
VideoDecoderPipeline |decoder_| Initialize() successful
Selected VaapiVideoDecoder for video decoding, config: codec: vp9, profile: vp9 profile0, level: not available, alpha_mode: is_opaque, coded size: [1920,1080], visible rect: [0,0,1920,1080], natural size: [1920,1080], has extra data: false, encryption scheme: Unencrypted, rotation: 0°, flipped: 0, color space: {primaries:BT709, transfer:BT709, matrix:BT709, range:LIMITED}
VaapiVideoDecoder: failed Initialize()ing the frame pool
video decoder fallback after initial decode error.

If I add Vulkan to --enable-features= or if I use VaapiVideoDecode instead of VaapiVideoDecodeLinuxGL I don't get the error but I get Cannot select VaapiVideoDecoder for video decoding. Either way it doesn't work. AMD Phoenix APU.

@PF4Public
Copy link

E.g. I'm using Gentoo and keep this patch in /etc/portage/patches/www-client/ungoogled-chromium.

Wow, looks like I'm late to the party. @thubble You could've pinged me or something to integrate your patch into ungoogled-chromium for Gentoo! :D

Is it Wayland specific or would it work under X11 as well?

@darkbasic
Copy link

darkbasic commented Oct 3, 2023

@PF4Public I don't remember if you already cherry-picked it (but I guess not since you use X11) but you also need an ozone-vaapi patch to use it under wayland: https://aur.archlinux.org/packages/chromium-wayland-vaapi
Should be worth shipping that one as well, it used to apply cleanly for many releases before I had to rebase it.

@thubble
Copy link
Author

thubble commented Oct 3, 2023

Despite using the patch with the following config

--ozone-platform-hint=auto
--ignore-gpu-blocklist
--enable-zero-copy
--disable-gpu-driver-bug-workarounds
--enable-features=CanvasOopRasterization,RawDraw,VaapiVideoDecodeLinuxGL

I get this error:

Cannot select DecryptingVideoDecoder for video decoding
VideoDecoderPipeline |decoder_| Initialize() successful
Selected VaapiVideoDecoder for video decoding, config: codec: vp9, profile: vp9 profile0, level: not available, alpha_mode: is_opaque, coded size: [1920,1080], visible rect: [0,0,1920,1080], natural size: [1920,1080], has extra data: false, encryption scheme: Unencrypted, rotation: 0°, flipped: 0, color space: {primaries:BT709, transfer:BT709, matrix:BT709, range:LIMITED}
VaapiVideoDecoder: failed Initialize()ing the frame pool
video decoder fallback after initial decode error.

If I add Vulkan to --enable-features= or if I use VaapiVideoDecode instead of VaapiVideoDecodeLinuxGL I don't get the error but I get Cannot select VaapiVideoDecoder for video decoding. Either way it doesn't work. AMD Phoenix APU.

VaapiVideoDecodeLinuxGL probably won't work, as I don't think the GL output path handles disjoint planes at all - I only updated the Vulkan path. I haven't actually tested the GL path though.

The Vulkan error is probably due to missing --enable-features=VaapiIgnoreDriverChecks. You need that to disable the driver check (Chromium disallows VAAPI for all non-Intel drivers by default). You also might need some other flags like forcing the Vulkan backend for ANGLE. These are the flags I use, but I'm not sure which are actually necessary since I haven't tested them individually in quite a while: --ozone-platform=x11 --enable-features=CanvasOopRasterization,VaapiVideoDecoder,UseChromeOSDirectVideoDecoder,VaapiIgnoreDriverChecks,PlatformHEVCDecoderSupport,Vulkan,DefaultANGLEVulkan,VulkanFromANGLE --use-cmd-decoder=passthrough --use-gl=angle --use-angle=vulkan --use-vulkan=native --ignore-gpu-blocklist --enable-zero-copy --enable-gpu-rasterization --enable-native-gpu-memory-buffers --enable-gpu-memory-buffer-video-frames

@thubble
Copy link
Author

thubble commented Oct 3, 2023

E.g. I'm using Gentoo and keep this patch in /etc/portage/patches/www-client/ungoogled-chromium.

Wow, looks like I'm late to the party. @thubble You could've pinged me or something to integrate your patch into ungoogled-chromium for Gentoo! :D

Is it Wayland specific or would it work under X11 as well?

Awesome, thanks! I use your ebuild but wasn't sure if this patch was too specific. I've just been posting it in the arch forums since that seems to be the most active discussion. If you could include it that'd be perfect. (Note, I'm not sure when 118 is being released or when I'll have time to rebase it. I'll be away from my main desktop for the next few weeks so probably won't be able to do it. The difficulty of the rebase obviously depends on how many changes there are, but there are an increasing number of Chromium changes going in for multiplanar support - though nothing that will officially fix AMD yet unfortunately).

This patch actually only works on X11 (or XWayland). VAAPI with Vulkan output doesn't work on native Wayland as far as I know (even on Intel), since Vulkan surfaces aren't implemented for the Wayland Ozone backend yet. If someone could update the VaapiVideoDecodeLinuxGL (GL output) code to handle disjoint planes the same way as this patch does for Vulkan output, that might work (I assume the patch that @darkbasic mentioned would also be necessary).

@darkbasic
Copy link

darkbasic commented Oct 4, 2023

Vulkan surfaces aren't implemented for the Wayland Ozone backend

Oh, that explains why that didn't work: I'm using ozone wayland.

If someone could update the VaapiVideoDecodeLinuxGL (GL output) code to handle disjoint planes the same way as this patch does for Vulkan output, that might work

Do you know if this (which will be available in 119) will also take care of VaapiVideoDecodeLinuxGL (GL output)?

@thubble
Copy link
Author

thubble commented Oct 4, 2023

If someone could update the VaapiVideoDecodeLinuxGL (GL output) code to handle disjoint planes the same way as this patch does for Vulkan output, that might work

Do you know if this (which will be available in 119) will also take care of VaapiVideoDecodeLinuxGL (GL output)?

Based on the latest replies I'd say probably not. There was some extensive refactoring of how multiplanar images are handled, but according to the developers' responses it seems to be just memory allocation changes, nothing to do with any output paths.

To support Vulkan output I had to change both the Vulkan image store implementation and the Skia library which outputs it. Fortunately there was already disjoint image support for Skia/Vulkan but only enabled for Android, so I just needed to comment out the Android-specific checks. For GL, I'm not even totally sure how which shared image implementation it uses, or what support exists for disjoint images in Skia.

@darkbasic
Copy link

I wonder if it would be easier to update the VaapiVideoDecodeLinuxGL (GL output) code to handle disjoint planes or implementing Vulkan surfaces for the Wayland Ozone backend.
Either way I don't see myself being able to use of any kind of acceleration for the foreseeable future :(

@PF4Public
Copy link

@thubble Would applying this patch unconditionally have any side-effects? Somehow affect normal vaapi users with Intel hardware?

@thubble
Copy link
Author

thubble commented Oct 7, 2023

@thubble Would applying this patch unconditionally have any side-effects? Somehow affect normal vaapi users with Intel hardware?

I don't believe so - it should properly detect whether an exported image from vaapi is disjoint and use the appropriate path - but I don't have an Intel GPU to test with. I'm not aware of anyone testing this with an Intel device, successfully or otherwise.

@hannesmann
Copy link

Thank you for this patch! Since the root cause of the problem is vaExportSurfaceHandle returning more than one descriptor, I was curious if it was possible to solve this in Mesa. Chromium takes a long time to compile and there are some proprietary applications (Electron, etc) that are impossible to patch. As it turns out, at least for AMD, it's fairly easy to modify vaExportSurfaceHandle to return only one descriptor.

The radeonsi and d3d12 drivers set a flag called PIPE_VIDEO_CAP_SUPPORTS_CONTIGUOUS_PLANES_MAP which indicates that all planes are laid out contiguously in memory. Mesa's vaExportSurfaceHandle implementation will always return a separate descriptor for every plane but if we know that all planes share the same buffer, it can be modified to only export it once and define each plane as an offset into the buffer.

The patch I've developed is here. If the patch is applied together with !23214 VA-API decoding works on Chromium 117 without any modifications. VaapiVideoDecodeLinuxGL still doesn't work for whatever reason so Vulkan is required.

I've tested it on an RX 6700 XT and Ryzen 5 5500U. It hasn't caused any regressions for me in other applications, so it could probably be upstreamed (at least behind an environment variable) once it has been tested more. It shouldn't cause any regressions for Intel or NVIDIA users, but it won't make VaapiVideoDecoder work because radeonsi is the only Gallium driver on Linux that sets PIPE_VIDEO_CAP_SUPPORTS_CONTIGUOUS_PLANES_MAP.

@darkbasic
Copy link

VaapiVideoDecodeLinuxGL still doesn't work for whatever reason so Vulkan is required.

Ouch, I was about to start recompiling mesa but apparently there is no hope for ozone wayland users.
Thanks for your work anyway, it's much appreciated.

@ZakariaBouzouf
Copy link

Thank you for this patch, unfortunately i couldn't build chromium, it failed everytime during the build, maybe i do it wrong because i'm not an expert. Can someone please exactly how and which version of chromium i have to use , and if someone have an already builded package, it will be nice if he share it .

@kerriganx
Copy link

kerriganx commented Oct 11, 2023

Thank you for this patch, unfortunately i couldn't build chromium, it failed everytime during the build, maybe i do it wrong because i'm not an expert. Can someone please exactly how and which version of chromium i have to use , and if someone have an already builded package, it will be nice if he share it .

I guess because this patch hasn't been updated to Chromium 118 yet.
I suggest patch Mesa instead, if you use Arch there is good instruction here: https://bbs.archlinux.org/viewtopic.php?pid=2125225#p2125225
Chromium takes a long time to compile and there are some proprietary applications (Electron, etc) that are impossible to patch.

@ZakariaBouzouf
Copy link

Thank you for this patch, unfortunately i couldn't build chromium, it failed everytime during the build, maybe i do it wrong because i'm not an expert. Can someone please exactly how and which version of chromium i have to use , and if someone have an already builded package, it will be nice if he share it .

I guess because this patch hasn't been updated to Chromium 118 yet. I suggest patch Mesa instead, if you use Arch there is good instruction here: https://bbs.archlinux.org/viewtopic.php?pid=2125225#p2125225 Chromium takes a long time to compile and there are some proprietary applications (Electron, etc) that are impossible to patch.

Oh thank you i managed to make HW acceleration works on brave and chromium , but it seems that it's only make it HWA show up, and now actually work because the cpu goes up ( maybe little bit lower than before) but i still can hear the fans, but when i use the mpv plugins even on a high quality the cpu is much lower and i don't hear the fans.
Btw i have a Lenovo Yoga 7 14 (14ARB7) with AMD Ryzen 6800U with a AMD Radeon 680M.

@kerriganx
Copy link

You can check that gpu is actually used on this page: chrome://media-internals/
It should show kVideoDecoderName: "VaapiVideoDecoder"

@ZakariaBouzouf
Copy link

You can check that gpu is actually used on this page: chrome://media-internals/ It should show kVideoDecoderName: "VaapiVideoDecoder"

Yes it's. Like i said it's working but not as good as with mpv.

@thubble
Copy link
Author

thubble commented Oct 20, 2023

Thank you for this patch! Since the root cause of the problem is vaExportSurfaceHandle returning more than one descriptor, I was curious if it was possible to solve this in Mesa. Chromium takes a long time to compile and there are some proprietary applications (Electron, etc) that are impossible to patch. As it turns out, at least for AMD, it's fairly easy to modify vaExportSurfaceHandle to return only one descriptor.

The radeonsi and d3d12 drivers set a flag called PIPE_VIDEO_CAP_SUPPORTS_CONTIGUOUS_PLANES_MAP which indicates that all planes are laid out contiguously in memory. Mesa's vaExportSurfaceHandle implementation will always return a separate descriptor for every plane but if we know that all planes share the same buffer, it can be modified to only export it once and define each plane as an offset into the buffer.

The patch I've developed is here. If the patch is applied together with !23214 VA-API decoding works on Chromium 117 without any modifications. VaapiVideoDecodeLinuxGL still doesn't work for whatever reason so Vulkan is required.

I've tested it on an RX 6700 XT and Ryzen 5 5500U. It hasn't caused any regressions for me in other applications, so it could probably be upstreamed (at least behind an environment variable) once it has been tested more. It shouldn't cause any regressions for Intel or NVIDIA users, but it won't make VaapiVideoDecoder work because radeonsi is the only Gallium driver on Linux that sets PIPE_VIDEO_CAP_SUPPORTS_CONTIGUOUS_PLANES_MAP.

@hannesmann Thanks, this works perfectly on Chromium 118! Way better solution than constantly updating my Chromium patch (I wasn't looking forward to 118). I had no idea that this was supported by the radeonsi driver but not implemented in vaapi - maybe because it's not supported by any of the other drivers?

I'm not sure why GL wouldn't be working, since the output format should be no different than Intel's vaapi drivers. I might test it at some point if I have time.

@darkbasic
Copy link

I'm not sure why GL wouldn't be working, since the output format should be no different than Intel's vaapi drivers

If there is any chance to get it working I will give it a try with Chromium 119. I have to recompile mesa anyway to test some patches for rusticl.

@darkbasic
Copy link

I confirm it works, but only without --ozone-platform-hint=auto.

Also I get tons of [15366:15366:1028/123710.850045:ERROR:vulkan_swap_chain.cc(409)] : Swapchain is suboptimal. in the logs.

A simple way to check if it's working is looking for VCN: Enabled in sudo watch cat /sys/kernel/debug/dri/0/amdgpu_pm_info.

Also you can press CTRL+Shift+I in Chromium, click on the three dots in the top right corner -> more tools -> media -> select the player and look for Decoder name: Vaapi Video Decoder and Hardware decoder: true.

but it seems that it's only make it HWA show up, and now actually work because the cpu goes up ( maybe little bit lower than before) but i still can hear the fans, but when i use the mpv plugins even on a high quality the cpu is much lower and i don't hear the fans.

It's working but it still uses a decent chunk of CPU and GPU power, making the fans spin. Not sure why mpv is way more efficient.

@ZakariaBouzouf
Copy link

ZakariaBouzouf commented Oct 28, 2023

I confirm it works, but only without --ozone-platform-hint=auto.

Also I get tons of [15366:15366:1028/123710.850045:ERROR:vulkan_swap_chain.cc(409)] : Swapchain is suboptimal. in the logs.

A simple way to check if it's working is looking for VCN: Enabled in sudo watch cat /sys/kernel/debug/dri/0/amdgpu_pm_info.

Also you can press CTRL+Shift+I in Chromium, click on the three dots in the top right corner -> more tools -> media -> select the player and look for Decoder name: Vaapi Video Decoder and Hardware decoder: true.

but it seems that it's only make it HWA show up, and now actually work because the cpu goes up ( maybe little bit lower than before) but i still can hear the fans, but when i use the mpv plugins even on a high quality the cpu is much lower and i don't hear the fans.

It's working but it still uses a decent chunk of CPU and GPU power, making the fans spin. Not sure why mpv is way more efficient.

I don't know also the reason why mpv is more efficient, but i can confirm that when using the avc1 codec instead of vp9 it's consume less power althoug the quality is less better .

@soupglasses
Copy link

I am curious if anyone has attempted to work on Encoding side for AMD with VaapiVideoEncoder? I have been very curious of the possibility to get hardware accelerated screensharing inside chromium.

@darkbasic
Copy link

I've enabled it and Chromium seems to detect the capability but I still have to test it. Hardware decoding is already buggy enough on Phoenix to not make it worth having it enabled.

@peromax1
Copy link

Hi @thubble and thank you for the patch! I am trying to build Chromium 117.0.5938.60 with your patch but it's always failing with this error:

FAILED: obj/media/gpu/vaapi/common/vaapi_wrapper.o
                                                 .
                                                 .
                                                 .
                                                 .
-fvisibility-inlines-hidden -c ../../media/gpu/vaapi/vaapi_wrapper.cc -o obj/media/gpu/vaapi/common/vaapi_wrapper.o
../../media/gpu/vaapi/vaapi_wrapper.cc:2670:9: error: ignoring return value of function declared with 'nodiscard' attribute [-Werror,-Wunused-result]
 2670 |         bo_fd.release();
      |         ^~~~~~~~~~~~~

I was following these instructions for building chromium: https://chromium.googlesource.com/chromium/src/+/main/docs/linux/build_instructions.md.

Building without the patch was successful.

Is the patch compatible only with specific 117 version different from the one I tried(117.0.5938.60) ?

@thubble
Copy link
Author

thubble commented Nov 30, 2023

@peromax1 This patch is obsolete, the preferred solution is to build Mesa from source using the patches mentioned here: https://gist.github.com/thubble/235806c4c64b159653de879173d24d9f?permalink_comment_id=4718214#gistcomment-4718214 (instructions for applying this on Arch are here: https://bbs.archlinux.org/viewtopic.php?pid=2125225#p2125225 https://bbs.archlinux.org/viewtopic.php?pid=2133444#p2133444)

Chromium 117 is out of date and considered insecure so I wouldn't recommend running it, and this patch likely won't be updated for later versions. The Mesa patch will allow this to work on all Chromium versions without having to modify your distro's Chromium package.

As for the error, it looks like -Werror,-Wunused-result or something similar was somehow added to the CFLAGS/CXXFLAGS. I'm not sure where this is coming from, the Chromium build system or your system flags, but it would need to be removed. I'm pretty sure most people are building Chromium using their distro's packaging system (Arch PKGBUILD, Gentoo ebuild) rather than directly using the build instructions. Again though, the preferred solution now is definitely to patch Mesa rather than Chromium. How you do that would depend on your distro.

EDIT: Corrected the link for the directions on how to build the Mesa package. They're for Arch; if you're on Gentoo then just put the necessary patches in /etc/portage/patches/media-libs/mesa/; unfortunately I'm not sure of the exact steps on other distros.

@darkbasic
Copy link

darkbasic commented Dec 6, 2023

If you are experiencing glitches (image freezes randomly) while decoding via vaapi the following bug report contains updated firmware for Rembrandt/Phoenix.

@darkbasic
Copy link

darkbasic commented Jan 30, 2024

I've tested this MR with Chrome 122 beta which ships ozone wayland compatible VaapiWrapper but unfortunately it's still a no go because the MR requires --use-angle=vulkan which is still not supported under ozone wayland.
vaapi on wayland has never been so close yet so far away :(

@thubble
Copy link
Author

thubble commented Feb 3, 2024

I've tested this MR with Chrome 122 beta which ships ozone wayland compatible VaapiWrapper but unfortunately it's still a no go because the MR requires --use-angle=vulkan which is still not supported under ozone wayland. vaapi on wayland has never been so close yet so far away :(

Yeah, I'm not sure why --use-angle=vulkan is still disabled on Wayland. ANGLE has supported the Vulkan backend on Wayland for quite a while.

Some comments in this recent MR (to enable it on ChromeOS's Lacros only) seem to imply that it just hasn't been tested, or maybe a particular optimization needs to be disabled: https://chromium-review.googlesource.com/c/chromium/src/+/5245195

Another potential issue is that the Ozone Wayland backend's Vulkan implementation doesn't support creating view surfaces: https://source.chromium.org/chromium/chromium/src/+/main:ui/ozone/platform/wayland/gpu/vulkan_implementation_wayland.cc;l=53. I remember when I investigated a while ago, this was required for VAAPI's Vulkan flow (since it's required in order to create a Skia Vulkan output surface). I'm not sure why this isn't implemented either, since vkCreateWaylandSurfaceKHR() has existed for quite a while. I assume there's probably a bunch of interactions between Wayland, Vulkan, Skia, ANGLE, and Ozone/Chromium that I'm not aware of (either that or Wayland just isn't a priority for the Chromium devs).

Personally, I use Chromium with XWayland anyway, mainly because I usually keep a lot of windows open, and I like having their positions restored when I restart (and Wayland purposely doesn't have a way for applications to get/set their window positions). Also it seems that Chromium just doesn't fully support Wayland yet, for example this latest issue.

@darkbasic
Copy link

Another user got this working with VaapiVideoDecodeLinuxGL which doesn't work for me. Does it work for you?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment