Gemini Lake Hardware-accelerated Transcoding

This is for a used Wyse 5070 I recently purchased, sporting a J5005 CPU and a huge 8 GB of RAM, all running AlmaLinux 9.1. Why AlmaLinux? Because I'm partial to RHEL-based systems and their focus on stability, that's all. The downside is that newer software often isn't readily available.

I want to unlock the ability to perform hardware-accelerated transcoding in ffmpeg so I can use it in Tdarr as a decent remote transcoding node.

Intel ARK - Pentium Silver J5005 shows that the CPU has Quick Sync Video support, and as such supports some form of hardware video acceleration.

The CPU is Gemini Lake and uses Intel UHD Graphics 605, which is Gen 9.5 according to the ffmpeg wiki. The machine should be able to handle H.264 and H.265 transcoding without issue.

The ffmpeg wiki also shows that I need a /dev/dri/render* device in order to use VAAPI, which is how ffmpeg would interface with the GPU. So I need to figure out how to get /dev/dri devices on the machine.
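
Checking is as simple as looking for the device nodes (spoiler from later in this writeup: nothing showed up here until the kernel-modules package was installed):

# VAAPI needs a DRM render node, i.e. /dev/dri/renderD*.
ls -l /dev/dri/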

On a side note, Tdarr 2.00.19.1 comes with the following VAAPI encoders by default, so hardware transcoding for H.264 and H.265 is definitely an option for me.

$ tdarr-ffmpeg -encoders 2> /dev/null | grep vaapi
 V....D h264_vaapi           H.264/AVC (VAAPI) (codec h264)
 V....D hevc_vaapi           H.265/HEVC (VAAPI) (codec hevc)
 V....D mjpeg_vaapi          MJPEG (VAAPI) (codec mjpeg)
 V....D mpeg2_vaapi          MPEG-2 (VAAPI) (codec mpeg2video)
 V....D vp8_vaapi            VP8 (VAAPI) (codec vp8)
 V....D vp9_vaapi            VP9 (VAAPI) (codec vp9)

It appears that I can install and use the Intel Media Driver, according to the ffmpeg wiki. There are prebuilt versions of the Intel Media Driver out there, but the latest I've found is 21.1.3 from RPMFusion, which doesn't support Gemini Lake. On top of that, the latest versions of the required libraries in the AlmaLinux repos are libva 2.11.0 from appstream and intel-gmmlib 21.1.2 from epel, which also don't support Gemini Lake.
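
For reference, checking what the stock repos offer is a quick dnf query (run from a throwaway container so nothing touches the host; epel-release just adds the EPEL repo):

docker container run -it --rm almalinux:9.1 bash
dnf -y install epel-release
dnf info libva intel-gmmlib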

Since I need newer versions of this software anyway, I might as well aim for the latest version (22.6.6). That version requires libva 2.17.0 and intel-gmmlib 22.3.3, which, as I mentioned, aren't provided by default in AlmaLinux.

Unfortunately, it seems like I need to build all of this myself. One goal I'm aiming for is to build everything as self-contained RPMs using Docker containers. This is mainly so I don't pollute the machine with build tooling just to install a few libraries, and partly because I love containerization in general. I want the installation itself to be as simple as running dnf install, just like any other software package.

Building libva

Since I'm trying to build an RPM, I'm going to grab the specfile from the existing libva package that's in the AlmaLinux appstream repo.

I'm going to start an AlmaLinux container to avoid polluting the host.

docker container run -it --rm -v `pwd`:/rpms almalinux:9.1 bash

Then grab the specfile!

# I need dnf-command(download) so I can download SRPMs from the appstream-source repo.
# I need rpmdevtools so I can extract the contents of the SRPM, which is where the specfile lives.
dnf -y install 'dnf-command(download)' rpmdevtools

# Download libva-2.11.0-5.el9.src.rpm!
dnf --enablerepo=appstream-source download --source libva

# Extract the SRPM.
rpmdev-extract libva-2.11.0-5.el9.src.rpm

# Copy libva.spec out to the host.
cp libva-2.11.0-5.el9.src/libva.spec /rpms/

Pop the specfile into a GitHub gist so I can use it in later build commands, and so I have revision history for any changes I make to it.

I ended up ripping out the Wayland and X11 stuff because I'm only interested in DRM, which allows VAAPI to interface with the GPU without any GUI being involved. After all, this is a headless server. I probably didn't need to remove that stuff, but what's done is done and I'm too lazy to rectify the mistake.
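
For the curious, the changes amounted to roughly the following (an illustrative sketch, not the literal diff from the gist; the release tarball builds with autotools, so a DRM-only build comes down to dropping the X11/Wayland build dependencies and passing the matching configure switches):

-BuildRequires:  libX11-devel libXext-devel libXfixes-devel wayland-devel
-%configure
+%configure --disable-x11 --disable-glx --disable-wayland

Plus dropping the corresponding X11/Wayland entries from %files.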

I'm going to start another AlmaLinux container so I can build the RPMs.

docker container run -it --rm -v `pwd`:/rpms almalinux:9.1 bash
# Install some generic build tools.
dnf -y install gcc make automake autoconf pkgconf pkgconf-pkg-config rpm-build rpmdevtools

# Install libva-specific development libraries.
dnf -y install libtool libudev-devel libdrm-devel libpciaccess-devel

# Add a builduser, then switch to it.
# Nobody should build software when running as root.
adduser --system --create-home builduser
su - builduser

# Set the libva version I'm going to be building.
# I'm just using it to expand download URLs later.
LIBVA_VERSION="2.17.0"

# Set up the ~/rpmbuild directory so rpmbuild can, well, build.
rpmdev-setuptree

# Download the specfile from the gist I made earlier.
curl -L -O \
    https://gist.githubusercontent.com/estelsmith/ac3f14c352970c2b2a1b2441f8f086c6/raw/libva.spec

# Download the libva source code.
curl -L \
    https://github.com/intel/libva/releases/download/${LIBVA_VERSION}/libva-${LIBVA_VERSION}.tar.bz2 \
    -o ~/rpmbuild/SOURCES/libva-${LIBVA_VERSION}.tar.bz2

# Build the RPMs (and SRPMs) for libva.
rpmbuild -ba libva.spec

# Copy built RPMs out to the host.
cp rpmbuild/RPMS/x86_64/* /rpms/

Hooray! There's libva done. Two more to go! At some point I probably want to also build libva-utils, but for now this is good enough.
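
A quick sanity check back on the host confirms what got built (a sketch; exact filenames depend on the spec's Release tag):

# List the freshly built packages and inspect one of them.
ls /rpms/
rpm -qip /rpms/libva-2.17.0-*.x86_64.rpm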

Building intel-gmmlib

As with libva, I have a good starting point for building an RPM for this library: EPEL carries intel-gmmlib for AlmaLinux, even if it's an older version. I'm going to take the specfile from there and update it to build the newest version.

Again, I'm going to do this in a container.

docker container run -it --rm -v `pwd`:/rpms almalinux:9.1 bash

Then, grab the specfile.

# Install dnf-command(download) so I can download SRPMs from the epel-source repo.
# Install rpmdevtools so I can extract the contents of the SRPM, which is where the specfile lives.
# epel-release adds the EPEL YUM repository to the system.
dnf -y install 'dnf-command(download)' rpmdevtools epel-release

# Download intel-gmmlib-21.1.2-1.el9.src.rpm!
dnf --enablerepo=epel-source download --source intel-gmmlib

# Extract the SRPM.
rpmdev-extract intel-gmmlib-21.1.2-1.el9.src.rpm

# Copy the specfile out to the host.
cp intel-gmmlib-21.1.2-1.el9.src/intel-gmmlib.spec /rpms/

Pop that bad boy into a GitHub gist and make some minor edits. The only things I really needed to modify here were the version numbers.
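
For the record, the edit really is just a version bump (a sketch; the real spec also gets its Release reset and a %changelog entry):

-Version:        21.1.2
+Version:        22.3.3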

I'm going to start yet another AlmaLinux container so I can build the RPMs.

docker container run -it --rm -v `pwd`:/rpms almalinux:9.1 bash
# Install some generic build tools.
dnf -y install cmake gcc gcc-c++ rpm-build rpmdevtools

# Add a builduser, then switch to it.
# Nobody should build software when running as root.
adduser --system --create-home builduser
su - builduser

# Set the intel-gmmlib version I'm going to be building.
# I'm just using it to expand download URLs later.
GMMLIB_VERSION="22.3.3"

# Set up the ~/rpmbuild directory so rpmbuild can, well, build.
rpmdev-setuptree

# Download the specfile from the gist I made earlier.
curl -L -O \
    https://gist.githubusercontent.com/estelsmith/782877dcf5c0c8cd5dc471072025966d/raw/intel-gmmlib.spec

# Download the intel-gmmlib source code.
curl -L \
    https://github.com/intel/gmmlib/archive/refs/tags/intel-gmmlib-${GMMLIB_VERSION}.tar.gz \
    -o ~/rpmbuild/SOURCES/intel-gmmlib-${GMMLIB_VERSION}.tar.gz

# Build the RPMs (and SRPMs) for intel-gmmlib.
rpmbuild -ba intel-gmmlib.spec

# Copy built RPMs out to the host.
cp rpmbuild/RPMS/x86_64/* /rpms/

Building intel-media-driver

Now that I have the dependencies built, I can focus on what I really came for. Luckily, RPMFusion has a specfile for exactly version 22.6.6, even though the built package isn't available for AlmaLinux 9.1 yet. I shouldn't need to modify it much, if at all. Naturally, I took the files and popped them into a GitHub gist just in case.

As it turns out, I didn't need to update anything but the specfile's version. Thanks, RPMFusion team!

As usual, I'm going to start another AlmaLinux container for the build.

docker container run -it --rm -v `pwd`:/rpms almalinux:9.1 bash
# Generic build tools, go!
dnf -y install cmake gcc gcc-c++ rpm-build rpmdevtools epel-release

# More specific devel libs that intel-media-driver needs.
dnf -y install libpciaccess-devel libX11-devel libappstream-glib cmrt-devel

# Hey look, it's those RPMs I built earlier!
# They're required for intel-media-driver, so install them.
cd /rpms
dnf -y install libva-devel-2.17.0-1.el9.x86_64.rpm intel-gmmlib-devel-22.3.3-1.el9.x86_64.rpm

# Repeat after me. I will not build software as root.
adduser --system --create-home builduser
su - builduser

# Following tradition, I'm setting the version in an easily-updatable place.
DRIVER_VERSION="22.6.6"

# ~/rpmbuild dir!
rpmdev-setuptree

# Download all my files from the gist created earlier.
curl -L -O \
    https://gist.githubusercontent.com/estelsmith/0d93598631f3fc6f4e3a7a5e7ef4d3cb/raw/intel-media-driver.spec
curl -L \
    https://gist.githubusercontent.com/estelsmith/0d93598631f3fc6f4e3a7a5e7ef4d3cb/raw/intel-media-driver-gen8-9-10-perf.patch \
    -o ~/rpmbuild/SOURCES/intel-media-driver-gen8-9-10-perf.patch
curl -L \
    https://gist.githubusercontent.com/estelsmith/0d93598631f3fc6f4e3a7a5e7ef4d3cb/raw/intel-media-driver.metainfo.xml \
    -o ~/rpmbuild/SOURCES/intel-media-driver.metainfo.xml

# Download the intel-media-driver sourcecode.
curl -L \
    https://github.com/intel/media-driver/archive/intel-media-${DRIVER_VERSION}.tar.gz \
    -o ~/rpmbuild/SOURCES/intel-media-${DRIVER_VERSION}.tar.gz

# Build the RPM (and SRPM) for intel-media-driver.
rpmbuild -ba intel-media-driver.spec

# Copy the built RPM out to the host.
cp rpmbuild/RPMS/x86_64/* /rpms/

Does it work, though?

The /dev/dri devices didn't show up after installing the RPMs. I ultimately needed to install the kernel-modules package, which provides the basic GPU drivers, and then reboot. After that, /dev/dri showed up!
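
For reference, the install on the host boiled down to roughly this (a sketch; the exact RPM filenames depend on the Release tags from the builds above):

# kernel-modules provides the basic GPU drivers, then install the RPMs built earlier.
dnf -y install kernel-modules
dnf -y install ./libva-2.17.0-*.x86_64.rpm \
    ./intel-gmmlib-22.3.3-*.x86_64.rpm \
    ./intel-media-driver-22.6.6-*.x86_64.rpm
reboot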

# find /dev/dri
/dev/dri
/dev/dri/by-path
/dev/dri/by-path/pci-0000:00:02.0-card
/dev/dri/by-path/pci-0000:00:02.0-render
/dev/dri/card0
/dev/dri/renderD128

But still the question remains, can I do hardware-accelerated encoding now? I guess there's only one way to find out.

That's right, it's time to start another container!! Since I want to use hardware transcoding in Tdarr, I may as well test the transcoding from within a Tdarr container.

docker container run \
    -it --rm \
    --device /dev/dri \
    --entrypoint bash \
    ghcr.io/haveagitgat/tdarr:2.00.19.1

Then, run a couple of simple tests. I'm going to attempt to transcode a 2160p H.265 video down to 1080p H.264.

# Download a sample 2160p H.265 video.
curl -L -O https://samples.tdarr.io/api/v1/samples/sample__2160__libx265__aac__30s__video.mkv

# Run a test with no hardware acceleration first, to establish a baseline.
# It starts out strong-ish at 30fps but quickly drops to 18fps, a mere 0.29x speed. This simply won't do.
ffmpeg \
    -i sample__2160__libx265__aac__30s__video.mkv \
    -map v:0 \
    -c:v libx264 \
    -s hd1080 \
    -b:v 4M \
    downscaled.mkv

# Run a test using VAAPI.
# Woosh! It maintains 95-100fps the entire transcode, a nice 1.6x speed!
ffmpeg \
    -hwaccel vaapi \
    -hwaccel_output_format vaapi \
    -i sample__2160__libx265__aac__30s__video.mkv \
    -map v:0 \
    -c:v h264_vaapi \
    -vf 'scale_vaapi=w=1920:h=1080' \
    -b:v 4M \
    downscaled.mkv

Easily a 5x speed improvement! Not bad for an anemic thin client turned server.

With hardware transcoding now available, I hurriedly plugged the machine into Tdarr as a node so it could help transcode my library. It ended up handling 3 transcodes at a time, averaging 60 fps each, a whopping 180 fps in total while hovering at 75% CPU!

Where to next?

Honestly, I'm probably going to buy a couple more Wyse 5070 machines to plug into Tdarr and use for other things as well. I don't want to go through this procedure on each machine, though. Yes, I could copy around the RPMs I already have, but that feels sloppy.

Since I have a CI system set up in my homelab, I will probably automate building each of these libraries as a CI pipeline. Then, I'll build an RPM repository and have the pipelines push the resulting packages into it.

From there my machines would be a simple dnf install away from having hardware transcoding abilities.
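
Something along these lines is what I'm picturing for each machine (a sketch; the repo URL is hypothetical and would point at my internal repository):

# Add the internal RPM repo (hypothetical URL), then install the driver stack.
cat > /etc/yum.repos.d/homelab.repo <<'EOF'
[homelab]
name=Homelab RPMs
baseurl=https://rpms.homelab.example/el9/x86_64/
enabled=1
gpgcheck=0
EOF

dnf -y install kernel-modules intel-media-driver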

But that's going to have to be for another day. At the moment I'm just happy to get better utilization out of my little homelab addition.
