This gist will generate an Intel QSV-enabled FFmpeg build using the open source Intel Media SDK. Testbed used: Ubuntu 18.04 LTS. A fallback is also provided for the Intel VAAPI driver where needed.

Build FFmpeg with Intel's QSV enablement on an Intel-based validation test-bed:

Build platform: Ubuntu 18.04 LTS

Ensure the platform is up to date:

sudo apt update && sudo apt -y upgrade && sudo apt -y dist-upgrade

Install the baseline dependencies first (including the OpenCL headers):

sudo apt-get -y install autoconf automake build-essential libass-dev libtool pkg-config texinfo zlib1g-dev libva-dev cmake mercurial libdrm-dev libvorbis-dev libogg-dev git libx11-dev libperl-dev libpciaccess-dev libpciaccess0 xorg-dev intel-gpu-tools opencl-headers libwayland-dev xutils-dev ocl-icd-* libssl-dev

Then add the Oibaf PPA, needed to install the latest development headers for libva:

sudo add-apt-repository ppa:oibaf/graphics-drivers
sudo apt-get update && sudo apt-get -y upgrade && sudo apt-get -y dist-upgrade

Configure git:

This will be needed by some projects below, such as when building opencl-clang. Use your credentials, as you see fit:

git config --global user.name "FirstName LastName"
git config --global user.email "your@email.com"

Then proceed.

To address linker problems down the line with Ubuntu 18.04LTS:

Referring to this: https://forum.openframeworks.cc/t/ubuntu-unable-to-compile-missing-glx-mesa/29367/2

Create the following symlink as shown:

sudo ln -s /usr/lib/x86_64-linux-gnu/libGLX_mesa.so.0 /usr/lib/x86_64-linux-gnu/libGLX_mesa.so

Build the latest libva and all drivers from source:

Setup build environment:

Work space init:

mkdir -p ~/vaapi
mkdir -p ~/ffmpeg_build
mkdir -p ~/ffmpeg_sources
mkdir -p ~/bin

Build the dependency chain as shown, starting with the latest build of libdrm. This is needed to enable the cl_intel_va_api_media_sharing extension, which is required for deriving OpenCL device interop from VAAPI in FFmpeg, as illustrated later in this document:

cd ~/vaapi
git clone https://anongit.freedesktop.org/git/mesa/drm.git libdrm
cd libdrm
./autogen.sh --prefix=/usr --enable-udev
time make -j$(nproc) VERBOSE=1
sudo make -j$(nproc) install
sudo ldconfig -vvvv
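
You can confirm that the new libdrm build is picked up, for example:

pkg-config --modversion libdrm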

Then proceed with libva:

1. Libva:

Libva is an implementation of VA-API (Video Acceleration API).

VA-API is an open-source library and API specification, which provides access to graphics hardware acceleration capabilities for video processing. It consists of a main library and driver-specific acceleration backends for each supported hardware vendor. It is a prerequisite for building the VAAPI driver components below.

cd ~/vaapi
git clone https://github.com/01org/libva
cd libva
./autogen.sh --prefix=/usr --libdir=/usr/lib/x86_64-linux-gnu
time make -j$(nproc) VERBOSE=1
sudo make -j$(nproc) install
sudo ldconfig -vvvv

2. Gmmlib:

The Intel(R) Graphics Memory Management Library provides device-specific buffer management for the Intel(R) Graphics Compute Runtime for OpenCL(TM) and the Intel(R) Media Driver for VAAPI.

The component is a prerequisite to the Intel Media driver build step below.

To build this, create a workspace directory within the vaapi sub directory and run the build:

mkdir -p ~/vaapi/workspace
cd ~/vaapi/workspace
git clone https://github.com/intel/gmmlib
mkdir -p build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ../gmmlib
make -j$(nproc)

Then install the package:

sudo make -j$(nproc) install 

And proceed.

3. Intel Media driver:

The Intel(R) Media Driver for VAAPI is a new VA-API (Video Acceleration API) user mode driver supporting hardware accelerated decoding, encoding, and video post processing for GEN based graphics hardware, released under the MIT license.

cd ~/vaapi/workspace
git clone https://github.com/intel/media-driver
cd media-driver
git submodule init
git pull
mkdir -p ~/vaapi/workspace/build_media
cd ~/vaapi/workspace/build_media

Configure the project with cmake:

cmake ../media-driver \
-DMEDIA_VERSION="2.0.0" \
-DBS_DIR_GMMLIB=$PWD/../gmmlib/Source/GmmLib/ \
-DBS_DIR_COMMON=$PWD/../gmmlib/Source/Common/ \
-DBS_DIR_INC=$PWD/../gmmlib/Source/inc/ \
-DBS_DIR_MEDIA=$PWD/../media-driver \
-DCMAKE_INSTALL_PREFIX=/usr \
-DCMAKE_INSTALL_LIBDIR=/usr/lib/x86_64-linux-gnu \
-DINSTALL_DRIVER_SYSCONF=OFF \
-DLIBVA_DRIVERS_PATH=/usr/lib/x86_64-linux-gnu/dri

Then build the media driver:

time make -j$(nproc) VERBOSE=1

Then install the project:

sudo make -j$(nproc) install VERBOSE=1

Add yourself to the video group:

sudo usermod -a -G video $USER
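
You can verify the membership afterwards (it takes effect on your next login), for example:

groups $USER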

Now, export environment variables as shown below:

LIBVA_DRIVERS_PATH=/usr/lib/x86_64-linux-gnu/dri
LIBVA_DRIVER_NAME=iHD

Put that in /etc/environment.
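
For example, the two variables can be appended from a shell as shown below (assuming they are not already present in /etc/environment):

echo 'LIBVA_DRIVERS_PATH=/usr/lib/x86_64-linux-gnu/dri' | sudo tee -a /etc/environment
echo 'LIBVA_DRIVER_NAME=iHD' | sudo tee -a /etc/environment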

And for the opensource driver (fallback, see notice below):

Export environment variables as shown below:

LIBVA_DRIVERS_PATH=/usr/lib/x86_64-linux-gnu/dri
LIBVA_DRIVER_NAME=i965

Put that in /etc/environment.

Notice: You should use the i965 driver for testing and validation only. For QSV-based deployments in production, ensure that iHD is the value set for the LIBVA_DRIVER_NAME variable, otherwise FFmpeg's QSV-based encoders will fail to initialize. Note that VAAPI is also supported by the iHD driver, albeit with a limited feature set, as explained in the last section.

Fallback for the Intel Opensource VAAPI driver:

  1. cmrt:

This is the C for Media Runtime GPU Kernel Manager for the Intel G45 & HD Graphics family. It's a prerequisite for building the intel-hybrid-driver package on supported platforms.

cd ~/vaapi
git clone https://github.com/01org/cmrt
cd cmrt
./autogen.sh --prefix=/usr --libdir=/usr/lib/x86_64-linux-gnu
time make -j$(nproc) VERBOSE=1
sudo make -j$(nproc) install
  2. intel-hybrid-driver:

This package provides support for WebM project VPx codecs. GPU acceleration is provided via media kernels executed on Intel GEN GPUs. The hybrid driver provides the CPU bound entropy (e.g., CPBAC) decoding and manages the GEN GPU media kernel parameters and buffers.

This package grants access to the VPx-series hybrid decode capabilities on supported hardware configurations, namely Haswell and Skylake. Do not build this target on unsupported platforms.

Related, see this commit regarding the hybrid driver initialization failure on platforms where it's not relevant.

cd ~/vaapi
git clone https://github.com/01org/intel-hybrid-driver
cd intel-hybrid-driver
./autogen.sh --prefix=/usr --libdir=/usr/lib/x86_64-linux-gnu
time make -j$(nproc) VERBOSE=1
sudo make -j$(nproc) install
  3. intel-vaapi-driver:

This package provides the VA-API (Video Acceleration API) user mode driver for Intel GEN Graphics family SKUs. The current video driver back-end provides a bridge to the GEN GPUs through the packaging of buffers and commands to be sent to the i915 driver for exercising both hardware and shader functionality for video decode, encode, and processing.

It also provides a wrapper to the intel-hybrid-driver when called upon to handle VP8/9 hybrid decode tasks on supported hardware (when configured with the --enable-hybrid-codec option, as shown below).

cd ~/vaapi
git clone https://github.com/01org/intel-vaapi-driver
cd intel-vaapi-driver
./autogen.sh --prefix=/usr --libdir=/usr/lib/x86_64-linux-gnu --enable-hybrid-codec
time make -j$(nproc) VERBOSE=1
sudo make -j$(nproc) install

However, on Kabylake and newer, omit this option as shown, since it's not needed:

cd ~/vaapi
git clone https://github.com/intel/intel-vaapi-driver
cd intel-vaapi-driver
./autogen.sh --prefix=/usr --libdir=/usr/lib/x86_64-linux-gnu 
time make -j$(nproc) VERBOSE=1
sudo make -j$(nproc) install

Proceed:

4. libva-utils:

This package provides a collection of tests for VA-API, such as vainfo, needed to validate a platform's supported features (encode, decode & post-processing attributes on a per-codec basis, via VAAPI entry point information).

cd ~/vaapi
git clone https://github.com/intel/libva-utils
cd libva-utils
./autogen.sh --prefix=/usr --libdir=/usr/lib/x86_64-linux-gnu
time make -j$(nproc) VERBOSE=1
sudo make -j$(nproc) install
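
Once installed, you can confirm that the intended driver loads (after the reboot below, with the environment variables set earlier in place), for example:

LIBVA_DRIVER_NAME=iHD vainfo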

At this point, issue a reboot:

sudo systemctl reboot

Then on resume, proceed with the steps below, installing the Intel OpenCL platform (Neo):

Before you proceed with the iMSDK:

It is recommended that you build the Intel Neo OpenCL runtime:

Justification: This will allow for Intel's MediaSDK OpenCL inter-op back-end to be built.

Note: There's also a Personal Package Archive (PPA) for this that you can add, allowing you to skip the manual build step, as shown:

sudo add-apt-repository ppa:intel-opencl/intel-opencl
sudo apt-get update

Then install the packages:

sudo apt install intel-*

Note that the PPA builds are a bit behind the upstream stack, and as such, those needing the latest version should use the build steps below.

Install the dependencies for the OpenCL back-end:

Build dependencies:

sudo apt-get install ccache flex bison cmake g++ git patch zlib1g-dev autoconf xutils-dev libtool pkg-config libpciaccess-dev libz-dev

Create the project structure:

mkdir -p ~/intel-compute-runtime/workspace

Within this workspace directory, fetch the sources for the required dependencies:

cd ~/intel-compute-runtime/workspace
git clone -b release_70 https://github.com/llvm-mirror/llvm llvm_source
git clone -b release_70 https://github.com/llvm-mirror/clang llvm_source/tools/clang
git clone -b ocl-open-70 https://github.com/intel/opencl-clang llvm_source/projects/opencl-clang
git clone -b llvm_release_70 https://github.com/KhronosGroup/SPIRV-LLVM-Translator llvm_source/projects/llvm-spirv
git clone https://github.com/intel/llvm-patches llvm_patches
git clone https://github.com/intel/intel-graphics-compiler igc
git clone https://github.com/intel/compute-runtime neo

Create a build directory for the Intel Graphics Compiler under the workspace:

mkdir -p ~/intel-compute-runtime/workspace/build_igc

Then build:

cd ~/intel-compute-runtime/workspace/build_igc
cmake ../igc/IGC
time make -j$(nproc) VERBOSE=1

Recommended: Generate Debian archives for installation:

time make -j$(nproc) package VERBOSE=1

Install:

sudo dpkg -i *.deb

(run in the build directory) will suffice.

To install directly without relying on the package manager:

On whatever Linux distribution you're on, you can also run:

sudo make -j$(nproc) install

Use this if you prefer to skip the binary artifacts generated by cpack. It may also solve package installation and dependency issues that some of you have encountered.

Then proceed.

Next, build and install the compute runtime project. Start by creating a separate build directory for it:

mkdir -p ~/intel-compute-runtime/workspace/build_icr
cd ~/intel-compute-runtime/workspace/build_icr
cmake -DBUILD_TYPE=Release -DCMAKE_BUILD_TYPE=Release -DSKIP_UNIT_TESTS=1 ../neo
time make -j$(nproc) package VERBOSE=1

Then install the deb archives:

sudo dpkg -i *.deb 

From the build directory.

To install directly without relying on the package manager:

On whatever Linux distribution you're on, you can also run:

sudo make -j$(nproc) install

Use this if you prefer to skip the binary artifacts generated by cpack. It may also solve package installation and dependency issues that some of you have encountered.

Testing:

Use clinfo and confirm that the ICD is detected.
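For example (the cl_intel_va_api_media_sharing check is only meaningful if libdrm and libva were built from source as described earlier):

clinfo | grep -i "device name"
clinfo | grep -i cl_intel_va_api_media_sharing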

Optionally, run Luxmark and confirm that Intel Neo's OpenCL platform is detected and usable.

Be aware that Luxmark, among others, requires freeglut, which you can install by running:

sudo apt install freeglut3*

5. Build Intel's MSDK:

This package provides an API to access hardware-accelerated video decode, encode and filtering on Intel® platforms with integrated graphics. It is supported on platforms that the intel-media-driver is targeted for.

For supported features per generation, see this.

Build steps:

(a). Fetch the sources into the working directory ~/vaapi:

cd ~/vaapi
git clone https://github.com/Intel-Media-SDK/MediaSDK msdk
cd msdk
git submodule init
git pull

(b). Configure the build:

mkdir -p ~/vaapi/build_msdk
cd ~/vaapi/build_msdk
cmake -DCMAKE_BUILD_TYPE=Release -DENABLE_WAYLAND=ON -DENABLE_X11_DRI3=ON -DENABLE_OPENCL=ON  ../msdk
time make -j$(nproc) VERBOSE=1
sudo make install -j$(nproc) VERBOSE=1

CMake will automatically detect the platform you're on and enable the platform-specific hooks needed for a working build.

Create a library config file for the iMSDK:

sudo nano /etc/ld.so.conf.d/imsdk.conf

Content:

/opt/intel/mediasdk/lib
/opt/intel/mediasdk/plugins

Then run:

sudo ldconfig -vvvv

To proceed.
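
If you prefer a non-interactive route, the same configuration file can be written in one step; a minimal sketch, assuming the default iMSDK install prefix used above:

printf '/opt/intel/mediasdk/lib\n/opt/intel/mediasdk/plugins\n' | sudo tee /etc/ld.so.conf.d/imsdk.conf
sudo ldconfig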

When done, issue a reboot:

 sudo systemctl reboot

Build a usable FFmpeg binary with the iMSDK:

Include extra components as needed:

(a). Build and deploy nasm: Nasm is an assembler for x86 optimizations used by x264 and FFmpeg. Highly recommended or your resulting build may be very slow.

Note that we've now switched away from Yasm to nasm, as this is the current assembler that x265, x264, among others, are adopting.

cd ~/ffmpeg_sources
wget http://www.nasm.us/pub/nasm/releasebuilds/2.14rc0/nasm-2.14rc0.tar.gz
tar xzvf nasm-2.14rc0.tar.gz
cd nasm-2.14rc0
./configure --prefix="$HOME/ffmpeg_build" --bindir="$HOME/bin" 
make -j$(nproc) VERBOSE=1
make -j$(nproc) install
make -j$(nproc) distclean

(b). Build and deploy libx264 statically: This library provides an H.264 video encoder. See the H.264 Encoding Guide for more information and usage examples. This requires ffmpeg to be configured with --enable-gpl --enable-libx264.

cd ~/ffmpeg_sources
git clone http://git.videolan.org/git/x264.git -b stable
cd x264/
PATH="$HOME/bin:$PATH" ./configure --prefix="$HOME/ffmpeg_build" --enable-static --enable-shared
PATH="$HOME/bin:$PATH" make -j$(nproc) VERBOSE=1
make -j$(nproc) install VERBOSE=1
make -j$(nproc) distclean

(c). Build and configure libx265: This library provides an H.265/HEVC video encoder. See the H.265 Encoding Guide for more information and usage examples.

cd ~/ffmpeg_sources
hg clone https://bitbucket.org/multicoreware/x265
cd ~/ffmpeg_sources/x265/build/linux
PATH="$HOME/bin:$PATH" cmake -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX="$HOME/ffmpeg_build" -DENABLE_SHARED:bool=off ../../source
make -j$(nproc) VERBOSE=1
make -j$(nproc) install VERBOSE=1
make -j$(nproc) clean VERBOSE=1

(d). Build and deploy the libfdk-aac library: This provides an AAC audio encoder. See the AAC Audio Encoding Guide for more information and usage examples. This requires ffmpeg to be configured with --enable-libfdk-aac (and --enable-nonfree if you also included --enable-gpl).

cd ~/ffmpeg_sources
wget -O fdk-aac.tar.gz https://github.com/mstorsjo/fdk-aac/tarball/master
tar xzvf fdk-aac.tar.gz
cd mstorsjo-fdk-aac*
autoreconf -fiv
./configure --prefix="$HOME/ffmpeg_build" --disable-shared
make -j$(nproc)
make -j$(nproc) install
make -j$(nproc) distclean

(e). Build and configure libvpx:

cd ~/ffmpeg_sources
git clone https://github.com/webmproject/libvpx
cd libvpx
./configure --prefix="$HOME/ffmpeg_build" --enable-runtime-cpu-detect --enable-vp9 --enable-vp8 \
--enable-postproc --enable-vp9-postproc --enable-multi-res-encoding --enable-webm-io --enable-better-hw-compatibility --enable-vp9-highbitdepth --enable-onthefly-bitpacking --enable-realtime-only --cpu=native --as=nasm 
time make -j$(nproc)
time make -j$(nproc) install
time make clean -j$(nproc)
time make distclean

(f). Build LibVorbis:

cd ~/ffmpeg_sources
wget -c -v http://downloads.xiph.org/releases/vorbis/libvorbis-1.3.6.tar.xz
tar -xvf libvorbis-1.3.6.tar.xz
cd libvorbis-1.3.6
./configure --enable-static --prefix="$HOME/ffmpeg_build"
time make -j$(nproc)
time make -j$(nproc) install
time make clean -j$(nproc)
time make distclean

(g). Build FFmpeg (with OpenCL enabled):

Notes on API support:

The hardware can be accessed through a number of different APIs:

i.libmfx on Linux:

This is a library from Intel which can be installed as part of the Intel Media SDK, and supports a subset of encode and decode cases.

ii.vaapi on Linux:

A fully opensource stack, dependent on libva and an appropriate VAAPI driver, that can be configured at runtime via the LIBVA-related environment variables (as shown above).
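
For instance, a driver can be selected for a single invocation by setting the variables inline; a quick check with vainfo, assuming the i965 driver is installed:

LIBVA_DRIVERS_PATH=/usr/lib/x86_64-linux-gnu/dri LIBVA_DRIVER_NAME=i965 vainfo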

In our use case, that is the backend that we will be using throughout this validation with this FFmpeg build:

cd ~/ffmpeg_sources
git clone https://github.com/FFmpeg/FFmpeg -b master
cd FFmpeg
PATH="$HOME/bin:$PATH" PKG_CONFIG_PATH="$HOME/ffmpeg_build/lib/pkgconfig:/opt/intel/mediasdk/lib/pkgconfig" ./configure \
  --pkg-config-flags="--static" \
  --prefix="$HOME/bin" \
  --bindir="$HOME/bin" \
  --extra-cflags="-I$HOME/ffmpeg_build/include" \
  --extra-ldflags="-L$HOME/ffmpeg_build/lib" \
  --extra-cflags="-I/opt/intel/mediasdk/include" \
  --extra-ldflags="-L/opt/intel/mediasdk/lib" \
  --extra-ldflags="-L/opt/intel/mediasdk/plugins" \
  --enable-libmfx \
  --enable-vaapi \
  --enable-opencl \
  --disable-debug \
  --enable-libvorbis \
  --enable-libvpx \
  --enable-libdrm \
  --enable-gpl \
  --cpu=native \
  --enable-libfdk-aac \
  --enable-libx264 \
  --enable-libx265 \
  --enable-openssl \
  --extra-libs=-lpthread \
  --enable-nonfree 
PATH="$HOME/bin:$PATH" make -j$(nproc) 
make -j$(nproc) install 
make -j$(nproc) distclean 
hash -r

Note: To get debug builds, add the --enable-debug=3 configuration flag and omit the distclean step; you'll then find the ffmpeg_g binary under the sources subdirectory.

We only want the debug builds when an issue crops up and a gdb trace may be required for debugging purposes. Otherwise, leave this omitted for production environments.
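
As a minimal sketch (the input file name is hypothetical), a debug-enabled build can then be run under gdb straight from the source tree:

cd ~/ffmpeg_sources/FFmpeg
gdb --args ./ffmpeg_g -i input.mp4 -c:v h264_qsv output.mp4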

Sample snippets to test the new encoders:

Confirm that the VAAPI & QSV-based encoders have been built successfully:

(a). VAAPI:

ffmpeg  -hide_banner -encoders | grep vaapi 

 V..... h264_vaapi           H.264/AVC (VAAPI) (codec h264)
 V..... hevc_vaapi           H.265/HEVC (VAAPI) (codec hevc)
 V..... mjpeg_vaapi          MJPEG (VAAPI) (codec mjpeg)
 V..... mpeg2_vaapi          MPEG-2 (VAAPI) (codec mpeg2video)
 V..... vp8_vaapi            VP8 (VAAPI) (codec vp8)
 V..... vp9_vaapi            VP9 (VAAPI) (codec vp9)


(b). QSV:

 V..... h264_qsv             H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10 (Intel Quick Sync Video acceleration) (codec h264)
 V..... hevc_qsv             HEVC (Intel Quick Sync Video acceleration) (codec hevc)
 V..... mjpeg_qsv            MJPEG (Intel Quick Sync Video acceleration) (codec mjpeg)
 V..... mpeg2_qsv            MPEG-2 video (Intel Quick Sync Video acceleration) (codec mpeg2video)

See the help documentation for each encoder in question:

ffmpeg -hide_banner -h encoder='encoder name'
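
For example, to list the private options of the QSV H.264 encoder:

ffmpeg -hide_banner -h encoder=h264_qsv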

Test the encoders:

Using GNU parallel, we will encode some mp4 files (4k H.264 test samples, 40 minutes each, AAC 6-channel audio) on the ~/src path on the system to VP8 and HEVC respectively using the examples below. Note that I've tuned the encoders to suit my use-cases, and re-scaling to 1080p is enabled. Adjust as necessary (and compensate as needed if ~/bin is not on the system path).

To VP8, launching 10 encode jobs simultaneously:

parallel -j 10 --verbose 'ffmpeg -loglevel debug -threads 4 -hwaccel vaapi -i "{}"  -vaapi_device /dev/dri/renderD129 -c:v vp8_vaapi -loop_filter_level:v 63 -loop_filter_sharpness:v 15 -b:v 4500k -maxrate:v 7500k -vf 'format=nv12,hwupload,scale_vaapi=w=1920:h=1080' -c:a libvorbis -b:a 384k -ac 6 -f webm "{.}.webm"' ::: $(find . -type f -name '*.mp4')

To HEVC Main Profile with GNU Parallel, launching 4 encode jobs simultaneously:

parallel -j 4 --verbose 'ffmpeg -loglevel debug -threads 4 -hwaccel vaapi -i "{}"  -vaapi_device /dev/dri/renderD129 -c:v hevc_vaapi -qp:v 19 -b:v 2100k -maxrate:v 3500k -vf 'format=nv12,hwupload,scale_vaapi=w=1920:h=1080' -c:a libvorbis -b:a 384k -ac 6 -f matroska "{.}.mkv"' ::: $(find . -type f -name '*.mp4')

FFmpeg QSV encoder usage notes:

I provide an example below that demonstrates the use of a complex filter chain with the QSV encoders in place for livestreaming purposes. Adapt as per your needs.

Complex Filter chain usage for variant stream encoding with Intel's QSV encoders:

Take the example snippet below, which takes the safest (and not necessarily the fastest) route, utilizing a hybrid encoder approach (partial hwaccel with significant processor load):

ffmpeg -re -stream_loop -1 -threads n -loglevel debug -filter_complex_threads n \
-init_hw_device qsv=qsv:hw -hwaccel qsv -filter_hw_device qsv \
-i 'udp://$stream_url:$port?fifo_size=9000000' \
-filter_complex "[0:v]hwupload=extra_hw_frames=10,vpp_qsv=deinterlace=2,split=6[s0][s1][s2][s3][s4][s5]; \
[s0]hwupload=extra_hw_frames=10,scale_qsv=1920:1080:format=nv12[v0]; \
[s1]hwupload=extra_hw_frames=10,scale_qsv=1280:720:format=nv12[v1];
[s2]hwupload=extra_hw_frames=10,scale_qsv=960:540:format=nv12[v2];
[s3]hwupload=extra_hw_frames=10,scale_qsv=842:480:format=nv12[v3];
[s4]hwupload=extra_hw_frames=10,scale_qsv=480:360:format=nv12[v4];
[s5]hwupload=extra_hw_frames=10,scale_qsv=426:240:format=nv12[v5]" \
-b:v:0 2250k -c:v h264_qsv -a53cc 1 -rdo 1 -pic_timing_sei 1 -recovery_point_sei 1 -profile high -aud 1 \
-b:v:1 1750k -c:v h264_qsv -a53cc 1 -rdo 1 -pic_timing_sei 1 -recovery_point_sei 1 -profile high -aud 1 \
-b:v:2 1000k -c:v h264_qsv -a53cc 1 -rdo 1 -pic_timing_sei 1 -recovery_point_sei 1 -profile high -aud 1 \
-b:v:3 875k -c:v h264_qsv -a53cc 1 -rdo 1 -pic_timing_sei 1 -recovery_point_sei 1 -profile high -aud 1 \
-b:v:4 750k -c:v h264_qsv -a53cc 1 -rdo 1 -pic_timing_sei 1 -recovery_point_sei 1 -profile high -aud 1 \
-b:v:5 640k -c:v h264_qsv -a53cc 1 -rdo 1 -pic_timing_sei 1 -recovery_point_sei 1 -profile high -aud 1 \
-c:a aac -b:a 128k -ar 48000 -ac 2 \
-flags -global_header -f tee  \
-map "[v0]" -map "[v1]" -map "[v2]" -map "[v3]" -map "[v4]" -map "[v5]" -map 0:a:0 -map 0:a:1 \
"[select=\'v:0,a\':f=mpegts]udp:$stream_url_out:$port_out| \
 [select=\'v:0,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:0,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:1,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:1,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:1,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:2,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:2,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:2,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:3,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:3,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:3,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:4,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:4,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:4,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:5,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:5,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:5,a\':f=mpegts]udp://$stream_url_out:$port_out"

Template breakdown:

The ffmpeg snippet assumes the following:

  1. The UDP ingest stream specifier has one video stream and two separate audio streams, as shown in the explicit mapping above: -map "[v0]" -map "[v1]" -map "[v2]" -map "[v3]" -map "[v4]" -map "[v5]" -map 0:a:0 -map 0:a:1

The mapping is needed because the tee muxer (-f tee) makes no assumptions about the capabilities of the underlying muxer launched beneath the fifo process.

  2. We encode audio only once. Consider audio as a blocking encoder and minimize unnecessary encoder duplication to save on CPU cycles.

  3. We split the incoming stream into six, and in so doing:

(a). Allocate a single thread to each filter complex chain. Thus the value n set for -filter_complex_threads should match the value of split=n.

(b). Allocate the total thread count for FFmpeg to the value specified above. This ensures that each encoder is fed through a single thread only, the optimal value for hardware-accelerated encoding. See the sketch just below these two points.
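
A minimal sketch of this pairing with a smaller split count (file names are hypothetical, audio is dropped for brevity, and QSV device initialization is left to FFmpeg's defaults):

# split=2 in the graph, so both -threads and -filter_complex_threads are set to 2
ffmpeg -threads 2 -filter_complex_threads 2 -i input.mp4 \
-filter_complex "[0:v]split=2[s0][s1];[s0]scale=1280:720[v0];[s1]scale=640:360[v1]" \
-map "[v0]" -an -c:v h264_qsv -b:v 2000k 720p.mp4 \
-map "[v1]" -an -c:v h264_qsv -b:v 800k 360p.mp4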

The following notes about thread allocation in FFmpeg apply doubly so here:

More encoder threads beyond a certain threshold increase latency and carry a higher encoding memory footprint. Quality degradation is more prominent with higher thread counts in constant bitrate modes and in the near-constant bitrate mode called VBV (video buffer verifier), due to increased encode delay. Keyframes need more data than other frame types to avoid pulsing of poor-quality keyframes.

Zero-delay or sliced-thread mode (on supported encoders) adds no delay, but it further worsens multi-threaded quality on the encoders that support it.

It's therefore wise to limit thread counts on encodes where latency matters, as the perceived increase in encoder throughput rarely offsets the drawbacks it brings in the long term.

  4. Through the tee muxer, we then allocate each encoded stream variant to an output, through the select statement in the bracketed clauses above. This allows us to generate far more elementary streams than there are encoders.

  5. On the tuning options passed to the h264_qsv encoder:

(a). The hwupload filter must be appended with the extra_hw_frames=10 option, because the QSV encoder expects a fixed initial pool size.
Conceptually, this should happen automatically, but the problem is that the mfx plugins lack sufficient negotiation with the encoder to know how big the pool should be - if it's feeding into a scaler (such as the complex scale filter chain above), then probably 2 or 3 frames are sufficient, but if it's feeding into an encoder with look-ahead enabled then you might need over 100. As such, it's currently under user control and must be applied to ensure proper encoder initialization.

Without this option, the encoder will fail as shown below:

[AVHWFramesContext @ 0x3e26ec0] QSV requires a fixed frame pool size
[AVHWFramesContext @ 0x3e26ec0] Error creating an internal frame pool
[Parsed_hwupload_1 @ 0x3e26880] Failed to configure output pad on Parsed_hwupload_1
Error reinitializing filters!
Failed to inject frame into filter network: Invalid argument
Error while processing the decoded data for stream #0:0
[AVIOContext @ 0x3a68f80] Statistics: 0 seeks, 0 writeouts
[AVIOContext @ 0x3a6cd40] Statistics: 1022463 bytes read, 0 seeks
Conversion failed!

(b). When initializing the encoder, the hardware device node for libmfx must be initialized as shown:

-init_hw_device qsv=qsv:MFX_IMPL_hw_any -hwaccel qsv -filter_hw_device qsv

This ensures that the hardware accelerator node (qsv) is initialized with the proper device context (-init_hw_device qsv=qsv:hw), that the hardware accelerator implementation is inherited (-hwaccel qsv), and that an appropriate filter device (-filter_hw_device qsv) is available for resource allocation by the hwupload filter, the vpp_qsv post-processor (needed for advanced deinterlacing) and the scale_qsv filter (needed for texture format conversion to nv12, without which the encoder will fail).

If hardware scaling is undesired, the filter chain can be modified from:

[sn]hwupload=extra_hw_frames=10,scale_qsv=W:H:format=nv12[vn]

To:

[sn]hwupload=extra_hw_frames=10,scale_qsv=format=nv12[vn]

Where n is the stream specifier id inherited from the complex filter chain. Note that the texture conversion is mandatory, and cannot be skipped.

The other arguments passed to the encoder are optimal for smooth streaming, enabling automatic detection and use of closed captions, an advanced rate distortion algorithm and sensible bitrates and profile limits per encoder variant.

Special notes concerning performance:

If you're after a full hardware-accelerated transcode pipeline (use with caution as it may not work with all input formats), see the snippet below:

ffmpeg -re -stream_loop -1 -threads n -loglevel debug -filter_complex_threads n \
-c:v h264_qsv -hwaccel qsv \
-i 'udp://$stream_url:$port?fifo_size=9000000' \
-filter_complex "[0:v]vpp_qsv=deinterlace=2,split=6[s0][s1][s2][s3][s4][s5]; \
[s0]scale_qsv=1920:1080:format=nv12[v0]; \
[s1]scale_qsv=1280:720:format=nv12[v1];
[s2]scale_qsv=960:540:format=nv12[v2];
[s3]scale_qsv=842:480:format=nv12[v3];
[s4]scale_qsv=480:360:format=nv12[v4];
[s5]scale_qsv=426:240:format=nv12[v5]" \
-b:v:0 2250k -c:v h264_qsv -a53cc 1 -rdo 1 -pic_timing_sei 1 -recovery_point_sei 1 -profile high -aud 1 \
-b:v:1 1750k -c:v h264_qsv -a53cc 1 -rdo 1 -pic_timing_sei 1 -recovery_point_sei 1 -profile high -aud 1 \
-b:v:2 1000k -c:v h264_qsv -a53cc 1 -rdo 1 -pic_timing_sei 1 -recovery_point_sei 1 -profile high -aud 1 \
-b:v:3 875k -c:v h264_qsv -a53cc 1 -rdo 1 -pic_timing_sei 1 -recovery_point_sei 1 -profile high -aud 1 \
-b:v:4 750k -c:v h264_qsv -a53cc 1 -rdo 1 -pic_timing_sei 1 -recovery_point_sei 1 -profile high -aud 1 \
-b:v:5 640k -c:v h264_qsv -a53cc 1 -rdo 1 -pic_timing_sei 1 -recovery_point_sei 1 -profile high -aud 1 \
-c:a aac -b:a 128k -ar 48000 -ac 2 \
-flags -global_header -f tee  \
-map "[v0]" -map "[v1]" -map "[v2]" -map "[v3]" -map "[v4]" -map "[v5]" -map 0:a:0 -map 0:a:1 \
"[select=\'v:0,a\':f=mpegts]udp:$stream_url_out:$port_out| \
 [select=\'v:0,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:0,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:1,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:1,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:1,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:2,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:2,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:2,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:3,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:3,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:3,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:4,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:4,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:4,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:5,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:5,a\':f=mpegts]udp://$stream_url_out:$port_out| \
 [select=\'v:5,a\':f=mpegts]udp://$stream_url_out:$port_out"

So, what has changed here? For one:

(a). We have selected an appropriate QSV-based decoder based on the video codec type in the ingest feed (h264), assigned as -c:v h264_qsv and the -hwaccel qsv as the hwaccel before declaring the input (-i). If the ingest feed is MPEG-2, select the MPEG decoder (-c:v mpeg2_qsv) instead.

(b). We have dropped the manual H/W init (-init_hw_device qsv=qsv:MFX_IMPL_hw_any -hwaccel qsv -filter_hw_device qsv) and the hwupload video filter.

  1. The open source iMSDK has a 1000-frame encode limit for the HEVC-based encoder, and as such, the HEVC encoder components should only be used for evaluation purposes. Those who require these functions should consult the proprietary licensed SDK. To specify the frame limit in ffmpeg, use the -vframes n option, where n is an integer (see the example after these notes).

  2. The iHD libva driver also provides VAAPI functionality similar to the opensource i965 driver's, with a few discrepancies:

(a). It does not offer encode entry points for the VP8 and VP9 codecs (yet).

(b). As mentioned above, HEVC encoding is for evaluation purposes only and will limit the encode to a mere 1000 frames.
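
As referenced in note 1 above, a minimal example of capping an evaluation HEVC encode at the 1000-frame limit (input and output names are hypothetical):

ffmpeg -i input.mp4 -c:v hevc_qsv -b:v 2000k -vframes 1000 -an output.mp4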

See the VAAPI features enabled with this iHD driver:

vainfo 
libva info: VA-API version 1.2.0
libva info: va_getDriverName() returns 0
libva info: User requested driver 'iHD'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
libva info: Found init function __vaDriverInit_1_2
libva info: va_openDriver() returns 0
vainfo: VA-API version: 1.2 (libva 2.2.1.pre1)
vainfo: Driver version: Intel iHD driver - 2.0.0
vainfo: Supported profile and entrypoints
      VAProfileNone                   :	VAEntrypointVideoProc
      VAProfileNone                   :	VAEntrypointStats
      VAProfileMPEG2Simple            :	VAEntrypointVLD
      VAProfileMPEG2Simple            :	VAEntrypointEncSlice
      VAProfileMPEG2Main              :	VAEntrypointVLD
      VAProfileMPEG2Main              :	VAEntrypointEncSlice
      VAProfileH264Main               :	VAEntrypointVLD
      VAProfileH264Main               :	VAEntrypointEncSlice
      VAProfileH264Main               :	VAEntrypointFEI
      VAProfileH264Main               :	VAEntrypointEncSliceLP
      VAProfileH264High               :	VAEntrypointVLD
      VAProfileH264High               :	VAEntrypointEncSlice
      VAProfileH264High               :	VAEntrypointFEI
      VAProfileH264High               :	VAEntrypointEncSliceLP
      VAProfileVC1Simple              :	VAEntrypointVLD
      VAProfileVC1Main                :	VAEntrypointVLD
      VAProfileVC1Advanced            :	VAEntrypointVLD
      VAProfileJPEGBaseline           :	VAEntrypointVLD
      VAProfileJPEGBaseline           :	VAEntrypointEncPicture
      VAProfileH264ConstrainedBaseline:	VAEntrypointVLD
      VAProfileH264ConstrainedBaseline:	VAEntrypointEncSlice
      VAProfileH264ConstrainedBaseline:	VAEntrypointFEI
      VAProfileH264ConstrainedBaseline:	VAEntrypointEncSliceLP
      VAProfileVP8Version0_3          :	VAEntrypointVLD
      VAProfileHEVCMain               :	VAEntrypointVLD
      VAProfileHEVCMain               :	VAEntrypointEncSlice
      VAProfileHEVCMain               :	VAEntrypointFEI
      VAProfileHEVCMain10             :	VAEntrypointVLD
      VAProfileHEVCMain10             :	VAEntrypointEncSlice
      VAProfileVP9Profile0            :	VAEntrypointVLD
      VAProfileVP9Profile2            :	VAEntrypointVLD

Note: OpenCL enablement in both libx264 and FFmpeg is dependent on the following conditions:

(a). The flag --disable-opencl is removed from libx264's configuration.

(b). The flag --enable-opencl is present in FFmpeg's configure options.

(c). The prerequisite packages for OpenCL development are present:

With OpenCL, the installable client drivers (ICDs) are normally issued with the accelerator's device drivers, namely:

1. The NVIDIA CUDA toolkit (and the device driver) for NVIDIA GPUs.
2. AMD's ROCm for GCN-class AMD hardware.
3. Intel's beignet and the newer Neo compute runtime, as on our platform.

The purpose of the installable client driver model is to allow multiple OpenCL platforms to coexist on the same system. That way, multiple OpenCL accelerators, be they discrete GPUs, FPGAs or integrated GPUs, can all coexist.

However, for linkage purposes, you'll require the ocl-icd package (which we installed earlier), which can be installed by:

sudo apt install ocl-icd-* 

Why ocl-icd? Simple: whereas other ICDs may permit you to link against them directly, doing so is discouraged so as to limit the risk of unexpected runtime behavior. Assume ocl-icd to be the gold link target if your goal is to be as platform-neutral as possible.
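
You can confirm that the ICD loader library is visible to the dynamic linker, for example:

ldconfig -p | grep libOpenCL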

OpenCL in FFmpeg:

OpenCL's enablement in FFmpeg comes in two ways:

(a). Some encoders, such as libx264, if built with OpenCL enablement, can utilize these capabilities for accelerated lookahead functions. The performance impact of this enablement will vary with the GPU on the platform, and on older GPUs it may slow down the encoder. Lower-power platforms such as specific AMD APUs and their SoCs may see modest performance improvements at best, and even on modern, high-performance GPUs, your mileage may vary. Expect no miracles. The reason OpenCL lookahead is available for this library in particular is that the lookahead algorithms are easily parallelized.

For instance, you can combine the -hwaccel auto option, which selects a hardware-accelerated decoder for the encode session, with libx264. Add this parameter with "auto" before the input, and if your x264 is compiled with OpenCL support, you can try adding the -x264opts opencl parameter, for example:

ffmpeg -hwaccel auto -i input -vcodec libx264 -x264opts opencl output

(b). FFmpeg itself can utilize OpenCL with some filters, namely program_opencl and openclsrc, among others, as documented in the filters documentation.

See the sample command below:

ffmpeg -hide_banner -v verbose -init_hw_device opencl=ocl:1.0 -filter_hw_device ocl \
-i "cheeks.mkv" -an -map_metadata -1 \
-sws_flags lanczos+accurate_rnd+full_chroma_int+full_chroma_inp \
-filter_complex "[0:v]yadif=0:0:0,hwupload,unsharp_opencl=lx=3:ly=3:la=0.5:cx=3:cy=3:ca=0.5,hwdownload,setdar=dar=16/9" \
-r 25 -c:v h264_nvenc -preset:v llhq -bf 2 -g 50 -refs 3 -rc:v vbr_hq -rc-lookahead:v 32 -coder:v cabac \
-movflags +faststart -profile:v high -level 4.1 -pixel_format yuv420p -y "crunchy_cheeks.mp4"

List OpenCL platform devices:

ffmpeg -hide_banner -v verbose -init_hw_device list
ffmpeg -hide_banner -v verbose -init_hw_device opencl
ffmpeg -hide_banner -v verbose -init_hw_device opencl:1.0 

For the filter, see:

ffmpeg -hide_banner -v verbose -h filter=unsharp_opencl 

Example:

On the test-bed:

ffmpeg -hide_banner -v verbose -init_hw_device list
Supported hardware device types:
vaapi
qsv
drm
opencl

And based on these platforms:

(a). QSV:

ffmpeg -hide_banner -v verbose -init_hw_device qsv
[AVHWDeviceContext @ 0x559f67501440] Opened VA display via X11 display :1.
[AVHWDeviceContext @ 0x559f67501440] libva: VA-API version 1.2.0
[AVHWDeviceContext @ 0x559f67501440] libva: va_getDriverName() returns 0
[AVHWDeviceContext @ 0x559f67501440] libva: User requested driver 'iHD'
[AVHWDeviceContext @ 0x559f67501440] libva: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
[AVHWDeviceContext @ 0x559f67501440] libva: Found init function __vaDriverInit_1_2
[AVHWDeviceContext @ 0x559f67501440] libva: va_openDriver() returns 0
[AVHWDeviceContext @ 0x559f67501440] Initialised VAAPI connection: version 1.2
[AVHWDeviceContext @ 0x559f67501440] Unknown driver "Intel iHD driver - 2.0.0", assuming standard behaviour.
[AVHWDeviceContext @ 0x559f67501040] Initialize MFX session: API version is 1.27, implementation version is 1.27
[AVHWDeviceContext @ 0x559f67501040] MFX compile/runtime API: 1.27/1.27
Hyper fast Audio and Video encoder

(b). VAAPI:

ffmpeg -hide_banner -v verbose -init_hw_device vaapi
[AVHWDeviceContext @ 0x56362b64d040] Opened VA display via X11 display :1.
[AVHWDeviceContext @ 0x56362b64d040] libva: VA-API version 1.2.0
[AVHWDeviceContext @ 0x56362b64d040] libva: va_getDriverName() returns 0
[AVHWDeviceContext @ 0x56362b64d040] libva: User requested driver 'iHD'
[AVHWDeviceContext @ 0x56362b64d040] libva: Trying to open /usr/lib/x86_64-linux-gnu/dri/iHD_drv_video.so
[AVHWDeviceContext @ 0x56362b64d040] libva: Found init function __vaDriverInit_1_2
[AVHWDeviceContext @ 0x56362b64d040] libva: va_openDriver() returns 0
[AVHWDeviceContext @ 0x56362b64d040] Initialised VAAPI connection: version 1.2
[AVHWDeviceContext @ 0x56362b64d040] Unknown driver "Intel iHD driver - 2.0.0", assuming standard behaviour.

(c). OpenCL:

ffmpeg -hide_banner -v verbose -init_hw_device opencl

[AVHWDeviceContext @ 0x55dee05a2040] 0.0: NVIDIA CUDA / GeForce GTX 1070 with Max-Q Design
[AVHWDeviceContext @ 0x55dee05a2040] 1.0: Intel(R) OpenCL HD Graphics / Intel(R) Gen9 HD Graphics NEO
[AVHWDeviceContext @ 0x55dee05a2040] More than one matching device found.
Device creation failed: -19.
Failed to set value 'opencl' for option 'init_hw_device': No such device
Error parsing global options: No such device

Now, you'll notice a platform init error for OpenCL, and that is because we did not pick up a specific device. When done properly, for both devices, this is the output you should get:

i. First OpenCL device:

ffmpeg -hide_banner -v verbose -init_hw_device opencl:1.0

[AVHWDeviceContext @ 0x562f64b66040] 1.0: Intel(R) OpenCL HD Graphics / Intel(R) Gen9 HD Graphics NEO
Hyper fast Audio and Video encoder

ii. Second OpenCL device:

ffmpeg -hide_banner -v verbose -init_hw_device opencl:0.0
[AVHWDeviceContext @ 0x55e7524fb040] 0.0: NVIDIA CUDA / GeForce GTX 1070 with Max-Q Design

Take note of the syntax used. On a machine with more than one OpenCL platform, the platform the device is on must be selected first, followed by the device index number.

Using the example above, you can see that this machine has two OpenCL platforms, the Intel Neo stack and the NVIDIA CUDA stack. These platforms are opencl:1 and opencl:0 respectively. The devices are opencl:1.0 and opencl:0.0 respectively, where the device index on each platform starts at zero (0).

An OpenCL-based tonemap filter example:

You can use the OpenCL-based tone mapping filter, allowing for HDR (HDR10/HLG) to SDR conversion with tone mapping, as shown in the example below:

ffmpeg -init_hw_device vaapi=va:/dev/dri/renderD128 -init_hw_device \
opencl=ocl@va -hwaccel vaapi -hwaccel_device va -hwaccel_output_format \
vaapi -i INPUT -filter_hw_device ocl -filter_complex \
'[0:v]hwmap,tonemap_opencl=t=bt2020:tonemap=linear:format=p010[x1]; \
[x1]hwmap=derive_device=vaapi:reverse=1' -c:v hevc_vaapi -profile 2 OUTPUT

@Brainiarc7 (Owner) commented Jul 6, 2018:

Notes: Updated with instructions for full H/W-based encoding (less safe, but significantly faster on supported ingest stream codecs).

@matiaspl commented Aug 28, 2018:

The open source driver does not seem to have the 1000 frame limit as far as HEVC goes. I'm wondering - are there any benefits of using VAAPI i965 vs VAAPI iHD vs QSV in terms of encoding performance/quality?

@Brainiarc7 (Owner) commented Aug 29, 2018:

Hello @matiaspl,

Yes, there are multiple advantages in using the opensource i965 VAAPI driver over the iHD (proprietary) driver, mainly:

  1. Support for extra codecs such as VP8, VP9 and unrestricted HEVC encoding.

  2. Better mapping support (for hardware and filter setup). VAAPI with the opensource driver allows one to initialize and map multiple implementations at once (based on platform capabilities), with independent device assignments as needed, on a system with more than one VAAPI implementation, say AMD's Gallium driver (one or more devices) paired with an Intel platform with Intel integrated graphics. See this for more details.

@Brainiarc7 (Owner) commented Aug 29, 2018:

When it comes to performance, with VAAPI, the main limitation is a dependence on so-called "hardware frames" that require uploading and swapping to and from RAM buffers. This is also codec independent; a fully hardware-accelerated VAAPI transcode (where the codec in question can be both encoded and decoded in hardware) will produce excellent results on supported hardware. However, if a portion of the encoding process (such as the decode phase) falls back to a software-based surface conversion (to a VAAPI H/W frame), your performance will be impacted.

This is the case with VAAPI and AMD's gallium implementation in particular where most codecs do not have entry points for decode.

@Brainiarc7 (Owner) commented Aug 29, 2018:

Related, view the comments here by @Dalmat. He used a testing methodology (based on Netflix's VMAF, SSIM and PSNR) to objectively measure VAAPI's encoders on a Kabylake testbed for various codecs. I'll update these results when I'm able, but with his results, you can see what these encoders are capable of (at least when they launched over a year ago).

Do note that a lot of work has gone into these VAAPI encoders in FFmpeg over time, so expect some improvements, at the very least.

For instance, with VAAPI, you'll find support for features such as OpenCL-based tone map filters, allowing for HDR(HDR10/HLG) to SDR conversion with tone-mapping, with the example shown below:

ffmpeg -init_hw_device vaapi=va:/dev/dri/renderD128 -init_hw_device \
opencl=ocl@va -hwaccel vaapi -hwaccel_device va -hwaccel_output_format \
vaapi -i INPUT -filter_hw_device ocl -filter_complex \
'[0:v]hwmap,tonemap_opencl=t=bt2020:tonemap=linear:format=p010[x1]; \
[x1]hwmap=derive_device=vaapi:reverse=1' -c:v hevc_vaapi -profile 2 OUTPUT

@andreasunterhuber commented Oct 16, 2018:

@Brainiarc7 thanks a lot for the excellent document and tutorial. I'm just curious if there is a possibility to use QuickSync and OpenCL in ffmpeg at the same time. Is there any improvement? For example, using QuickSync for video encoding (codec h264_qsv) and OpenCL for unsharping (filter unsharp_opencl). Possible? Good or bad idea?

@Brainiarc7 (Owner) commented Oct 20, 2018:

It is possible.

With OpenCL, you can use either an OpenCL device derived from the VAAPI hardware context (see comment above for an example) or initialize the OpenCL device directly (see gist for details) should the former fail.

@kslr commented Oct 21, 2018:

Is it possible to add a GitHub repository, so other people can participate?

@Brainiarc7 (Owner) commented Oct 23, 2018:

@kslr,

A github repository for what, documentation?
Sure, it can be done. I can create a repository for all FFmpeg-related configuration gists that people can contribute to, etc.

@galli-leo commented Oct 26, 2018:

@Brainiarc7 Where did you get the tonemap_opencl filter from? When I build ffmpeg it isn't included. Is it only on the latest master? (Mine is 4.0.2).

@Brainiarc7 (Owner) commented Oct 26, 2018:

@galli-leo,

Yes, all the documented builds here are from the master branch, not a release.
Also, did you build from source? Note that OpenCL is not enabled by default, not even by package maintainers.

@galli-leo commented Oct 29, 2018:

@Brainiarc7 Thanks, I found the opencl filter in the master branch and merged it into my fork.

I am trying to do everything with nvdec and nvenc, with the following command:

'/root/FFmpeg/build/bin/ffmpeg_g' -init_hw_device 'cuda=cu' -init_hw_device 'opencl=ocl' \
-codec:v hevc_cuvid '-hwaccel:0' 'cuda' \
'-codec:1' 'dca' \
'-i' "/media/movies/Plex/Movies/Harry Potter and the Prisoner of Azkaban (2004)/Harry Potter and the Prisoner of Azkaban (2004).mkv" \
-filter_hw_device ocl '-filter_complex' "[0:0]scale=w=1920:h=1080[a];[a]hwupload,tonemap_opencl=t=bt2020:tonemap=hable:desat=0:format=nv12[b];[b]hwdownload,format=nv12,hwupload_cuda[c]" \
'-map' "[c]" '-metadata:s:0' 'language=eng' \
'-codec:0' 'h264_nvenc' -b:0 '15000k' -bufsize:0 '150000k' \
'-codec:1' 'copy' \
output.mkv \1
'-loglevel' 'info' -threads 1

I am currently getting around 45fps with one cpu core maxed out. (Adjusting threads does not do anything). GPU is only at 20-30%

Do you have any recommendations on getting this to work faster and more efficiently? i.e. would using vaapi help?
I tried using scale_npp instead of scale, but that only works with non hdr pixelformats, which we don't have before the tonemapping.

@Brainiarc7 (Owner) commented Oct 29, 2018:

Hey there,

Could you move the scale video filter after the tone mapping operation? That way, you'll have the non hdr pixel formats AFTER, allowing scale_npp to work.

For performance, tune the h264_nvenc encoder, as explained here.

See this:

'/root/FFmpeg/build/bin/ffmpeg_g' -init_hw_device 'cuda=cu' -init_hw_device 'opencl=ocl' \
-codec:v hevc_cuvid '-hwaccel:0' 'cuda' \
'-codec:1' 'dca' \
'-i' "/media/movies/Plex/Movies/Harry Potter and the Prisoner of Azkaban (2004)/Harry Potter and the Prisoner of Azkaban (2004).mkv" \
-filter_hw_device ocl '-filter_complex' "[0:0]hwupload,tonemap_opencl=t=bt2020:tonemap=hable:desat=0:format=nv12[a];[a]scale_npp=w=1920:h=1080[b];[b]hwdownload,format=nv12,hwupload_cuda[c]" \
'-map' "[c]" '-metadata:s:0' 'language=eng' \
'-codec:0' 'h264_nvenc' -b:0 '15000k' -bufsize:0 '150000k' -preset:0 llhq -profile:0 main -level:0 4.1 -rc:0 ll_2pass_quality -rc-lookahead:0 32 -temporal-aq:0 1 -weighted_pred:0 1 -coder:v cabac \
'-codec:1' 'copy' \
output.mkv \1
'-loglevel' 'info' -threads 1

@galli-leo commented Oct 30, 2018:

@Brainiarc7

Unfortunately, that does not work, as scale_npp is still erroring out. I am guessing because it doesn't know what to do with the OpenCL frame?

When I do the following filter, it does not complain, however, the output is just a green image:

[0:0]hwupload,tonemap_opencl=t=bt2020:tonemap=hable:desat=0:format=nv12,hwdownload,format=nv12[a];[a]hwupload_cuda,scale_npp=w=1920:h=1080[b];[b]hwdownload,format=nv12,hwupload_cuda[c]

@Brainiarc7 (Owner) commented Oct 30, 2018:

@galli-leo,

You might want to report that upstream as a bug. That green color is an anomaly.

Can you share the sample file you're using?

@galli-leo commented Oct 30, 2018:

@Brainiarc7 I tried it using the following sample file:
http://files.hdrsamples.com/downloads/hdr/Life_of_%20Pi_draft_Ultra-HD_HDR.mp4

With the following command:

'/root/FFmpeg/build/bin/ffmpeg' -init_hw_device 'cuda=cu' -init_hw_device 'opencl=ocl' \
-codec:v hevc_cuvid '-hwaccel:0' 'cuda' \
'-codec:1' 'dca' \
'-i' "lifepi.mp4" \
-filter_hw_device ocl '-filter_complex' "[0:0]hwupload,tonemap_opencl=t=bt2020:tonemap=hable:desat=0:format=nv12[a];[a]hwdownload,format=nv12,hwupload_cuda,scale_npp=w=1920:h=1080[c]" \
'-map' "[c]" '-metadata:s:0' 'language=eng' \
'-codec:0' 'h264_nvenc' -b:0 '15000k' -bufsize:0 '150000k' -preset:0 llhq \
'-codec:1' 'copy' \
lifepi_out.mkv \
'-loglevel' 'info' -threads 1

Which results in the following output file: https://galli.me/transcoding/lifepi_out.mkv

I think that the issue is the second format=nv12.

Thanks a lot for helping out! Didn't find much on it elsewhere in the internet :)

@Brainiarc7 (Owner) commented Oct 30, 2018:

You're welcome.

I'll take a look at the samples provided.

@devalexqt commented Nov 3, 2018:

Hi!
What does this mean:
-init_hw_device opencl=ocl@va
Because I get this error:

Device creation failed: -12. Failed to set value 'opencl=ocl@va' for option 'init_hw_device'

@devalexqt commented Nov 4, 2018:

Then I rebooted the PC without a monitor connected (like on a server) and ran the command over ssh, and I get lower fps (conversion speed).
Built with the instructions here: https://gist.github.com/Brainiarc7/4f831867f8e55d35cbcb527e15f9f116

Tested on i7-6700 and i7-8700 CPUs on Ubuntu 18.04 with ffmpeg.

ffmpeg -hwaccel qsv -c:v h264_qsv -i 720p.mp4 -c:v h264_qsv -preset 7 -b:v 6M -vf scale_qsv=1920:1080 -an -y h264.mp4
Monitor connected: 330 fps
Monitor not connected: 250 fps
Then I checked the GPU clock via "sudo intel_gpu_frequency"; it jumps to a max of 1200 MHz.

Any idea why?

@Brainiarc7 (Owner) commented Nov 4, 2018:

Hello @devalexqt,

For your first question, you must build both libdrm and libva before the intel media SDK. This will allow you to access the cl_intel_va_api_media_sharing opencl extension that you can confirm by running clinfo. Thanks for noticing this issue.

The documentation has been updated.

For now, you can still initialize the first OpenCL hwaccel device from the first platform directly (bypassing the extension above) with this template:

ffmpeg -init_hw_device vaapi=va:/dev/dri/renderD128 -init_hw_device \
opencl=opencl:0.1 -hwaccel vaapi -hwaccel_device va -hwaccel_output_format \
vaapi -i INPUT -filter_hw_device opencl -filter_complex \
'[0:v]hwmap,tonemap_opencl=t=bt2020:tonemap=linear:format=p010[x1]; \
[x1]hwmap=derive_device=vaapi:reverse=1' -c:v hevc_vaapi -profile 2 OUTPUT

Also, ensure that FFmpeg is built with the OpenCL enablement flag as shown above.

On the second question: That's expected. DRM node access is faster in X than without.

Workaround: Set the SETCAP capability flag for your FFmpeg binary and retest:

sudo setcap cap_sys_admin+ep /path/to/ffmpeg

Then report back with findings.

@devalexqt commented Nov 5, 2018:

Hi!
SETCAP doesn't help speed it up. (I'm testing ffmpeg as root.)
Now trying to rebuild with cl_intel_va_api_media_sharing.

@devalexqt commented Nov 5, 2018:

Typo in the doc: libdrm does not have ./configure; it must be ./autogen.sh, I think.

No luck!
Then I tried to run this command:
ffmpeg -init_hw_device vaapi=va:/dev/dri/renderD128 -init_hw_device opencl=ocl:0.0 -hwaccel vaapi -hwaccel_device va -hwaccel_output_format vaapi -i 720p.mp4 -filter_hw_device ocl -filter_complex '[0:v]hwmap,tonemap_opencl=t=bt2020:tonemap=linear:format=p010[x1];[x1]hwmap=derive_device=vaapi:reverse=1' -c:v hevc_vaapi -profile 2 -an -y out.mp4
I get error:
Stream mapping:
  Stream #0:0 (h264) -> hwmap
  hwmap -> Stream #0:0 (hevc_vaapi)
Press [q] to stop, [?] for help
[Parsed_hwmap_2 @ 0x556a0a430c80] Failed to created derived device context: -38.
[Parsed_hwmap_2 @ 0x556a0a430c80] Failed to configure output pad on Parsed_hwmap_2
[AVHWFramesContext @ 0x556a0a436540] Failed to release frame command queue: -36.
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0

@devalexqt commented Nov 5, 2018:

Also, after the rebuild, clinfo shows cl_intel_va_api_media_sharing:
Device Extensions: .................., cl_intel_va_api_media_sharing

@devalexqt commented Nov 5, 2018:

After updating the Ubuntu kernel to v4.19, the problem with lower performance without a monitor is gone!
Issue from iMSDK: Intel-Media-SDK/MediaSDK#862

But still no luck with OpenCL :(

@Brainiarc7 (Owner) commented Nov 5, 2018:

Cool, thanks. Let me correct the filter above. Seems like a syntax error.

@Brainiarc7 (Owner) commented Nov 5, 2018:

Hello there,

Can you provide the output of:

ffmpeg -hide_banner -v verbose -init_hw_device opencl

Then we can see the OpenCL platform details for the system you're on.

Now that you have the necessary extension configured and built, try this sample (with corrected paths):

ffmpeg -init_hw_device vaapi=va:/dev/dri/renderD128 -init_hw_device \
opencl=ocl@va -hwaccel vaapi -hwaccel_device va -hwaccel_output_format \
vaapi -i INPUT -filter_hw_device ocl -filter_complex \
'[0:v]hwmap,tonemap_opencl=t=bt2020:tonemap=linear:format=p010[x1]; \
[x1]hwmap=derive_device=vaapi:reverse=1' -c:v hevc_vaapi -profile 2 OUTPUT

And report back.

@devalexqt commented Nov 6, 2018:

ffmpeg -hide_banner -v verbose -init_hw_device opencl
[AVHWDeviceContext @ 0x560d50e77140] 0.0: Intel(R) OpenCL HD Graphics / Intel(R) Gen9 HD Graphics NEO

And after run ffmpeg command:

Device creation failed: -38.
Failed to set value 'opencl=ocl@va' for option 'init_hw_device': Function not implemented

If I replace opencl=ocl@va with opencl=ocl:0.0, then I get this error:

[Parsed_hwmap_2 @ 0x55892681bd80] Failed to created derived device context: -38.
[Parsed_hwmap_2 @ 0x55892681bd80] Failed to configure output pad on Parsed_hwmap_2
[AVHWFramesContext @ 0x558926821680] Failed to release frame command queue: -36.
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented

@Brainiarc7 (Owner) commented Nov 6, 2018:

Ok, paste the exact command you ran to get this result:

Device creation failed: -38.
Failed to set value 'opencl=ocl@va' for option 'init_hw_device': Function not implemented

Concerning this:

[Parsed_hwmap_2 @ 0x55892681bd80] Failed to created derived device context: -38.
[Parsed_hwmap_2 @ 0x55892681bd80] Failed to configure output pad on Parsed_hwmap_2
[AVHWFramesContext @ 0x558926821680] Failed to release frame command queue: -36.
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented

That's because we're using an external device to derive an OpenCL context that should've been sourced from the extension above.

@devalexqt commented Nov 6, 2018:

I ran your command.

@Brainiarc7 (Owner) commented Nov 6, 2018:

Alright, I see.

I'll submit this ticket shortly.

@devalexqt commented Nov 6, 2018:

Ok.
Waiting for a solution...
Anyway, if you have a working version, can you make a comparison test: overlay_opencl vs overlay_qsv, please?
For overlay_qsv I'm using this command (fully hardware pipeline):
=> hw decode (h264) => hw scale to 1080p => hw overlay (watermark) => hw encode (h264)
ffmpeg -hwaccel qsv -c:v h264_qsv -i 720p.mp4 -c:v h264_qsv -preset 7 -filter_complex "scale_qsv=1920:1080[s2];movie=watermark.png,scale=1080*0.75:-1,format=nv12,hwupload=extra_hw_frames=30[watermark];[s2][watermark]overlay_qsv=(main_w-overlay_w)/2:(main_h-overlay_h-10)/1:alpha=150[overlay]" -map [overlay] -map a? -c:a copy -b:v 6M watermark.mp4
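
For reference, a hedged sketch of what the overlay_opencl counterpart might look like, modeled on the FFmpeg overlay_opencl documentation example (file names are the same placeholders as above; because of the hwupload/hwdownload round-trip and the libx264 software encode, this is not a like-for-like comparison with the fully hardware QSV pipeline):

ffmpeg -init_hw_device opencl=ocl -filter_hw_device ocl \
-i 720p.mp4 -i watermark.png \
-filter_complex "[0:v]hwupload[base];[1:v]format=yuva420p,hwupload[wm]; \
[base][wm]overlay_opencl=x=20:y=20,hwdownload,format=yuv420p[out]" \
-map "[out]" -map 0:a? -c:a copy -c:v libx264 -b:v 6M watermark_opencl.mp4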

@ganxiao2008

commented Nov 7, 2018

Hi,
I got errors when building the "compute runtime" project:
/hw_info.h:9:10: fatal error: gtsysinfo.h: No such file or directory
This happens because the media driver installs the igdgmm headers in a directory structure different from upstream gmmlib, adding an igdgmm folder under the include folder.
According to intel/compute-runtime#61 (comment), intel-media-driver git master has allowed use of a dynamic (installed) gmmlib since September 6, 2018, via commit intel/media-driver@63dd4ae.
So in section 2 (Gmmlib), it's better to install gmmlib at the end, so that intel-media-driver does not build and install its own gmmlib files. With that ordering, the compute runtime project builds fine.
Thanks.

@ganxiao2008

commented Nov 7, 2018

@Brainiarc7
ffmpeg complained about a missing libx264:
ffmpeg: error while loading shared libraries: libx264.so.155: cannot open shared object file: No such file or directory
Fixed by adding the library path to /etc/ld.so.conf (see the sketch below).
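
A minimal sketch of that fix, assuming the shared libx264 was installed under ~/ffmpeg_build/lib (the directory and the ffmpeg.conf file name are placeholders; point it at wherever libx264.so.155 actually lives):

echo "$HOME/ffmpeg_build/lib" | sudo tee /etc/ld.so.conf.d/ffmpeg.conf
sudo ldconfig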

But I have trouble with ffmpeg on the VAAPI route.
The original file is a 4K H.264 30 fps demo video; I tried both VAAPI and QSV,
and VAAPI's performance was abnormal while QSV's was good.
Is there something wrong in my commands?

  1. vaapi
ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 \
-i 4K-Chimei-inn-2011-1013_30p.mp4 -vf 'format=nv12,hwupload,scale_vaapi=w=1920:h=1080' -map 0:0 -map 0:1 -threads 4 \
-aspect 16:9 -y -f matroska -acodec copy -b:v 5000k -vcodec h264_vaapi \
output.mkv

Transcoding worked, but performance was very poor, not what you'd expect from hardware acceleration.

ffmpeg version N-92374-gd96ae9d5ea Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.3.0-27ubuntu1~18.04)
  configuration: --pkg-config-flags=--static --prefix=/home/ganxiao/bin --bindir=/home/ganxiao/bin --extra-cflags=-I/home/ganxiao/bin/include --extra-ldflags=-L/home/ganxiao/bin/lib --extra-cflags=-I/opt/intel/mediasdk/include --extra-ldflags=-L/opt/intel/mediasdk/lib --extra-ldflags=-L/opt/intel/mediasdk/plugins --enable-libmfx --enable-vaapi --enable-opencl --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-gpl --cpu=native --enable-libfdk-aac --enable-libx264 --enable-libx265 --extra-libs=-lpthread --enable-nonfree
  libavutil      56. 23.101 / 56. 23.101
  libavcodec     58. 39.100 / 58. 39.100
  libavformat    58. 22.100 / 58. 22.100
  libavdevice    58.  6.100 / 58.  6.100
  libavfilter     7. 43.100 /  7. 43.100
  libswscale      5.  4.100 /  5.  4.100
  libswresample   3.  4.100 /  3.  4.100
  libpostproc    55.  4.100 / 55.  4.100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55d18984be40] st: 0 edit list: 1 Missing key frame while searching for timestamp: 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55d18984be40] st: 0 edit list 1 Cannot find an index entry before timestamp: 0.
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '4K-Chimei-inn-2011-1013_30p.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp42mp41
    creation_time   : 2011-10-13T07:30:26.000000Z
  Duration: 00:03:41.99, start: 0.000000, bitrate: 32320 kb/s
    Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 3840x2160 [SAR 1:1 DAR 16:9], 31998 kb/s, 29.97 fps, 29.97 tbr, 29970 tbn, 59.94 tbc (default)
    Metadata:
      creation_time   : 2011-10-13T07:30:26.000000Z
      handler_name    : Mainconcept MP4 Video Media Handler
      encoder         : AVC Coding
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 317 kb/s (default)
    Metadata:
      creation_time   : 2011-10-13T07:30:26.000000Z
      handler_name    : Mainconcept MP4 Sound Media Handler
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_vaapi))
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
Output #0, matroska, to 'output.mkv':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp42mp41
    encoder         : Lavf58.22.100
    Stream #0:0(eng): Video: h264 (h264_vaapi) (High) (H264 / 0x34363248), vaapi_vld, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 5000 kb/s, 29.97 fps, 1k tbn, 29.97 tbc (default)
    Metadata:
      creation_time   : 2011-10-13T07:30:26.000000Z
      handler_name    : Mainconcept MP4 Video Media Handler
      encoder         : Lavc58.39.100 h264_vaapi
    Stream #0:1(eng): Audio: aac (LC) ([255][0][0][0] / 0x00FF), 48000 Hz, stereo, fltp, 317 kb/s (default)
    Metadata:
      creation_time   : 2011-10-13T07:30:26.000000Z
      handler_name    : Mainconcept MP4 Sound Media Handler
frame=   98 fps=2.6 q=-0.0 size=       1kB time=00:00:03.98 bitrate=   2.3kbits/s speed=0.106x 
  2. qsv
ffmpeg -c:v h264_qsv -hwaccel qsv \
-i 4K-Chimei-inn-2011-1013_30p.mp4 -vf 'hwupload,scale_qsv=1920:1080:format=nv12' -map 0:0 -map 0:1 -threads 4 \
-aspect 16:9 -y -f matroska -acodec copy -b:v 5000k -vcodec h264_qsv \
output.mkv

transcoding worked with decent performance

ffmpeg version N-92374-gd96ae9d5ea Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.3.0-27ubuntu1~18.04)
  configuration: --pkg-config-flags=--static --prefix=/home/ganxiao/bin --bindir=/home/ganxiao/bin --extra-cflags=-I/home/ganxiao/bin/include --extra-ldflags=-L/home/ganxiao/bin/lib --extra-cflags=-I/opt/intel/mediasdk/include --extra-ldflags=-L/opt/intel/mediasdk/lib --extra-ldflags=-L/opt/intel/mediasdk/plugins --enable-libmfx --enable-vaapi --enable-opencl --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-gpl --cpu=native --enable-libfdk-aac --enable-libx264 --enable-libx265 --extra-libs=-lpthread --enable-nonfree
  libavutil      56. 23.101 / 56. 23.101
  libavcodec     58. 39.100 / 58. 39.100
  libavformat    58. 22.100 / 58. 22.100
  libavdevice    58.  6.100 / 58.  6.100
  libavfilter     7. 43.100 /  7. 43.100
  libswscale      5.  4.100 /  5.  4.100
  libswresample   3.  4.100 /  3.  4.100
  libpostproc    55.  4.100 / 55.  4.100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55cb2b5855c0] st: 0 edit list: 1 Missing key frame while searching for timestamp: 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55cb2b5855c0] st: 0 edit list 1 Cannot find an index entry before timestamp: 0.
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '4K-Chimei-inn-2011-1013_30p.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp42mp41
    creation_time   : 2011-10-13T07:30:26.000000Z
  Duration: 00:03:41.99, start: 0.000000, bitrate: 32320 kb/s
    Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 3840x2160 [SAR 1:1 DAR 16:9], 31998 kb/s, 29.97 fps, 29.97 tbr, 29970 tbn, 59.94 tbc (default)
    Metadata:
      creation_time   : 2011-10-13T07:30:26.000000Z
      handler_name    : Mainconcept MP4 Video Media Handler
      encoder         : AVC Coding
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 317 kb/s (default)
    Metadata:
      creation_time   : 2011-10-13T07:30:26.000000Z
      handler_name    : Mainconcept MP4 Sound Media Handler
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (h264_qsv) -> h264 (h264_qsv))
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
Output #0, matroska, to 'output.mkv':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp42mp41
    encoder         : Lavf58.22.100
    Stream #0:0(eng): Video: h264 (h264_qsv) (H264 / 0x34363248), qsv, 1920x1080 [SAR 1:1 DAR 16:9], q=-1--1, 5000 kb/s, 29.97 fps, 1k tbn, 29.97 tbc (default)
    Metadata:
      creation_time   : 2011-10-13T07:30:26.000000Z
      handler_name    : Mainconcept MP4 Video Media Handler
      encoder         : Lavc58.39.100 h264_qsv
    Side data:
      cpb: bitrate max/min/avg: 0/0/5000000 buffer size: 0 vbv_delay: -1
    Stream #0:1(eng): Audio: aac (LC) ([255][0][0][0] / 0x00FF), 48000 Hz, stereo, fltp, 317 kb/s (default)
    Metadata:
      creation_time   : 2011-10-13T07:30:26.000000Z
      handler_name    : Mainconcept MP4 Sound Media Handler
frame=  507 fps= 72 q=29.0 Lsize=   10714kB time=00:00:18.04 bitrate=4863.3kbits/s speed=2.55x   
@devalexqt

commented Nov 7, 2018

For better performance you need to avoid using "hwupload" or "hwdownload", because they copy raw frame data between CPU and GPU RAM, which is slow and becomes the bottleneck.
The pipeline must stay entirely in hardware for more fps:
hw decode => hw filtering => hw encode
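
For instance, a minimal sketch of a fully hardware VAAPI transcode of the 4K clip above to 1080p, assuming /dev/dri/renderD128 is the iGPU render node (file names are placeholders):

ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -vaapi_device /dev/dri/renderD128 \
-i input.mp4 -vf 'scale_vaapi=w=1920:h=1080' \
-c:v h264_vaapi -b:v 5M -c:a copy output.mkv

Decoded frames stay in VAAPI surfaces all the way into the encoder, so no hwupload/hwdownload round-trip is needed.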

@Brainiarc7

Owner Author

commented Nov 8, 2018

Thanks for the analysis and recommendations, @devalexqt.
I will be setting up a testbed for QSV and VAAPI h/w accel testing in a day or two. For now, I'll add the step for installing gmmlib after build step, as you've recommended.

@ganxiao2008

commented Nov 8, 2018

For better performance you need to avoid using "hwupload" or "hwdownload", because they copy raw frame data between CPU and GPU RAM, which is slow and becomes the bottleneck.
The pipeline must stay entirely in hardware for more fps:
hw decode => hw filtering => hw encode

Thanks for your analysis, @devalexqt.
But an error occurred after hwupload was removed:

ganxiao@J3455:~/Videos/4K$ ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -i 4K-Chimei-inn-2011-1013_30p.mp4 -vf 'format=nv12,scale_vaapi=w=1920:h=1080' -map 0:0 -map 0:1 -threads 4 -aspect 16:9 -y -f matroska -acodec copy -b:v 5000k -vcodec h264_vaapi output.mkv
ffmpeg version N-92374-gd96ae9d5ea Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.3.0-27ubuntu1~18.04)
  configuration: --pkg-config-flags=--static --prefix=/home/ganxiao/bin --bindir=/home/ganxiao/bin --extra-cflags=-I/home/ganxiao/bin/include --extra-ldflags=-L/home/ganxiao/bin/lib --extra-cflags=-I/opt/intel/mediasdk/include --extra-ldflags=-L/opt/intel/mediasdk/lib --extra-ldflags=-L/opt/intel/mediasdk/plugins --enable-libmfx --enable-vaapi --enable-opencl --disable-debug --enable-libvorbis --enable-libvpx --enable-libdrm --enable-gpl --cpu=native --enable-libfdk-aac --enable-libx264 --enable-libx265 --extra-libs=-lpthread --enable-nonfree
  libavutil      56. 23.101 / 56. 23.101
  libavcodec     58. 39.100 / 58. 39.100
  libavformat    58. 22.100 / 58. 22.100
  libavdevice    58.  6.100 / 58.  6.100
  libavfilter     7. 43.100 /  7. 43.100
  libswscale      5.  4.100 /  5.  4.100
  libswresample   3.  4.100 /  3.  4.100
  libpostproc    55.  4.100 / 55.  4.100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x556bbf6cae40] st: 0 edit list: 1 Missing key frame while searching for timestamp: 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x556bbf6cae40] st: 0 edit list 1 Cannot find an index entry before timestamp: 0.
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '4K-Chimei-inn-2011-1013_30p.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp42mp41
    creation_time   : 2011-10-13T07:30:26.000000Z
  Duration: 00:03:41.99, start: 0.000000, bitrate: 32320 kb/s
    Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 3840x2160 [SAR 1:1 DAR 16:9], 31998 kb/s, 29.97 fps, 29.97 tbr, 29970 tbn, 59.94 tbc (default)
    Metadata:
      creation_time   : 2011-10-13T07:30:26.000000Z
      handler_name    : Mainconcept MP4 Video Media Handler
      encoder         : AVC Coding
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 317 kb/s (default)
    Metadata:
      creation_time   : 2011-10-13T07:30:26.000000Z
      handler_name    : Mainconcept MP4 Sound Media Handler
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (h264_vaapi))
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
Impossible to convert between the formats supported by the filter 'Parsed_format_0' and the filter 'auto_scaler_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
Conversion failed!

And the working speed before was merely 0.106x, which was too slow even accounting for the raw frame data copy.

@devalexqt

commented Nov 8, 2018

It's impossible to mix hw and sw filters without hwupload/hwdownload! For a full hw transcode you must use only hw filters! For your command, remove format=nv12 and -aspect 16:9, and set -hwaccel_output_format vaapi before the input file option.

To list all hw filters, run these commands:

for vaapi:
ffmpeg -filters|grep vaapi
... deinterlace_vaapi V->V Deinterlacing of VAAPI surfaces
... denoise_vaapi V->V VAAPI VPP for de-noise
... procamp_vaapi V->V ProcAmp (color balance) adjustments for hue, saturation, brightness, contrast
... scale_vaapi V->V Scale to/from VAAPI surfaces.
... sharpness_vaapi V->V VAAPI VPP for sharpness

for qsv:
ffmpeg -filters|grep qsv
... deinterlace_qsv V->V QuickSync video deinterlacing
... overlay_qsv VV->V Quick Sync Video overlay.
... scale_qsv V->V QuickSync video scaling and format conversion
... vpp_qsv V->V Quick Sync Video VPP.

for opencl:
ffmpeg -filters|grep opencl

... avgblur_opencl V->V Apply average blur filter
... boxblur_opencl V->V Apply boxblur filter to input video
... convolution_opencl V->V Apply convolution mask to input video
... dilation_opencl V->V Apply dilation effect
... erosion_opencl V->V Apply erosion effect
... overlay_opencl VV->V Overlay one video on top of another
... prewitt_opencl V->V Apply prewitt operator
... program_opencl |->V Filter video using an OpenCL program
... roberts_opencl V->V Apply roberts operator
... sobel_opencl V->V Apply sobel operator
... tonemap_opencl V->V perform HDR to SDR conversion with tonemapping
... unsharp_opencl V->V Apply unsharp mask to input video
... openclsrc |->V Generate video using an OpenCL program

Also, to list the available filter params, type:
ffmpeg -h filter=scale_vaapi
scale_vaapi AVOptions:
w ..FV..... Output video width (default "iw")
h ..FV..... Output video height (default "ih")
format ..FV..... Output video format (software format of hardware frames)

For encoder options:
ffmpeg -h encoder=h264_vaapi
And for decoder options:
ffmpeg -h decoder=h264_qsv

So scale_vaapi has a hardware format conversion option, which in your case means:
-vf scale_vaapi=w=1920:h=1080:format=nv12

@devalexqt

commented Nov 8, 2018

Thanks for the analysis and recommendations, @devalexqt.
I will be setting up a testbed for QSV and VAAPI h/w accel testing in a day or two. For now, I'll add the step for installing gmmlib after build step, as you've recommended.

Thanks.

@ganxiao2008

commented Nov 9, 2018

(quoting @devalexqt's filter-listing reply above in full)

Thanks very much for these details, @devalexqt.
I fixed 2 options in my commands.

  1. Option -hwaccel_output_format vaapi is a must for transcoding, as you said; otherwise the speed drops down to ~0.1x.
    with -hwaccel_output_format
    frame= 233 fps= 76 q=-0.0 Lsize= 4583kB time=00:00:08.85 bitrate=4240.5kbits/s speed= 2.9x
    without -hwaccel_output_format
    frame= 8 fps=2.8 q=-0.0 size= 1kB time=00:00:00.32 bitrate= 28.3kbits/s speed=0.111x

  2. Option format=nv12 must be explicitly set as format=nv12|vaapi, otherwise the following errors are generated:

Impossible to convert between the formats supported by the filter 'graph 0 input from stream 0:0' and the filter 'auto_scaler_0'
Error reinitializing filters!
Failed to inject frame into filter network: Function not implemented
Error while processing the decoded data for stream #0:0
  3. Options hwupload and -aspect 16:9 should have a negative impact on transcode performance, but I didn't see any noticeable change in my case.
  4. Here is the final command I use:
ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -vaapi_device /dev/dri/renderD128 \
-i 4K-Chimei-inn-2011-1013_30p.mp4 -vf "format=nv12|vaapi,hwupload,scale_vaapi=w=1920:h=1080" \
-aspect 16:9 -y -f matroska -acodec copy -b:v 5000k -vcodec h264_vaapi \
output.mkv
@devalexqt

commented Nov 13, 2018

Any news about testing?

@Brainiarc7

Owner Author

commented Nov 16, 2018

Hey there,

No, not yet. Been a bit busy with some embedded development work. Should resume this in about a week or so, sorry for the delay.

@devalexqt

commented Nov 16, 2018

OK, I'll wait.

@Brainiarc7

Owner Author

commented Nov 22, 2018

I'll resume testing tonight.

@devalexqt

commented Nov 23, 2018

What are the results?

@Brainiarc7

Owner Author

commented Nov 23, 2018

I'm running into a few installation issues with the Intel OpenCL runtime (neo), as reported above.
I have yet to look into the OpenCL hardware derivation as documented upstream.

@devalexqt

commented Nov 24, 2018

ok.

@pikassogod

commented Dec 13, 2018

Hello, is there a way to force exact picture dimensions on the output?
That is, I have this source:
Stream #0:0[0xd3], 252, 1/90000: Video: h264 (Main), 4 reference frames ([27][0][0][0] / 0x001B), yuv420p(tv, bt709, left), 1920x1080 (1920x1088) [SAR 1:1 DAR 16:9], 1/50, 25 fps, 50 tbr, 90k tbn, 50 tbc
After encoding it becomes the following, but I need exact dimensions of 1280x720:
Stream #0:0[0xd3], 224, 1/90000: Video: h264 (Main), 2 reference frames ([27][0][0][0] / 0x001B), yuv420p(left), 1280x720 (1280x736) [SAR 1:1 DAR 16:9], 1/50, 25 fps, 50 tbr, 90k tbn, 50 tbc

@devalexqt

commented Dec 13, 2018

Try with "-aspect" option or resize: "-s 1280x-1"
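
For reference, a minimal hedged sketch of pinning the hardware scaler to exactly 1280x720 in a full-VAAPI pipeline (file names and the render node are placeholders; note that the coded size shown in parentheses, e.g. 1280x736, comes from the encoder's internal alignment and may still differ from the display size):

ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -vaapi_device /dev/dri/renderD128 \
-i input.ts -vf 'scale_vaapi=w=1280:h=720' -c:v h264_vaapi -b:v 4M output.ts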

@JoshuaDoes

commented Dec 18, 2018

Hi,

So I was really excited to get into testing this as I've been needing an FFmpeg build with support for Intel QSV for a while. I'm running Linux Mint 19.1 (based on Ubuntu 18.04) and have an Intel i7-6500U in my laptop.

I've managed to get as far as successfully compiling FFmpeg, however I've encountered one small issue:

joshuadoes@JoshuaX:~/ffmpeg_sources/FFmpeg$ ffmpeg
ffmpeg: error while loading shared libraries: libmfx.so.1: cannot open shared object file: No such file or directory

I'm unsure as to how to go about fixing this issue personally, as everything before finally testing FFmpeg was a success with no errors reported. Any help would be much appreciated.

Sidenote: Here's the listing of /opt/intel/mediasdk/lib to show that the library does indeed exist.

joshuadoes@JoshuaX:~/ffmpeg_sources/FFmpeg$ ls -la /opt/intel/mediasdk/lib
total 40044
drwxr-xr-x 4 root root     4096 Dec 18 00:34 .
drwxr-xr-x 5 root root     4096 Dec 18 00:34 ..
lrwxrwxrwx 1 root root       15 Dec 18 00:34 libmfxhw64.so -> libmfxhw64.so.1
lrwxrwxrwx 1 root root       18 Dec 18 00:34 libmfxhw64.so.1 -> libmfxhw64.so.1.28
-rw-r--r-- 1 root root 40517248 Dec 18 00:33 libmfxhw64.so.1.28
lrwxrwxrwx 1 root root       11 Dec 18 00:34 libmfx.so -> libmfx.so.1
lrwxrwxrwx 1 root root       14 Dec 18 00:34 libmfx.so.1 -> libmfx.so.1.28
-rw-r--r-- 1 root root   470776 Dec 18 00:29 libmfx.so.1.28
drwxr-xr-x 2 root root     4096 Dec 18 00:34 mfx
drwxr-xr-x 2 root root     4096 Dec 18 00:34 pkgconfig

EDIT: Seems I accidentally skipped the /etc/ld.so.conf.d/imsdk.conf line when comparing my BASH history with the steps; I resolved that part easily. But now I have another missing library issue:

joshuadoes@JoshuaX:~/ffmpeg_sources/x264$ ffmpeg
ffmpeg: error while loading shared libraries: libx264.so.155: cannot open shared object file: No such file or directory

It's looking for 155 but it would appear that I only have 152 installed. Would it be safe to make a symlink?

@devalexqt

commented Dec 19, 2018

Try building and installing the x264 library from source.
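
A minimal sketch of that, assuming a shared build installed to /usr/local (the clone URL and prefix are assumptions, not the gist's exact steps; adjust to taste):

cd ~/ffmpeg_sources
git clone https://code.videolan.org/videolan/x264.git
cd x264
./configure --prefix=/usr/local --enable-shared --enable-pic
time make -j$(nproc)
sudo make -j$(nproc) install
sudo ldconfig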

@Brainiarc7

Owner Author

commented Dec 20, 2018

Edited: See the comment above/below.

@Brainiarc7

Owner Author

commented Dec 20, 2018

@JoshuaDoes,

The updated documentation above now builds x264 as a static package. You will not need the workaround previously posted above.
Sorry for the inconvenience caused.

@galli-leo

commented Dec 31, 2018

@Brainiarc7

Did you have time to look at my issue?

@Brainiarc7

Owner Author

commented Dec 31, 2018

@galli-leo,

Yes. I'll update this post with my findings in a few hours.

Warm regards,

Dennis.

@Brainiarc7

Owner Author

commented Jan 12, 2019

@galli-leo,

I have a solution to the problem. However, I need to confirm this with you before posting an update.
Is there an email address I can reach you with? You can email me at the address shown on my profile.

@Brainiarc7

Owner Author

commented Jan 12, 2019

@galli-leo,

The solution: Change the scaler from scale_npp to scale_cuda, as shown in the example below.

A few differences: I had to initialize my OpenCL device(s) differently as I'm on an Optimus laptop, so change that part before you run:

ffmpeg -init_hw_device cuda=cu -init_hw_device opencl=ocl:0.0 \
-hwaccel cuda \
-codec:1 dca \
-i life0.mp4 \
-filter_hw_device ocl -filter_complex "[0:0]hwupload,tonemap_opencl=t=bt2020:tonemap=hable:desat=0:format=nv12[a];[a]hwdownload,format=nv12,hwupload_cuda,scale_cuda=w=1920:h=1080[c]" \
-map "[c]" -metadata:s:0 language=eng \
-codec:0 h264_nvenc -b:0 15000k -bufsize:0 150000k -preset:0 llhq \
-codec:1 copy \
-f matroska -y life0.mkv 

My video, when produced with that scale_cuda filter, has no green coloring or banding. Confirm that's the case on your end.

@ekzkgz

commented Jan 25, 2019

On running:
cmake -DIGC_OPTION__OUTPUT_DIR=../igc-install/Release ../igc/IGC
I get this error:

-- Trigger common clang compilation from /home/admusr/intel-compute-runtime/workspace/common_clang to /home/admusr/intel-compute-runtime/workspace/igc/igc-install/Release/clang/linux-ubuntu64
CMake Error at /home/admusr/intel-compute-runtime/workspace/common_clang/CMakeLists.txt:53 (include):
  include could not find load file:

    AddLLVM


CMake Error at /home/admusr/intel-compute-runtime/workspace/common_clang/CMakeLists.txt:54 (include):
  include could not find load file:

    TableGen


-- Found Git: /usr/bin/git (found version "2.17.1")
-- No patches in /home/admusr/intel-compute-runtime/workspace/common_clang/patches/clang
-- No patches in /home/admusr/intel-compute-runtime/workspace/common_clang/patches/spirv
CMake Error at /home/admusr/intel-compute-runtime/workspace/build_igc/llvm/src/cmake/modules/LLVM-Config.cmake:31 (list):
  list sub-command REMOVE_ITEM requires two or more arguments.
Call Stack (most recent call first):
  /home/admusr/intel-compute-runtime/workspace/build_igc/llvm/src/cmake/modules/LLVM-Config.cmake:253 (is_llvm_target_library)
  /home/admusr/intel-compute-runtime/workspace/build_igc/llvm/src/cmake/modules/AddLLVM.cmake:544 (llvm_map_components_to_libnames)
  /home/admusr/intel-compute-runtime/workspace/build_igc/llvm/src/cmake/modules/AddLLVM.cmake:620 (llvm_add_library)
  /home/admusr/intel-compute-runtime/workspace/common_clang/CMakeLists.txt:231 (add_llvm_library)


-- Configuring Intel Gen Assembler (IGA) Component
--  - GED_BRANCH:           GED_external
--  - CMAKE_CXX_COMPILER:   /usr/bin/c++
-- Found BISON: /usr/bin/bison (found version "3.0.4")
-- Found FLEX: /usr/bin/flex (found version "2.6.4")
[check-igc] LIT tests disabled. Missing igc_opt target.
-- Configuring incomplete, errors occurred!
See also "/home/admusr/intel-compute-runtime/workspace/build_igc/CMakeFiles/CMakeOutput.log".
See also "/home/admusr/intel-compute-runtime/workspace/build_igc/CMakeFiles/CMakeError.log".
@galli-leo

commented Jan 26, 2019

@Brainiarc7 Confirmed to be working with the 4.1 tag! Thanks a lot (and sorry for the slow response). Do you think it would be possible to get hwmap working somehow? This would hopefully speed up the process even more. As I currently understand it, the frame still needs to go to the CPU to be converted from OpenCL to CUDA, right?

@andersc

commented Jan 30, 2019

I get the same error on a freshly installed 18.04.1 machine. Any solution?

(quoting @ekzkgz's cmake ../igc/IGC error output above in full)
@Brainiarc7

Owner Author

commented Jan 31, 2019

@andersc and @ekzkgz,

I've updated the gist with a fix for the error you're encountering.

Part of the Intel Graphics Compiler build step is fetching opencl-clang, which should be set to the branch ocl-open-40:

git clone https://github.com/intel/llvm-patches llvm_patches
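
If it helps, a hedged sketch of what pinning opencl-clang to that branch might look like, assuming the compute-runtime workspace layout used in the IGC build steps (the common_clang directory name and the opencl-clang repository URL are assumptions, not the gist's exact commands):

cd ~/intel-compute-runtime/workspace
git clone -b ocl-open-40 https://github.com/intel/opencl-clang common_clang
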
@Brainiarc7

Owner Author

commented Jan 31, 2019

@galli-leo,

I'll take a look at hwmap and document it here.
Thanks for the suggestion!

@Brainiarc7

Owner Author

commented Mar 3, 2019

Note: Purge clang-4.0 and remove it from the system path.

@janosch1337

commented Mar 12, 2019

hey,

cmake ../igc/IGC

-->

-- Precompiled common clang bundle /root/intel-compute-runtime/workspace/igc/IGC/../Clang/Prebuilt/linux-ubuntu/Release/64/clang.7z does not exist. Try to compile from sources or use system common clang.
Common clang build-in-tree

What is the problem?

@Brainiarc7

Owner Author

commented Mar 13, 2019

@janosch1337.

You must've skipped one or more of the git clone steps.
Specifically:

git clone -b release_70 https://github.com/llvm-mirror/clang llvm_source/tools/clang

Check your git cloning steps carefully.

@mitchins

commented May 3, 2019

I found that I had to install ffmpeg into /opt/ffmpeg such that there would be /opt/ffmpeg/lib/libx264.so
It seemed weird to have to add ~/ffmpeg_build/lib/ to the ld cache.
If I didn't do this, it complained about the missing libx264.so.157.

@Razgrizz

commented May 24, 2019

Hi !

I'm on Ubuntu 18.04 and I got this error:

BiFModule/CMakeFiles/BiFModuleOcl.dir/build.make:2265: recipe for target 'Release/bif/IBiF_PreRelease_int.bc_IBIF_PreRelease_Impl__cl__0.bc.tmp' failed
make[2]: *** [Release/bif/IBiF_PreRelease_int.bc_IBIF_PreRelease_Impl__cl__0.bc.tmp] Error 1
make[2]: Leaving directory '/home/technique/intel-compute-runtime/workspace/build_igc'
CMakeFiles/Makefile2:24769: recipe for target 'BiFModule/CMakeFiles/BiFModuleOcl.dir/all' failed
make[1]: *** [BiFModule/CMakeFiles/BiFModuleOcl.dir/all] Error 2
make[1]: Leaving directory '/home/technique/intel-compute-runtime/workspace/build_igc'
Makefile:151: recipe for target 'all' failed
make: *** [all] Error 2

I truncated the error output because it is too long.

This happened when building IGC:

cd ~/intel-compute-runtime/workspace/build_igc
cmake ../igc/IGC
time make -j1 VERBOSE=1 # at this step

Can you help me?

@Brainiarc7

Owner Author

commented May 27, 2019

Hmm, @Razgrizz,

Use all threads where possible:

cd ~/intel-compute-runtime/workspace/build_igc
cmake ../igc/IGC
time make -j$(nproc) VERBOSE=1

It was due to a bug that's fixed upstream.
