@nisseknudsen
Last active September 11, 2025 20:13
Appendix: Benchmark Reproduction Commands - Why Offloading Video Decode to Integrated GPUs Matters for Edge AI

1. Raw Decode (full frame rate, full resolution)

Software (CPU only):

docker run linuxserver/ffmpeg:7.1.1 \
  -rtsp_transport tcp -timeout 5000000 \
  -allowed_media_types video \
  -flags low_delay \
  -fflags +discardcorrupt \
  -an \
  -t 30 \
  -i "rtsp://<your_camera_stream>" \
  -pix_fmt yuv420p \
  -map 0:v:0 -f null - -benchmark -stats

Hardware-accelerated (iGPU QuickSync):

docker run --device /dev/dri/renderD128 linuxserver/ffmpeg:7.1.1 \
  -hwaccel qsv \
  -hwaccel_output_format qsv \
  -rtsp_transport tcp -timeout 5000000 \
  -allowed_media_types video \
  -flags low_delay \
  -fflags +discardcorrupt \
  -an \
  -t 30 \
  -i "rtsp://<your_camera_stream>" \
  -vf "hwdownload,format=nv12" -pix_fmt nv12 \
  -map 0:v:0 -f null - -benchmark -stats

2. Subsampled Decode (2 FPS frame drop)

Software:

docker run linuxserver/ffmpeg:7.1.1 \
  -rtsp_transport tcp -timeout 5000000 \
  -allowed_media_types video \
  -flags low_delay \
  -fflags +discardcorrupt \
  -an \
  -t 30 \
  -i "rtsp://<your_camera_stream>" \
  -vf "fps=2" -pix_fmt yuv420p \
  -map 0:v:0 -f null - -benchmark -stats

Hardware-accelerated:

docker run --device /dev/dri/renderD128 linuxserver/ffmpeg:7.1.1 \
  -hwaccel qsv \
  -hwaccel_output_format qsv \
  -rtsp_transport tcp -timeout 5000000 \
  -allowed_media_types video \
  -flags low_delay \
  -fflags +discardcorrupt \
  -an \
  -t 30 \
  -i "rtsp://<your_camera_stream>" \
  -vf "vpp_qsv=framerate=2:format=nv12,hwdownload,format=nv12" -pix_fmt nv12 \
  -map 0:v:0 -f null - -benchmark -stats

3. Rescaled Decode (4K→960×540 spatial resize)

Software:

docker run linuxserver/ffmpeg:7.1.1 \
  -rtsp_transport tcp -timeout 5000000 \
  -allowed_media_types video \
  -flags low_delay \
  -fflags +discardcorrupt \
  -an \
  -t 30 \
  -i "rtsp://<your_camera_stream>" \
  -vf "scale=960:540" -pix_fmt yuv420p \
  -map 0:v:0 -f null - -benchmark -stats

Hardware-accelerated:

docker run --device /dev/dri/renderD128 linuxserver/ffmpeg:7.1.1 \
  -hwaccel qsv \
  -hwaccel_output_format qsv \
  -rtsp_transport tcp -timeout 5000000 \
  -allowed_media_types video \
  -flags low_delay \
  -fflags +discardcorrupt \
  -an \
  -t 30 \
  -i "rtsp://<your_camera_stream>" \
  -vf "vpp_qsv=w=960:h=540:format=nv12,hwdownload,format=nv12" -pix_fmt nv12 \
  -map 0:v:0 -f null - -benchmark -stats

4. Rescaled + Subsampled (2 FPS + 960×540 resize)

Software:

docker run linuxserver/ffmpeg:7.1.1 \
  -rtsp_transport tcp -timeout 5000000 \
  -allowed_media_types video \
  -flags low_delay \
  -fflags +discardcorrupt \
  -an \
  -t 30 \
  -i "rtsp://<your_camera_stream>" \
  -vf "fps=2,scale=960:540" -pix_fmt yuv420p \
  -map 0:v:0 -f null - -benchmark -stats

Hardware-accelerated:

docker run --device /dev/dri/renderD128 linuxserver/ffmpeg:7.1.1 \
  -hwaccel qsv \
  -hwaccel_output_format qsv \
  -rtsp_transport tcp -timeout 5000000 \
  -allowed_media_types video \
  -flags low_delay \
  -fflags +discardcorrupt \
  -an \
  -t 30 \
  -i "rtsp://<your_camera_stream>" \
  -vf "vpp_qsv=framerate=2:w=960:h=540:format=nv12,hwdownload,format=nv12" -pix_fmt nv12 \
  -map 0:v:0 -f null - -benchmark -stats

Key Command Details

Hardware acceleration flags:

  • --device /dev/dri/renderD128: Exposes the Intel iGPU's DRM render node to the Docker container
  • -hwaccel qsv: Enables Intel QuickSync hardware decoding
  • -hwaccel_output_format qsv: Keeps decoded frames in GPU memory
  • vpp_qsv: Hardware video processing (scaling, frame rate conversion)
  • hwdownload,format=nv12: Downloads frames from GPU memory to CPU memory for measurement
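
Before running the hardware-accelerated commands, it can help to confirm that the render node exists on the host and is accessible to the current user. The following is a minimal sketch, not part of the original benchmark; the function name check_render_node is hypothetical, and the default path assumes the iGPU is the first render node, which may not hold on systems with more than one GPU.

```shell
# Hypothetical helper: verify that a DRM render node can be passed to Docker.
# Returns 0 if the path is a character device that the current user can
# read and write, 1 if it is missing, 2 if permissions are insufficient.
check_render_node() {
  node="${1:-/dev/dri/renderD128}"   # assumed default, same path as above
  if [ ! -c "$node" ]; then
    echo "missing: $node is not a character device" >&2
    return 1
  fi
  if [ ! -r "$node" ] || [ ! -w "$node" ]; then
    echo "no access: $node is not readable and writable by $(id -un)" >&2
    return 2
  fi
  echo "ok: $node"
}
```

Calling check_render_node with no argument tests /dev/dri/renderD128. A return code of 2 usually means the user is not in the group that owns the device node.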

Software processing:

  • Uses libavcodec's CPU decoder for decoding and libswscale for resizing
  • fps=2: CPU-based frame rate reduction (decodes all frames, then drops)
  • scale=960:540: CPU-based spatial scaling
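
To make the "decodes all frames, then drops" point concrete, the arithmetic can be sketched as follows. The 25 FPS source rate is an assumption for illustration only, since the camera's frame rate is not stated here, and frame_budget is a hypothetical helper name. The counts are approximate because the fps filter may emit one frame more or less at the boundaries.

```shell
# Hypothetical helper: frames the decoder must process versus frames the
# fps filter passes on. Arguments are integers:
#   $1 source frame rate, $2 target frame rate, $3 duration in seconds
frame_budget() {
  src_fps=$1; out_fps=$2; seconds=$3
  decoded=$((src_fps * seconds))
  kept=$((out_fps * seconds))
  echo "decoded=$decoded kept=$kept dropped=$((decoded - kept))"
}

# Assumed 25 FPS camera with the 2 FPS target and 30 s window used above.
frame_budget 25 2 30
```

Under these assumptions the decoder still handles every incoming frame, and only a small fraction reaches the output, so frame dropping alone does not reduce the decode cost.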

Common parameters:

  • -rtsp_transport tcp: Reliable transport for IP camera streams
  • -timeout 5000000: Socket I/O timeout in microseconds (5 seconds)
  • -allowed_media_types video: Accept only video streams from the RTSP session
  • -flags low_delay: Force low-delay decoding
  • -fflags +discardcorrupt: Discard corrupted packets
  • -an: Exclude audio streams
  • -t 30: Placed before -i, limits reading to 30 seconds of the input stream so every run covers the same duration
  • -f null -: Discard output (pure processing benchmark)
  • -benchmark -stats: -benchmark reports user, system, and real time plus peak memory at the end of the run; -stats prints progress while it runs
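
One way to turn the -benchmark output into a single comparable number is to divide CPU time by wall-clock time. The sketch below assumes the summary line has the form "bench: utime=...s stime=...s rtime=...s", which is what recent FFmpeg builds print; cpu_percent is a hypothetical helper name, and the values in the sample line are made up for illustration, not measured results.

```shell
# Hypothetical helper: read FFmpeg output on stdin and print CPU utilization
# in percent, computed as (utime + stime) / rtime * 100.
cpu_percent() {
  awk '/bench: utime=/ {
    for (i = 1; i <= NF; i++) {
      split($i, kv, "=")
      # Adding 0 coerces "3.000s" to the number 3, dropping the unit suffix.
      if (kv[1] == "utime") u = kv[2] + 0
      if (kv[1] == "stime") s = kv[2] + 0
      if (kv[1] == "rtime") r = kv[2] + 0
    }
    if (r > 0) printf "%.1f\n", 100 * (u + s) / r
  }'
}

# Illustrative sample line with made-up values:
printf 'bench: utime=3.000s stime=1.500s rtime=30.000s\n' | cpu_percent

# With a real run, FFmpeg logs to stderr, so redirect it into the pipe:
#   docker run ... -benchmark -stats 2>&1 | cpu_percent
```

The result is relative to one CPU core, so a multi-threaded software decode can exceed 100. The figure covers only the CPU time of the ffmpeg process; work done on the GPU is not included.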