Running GStreamer 1.17 built through gst-build (gst-plugins-bad@b5a28df0f312f6e603d0c729f9f799a25a1f0a87):
- Low res (320x240), using host memory is faster than using GL memory.
- High res (3840x2160), using GL memory is faster than using host memory.
nvprof measurements of the memory copy operations don't explain that behavior.
# gst-discoverer-1.0 low_res.ts
Properties:
Duration: 0:09:59.997352000
Seekable: yes
Live: no
container: MPEG-2 Transport Stream
video: H.264 (Main Profile)
Stream ID: 0332177a9fc52fecf7b4f60fea697c811a71987e037b0ae394d0273ffbc8abca:1/00000041
Width: 320
Height: 240
Depth: 24
Frame rate: 30/1
Pixel aspect ratio: 1/1
Interlaced: false
Bitrate: 0
Max bitrate: 0
Execution time: 0:00:11.252269640
gst-launch-1.0 filesrc location=low_res.ts ! tsdemux ! h264parse ! nvh264dec ! "video/x-raw" ! nvh264enc ! fakesink
Type             Time(%)      Time  Calls       Avg       Min       Max  Name
GPU activities:   49.35%  368.89ms  36000  10.246us  5.9200us  19.872us  [CUDA memcpy HtoD]
                  40.12%  299.85ms  36000  8.3290us  5.3120us  24.256us  [CUDA memcpy DtoH]
                   6.34%  47.353ms  36000  1.3150us  1.2150us  9.3440us  Convert_PL2BL
                   4.19%  31.291ms  18000  1.7380us  1.6640us  2.2400us  ConvertNV24toNV12
                   0.01%  77.632us     68  1.1410us     704ns  2.6240us  [CUDA memset]
Execution time: 0:00:20.584277338
gst-launch-1.0 filesrc location=low_res.ts ! tsdemux ! h264parse ! nvh264dec ! "video/x-raw(memory:GLMemory)" ! nvh264enc ! fakesink
Type             Time(%)      Time  Calls       Avg       Min       Max  Name
GPU activities:   50.84%  79.086ms  72000  1.0980us     864ns  14.208us  [CUDA memcpy DtoD]
                  27.69%  43.070ms  36000  1.1960us     991ns  14.016us  Convert_PL2BL
                  21.41%  33.308ms  18000  1.8500us  1.5030us  2.8480us  ConvertNV24toNV12
                   0.05%  78.944us     68  1.1600us     672ns  2.6560us  [CUDA memset]
# gst-discoverer-1.0 hi_res.ts
Properties:
Duration: 0:09:59.998824481
Seekable: yes
Live: no
container: MPEG-2 Transport Stream
video: H.264 (Main Profile)
Stream ID: 431405b912470c7752b10402f2c5e9e93da618a677918195c637c8d0371e5414:1/00000041
Width: 3840
Height: 2160
Depth: 24
Frame rate: 30/1
Pixel aspect ratio: 1/1
Interlaced: false
Bitrate: 0
Max bitrate: 0
Execution time: 0:03:20.462018560
gst-launch-1.0 filesrc location=hi_res.ts ! tsdemux ! h264parse ! nvh264dec ! "video/x-raw" ! nvh264enc ! fakesink
Type             Time(%)      Time  Calls       Avg       Min       Max  Name
GPU activities:   54.40%  46.3980s  36000  1.2888ms  738.27us  2.7441ms  [CUDA memcpy HtoD]
                  42.74%  36.4568s  36000  1.0127ms  599.52us  3.0454ms  [CUDA memcpy DtoH]
                   1.47%  1.25313s  18000  69.618us  67.584us  72.192us  ConvertNV24toNV12
                   1.39%  1.18157s  36000  32.821us  23.328us  45.856us  Convert_PL2BL
                   0.00%  81.504us     66  1.2340us     704ns  2.6560us  [CUDA memset]
Execution time: 0:02:18.106101429
gst-launch-1.0 filesrc location=hi_res.ts ! tsdemux ! h264parse ! nvh264dec ! "video/x-raw(memory:GLMemory)" ! nvh264enc ! fakesink
Type             Time(%)      Time  Calls       Avg       Min       Max  Name
GPU activities:   50.48%  2.63958s  72000  36.660us  22.976us  58.976us  [CUDA memcpy DtoD]
                  25.11%  1.31285s  36000  36.468us  23.744us  49.024us  Convert_PL2BL
                  24.41%  1.27668s  18000  70.926us  67.585us  71.872us  ConvertNV24toNV12
                   0.00%  81.536us     66  1.2350us     704ns  2.9120us  [CUDA memset]
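
To make the mismatch concrete, here is a small sketch that sums the nvprof "GPU activities" times for each run and compares them with the reported execution times. The numbers are copied from the output above; the names `RUNS`, `gpu_total` and `deltas` are only illustrative and not part of any tool.

```python
# Figures copied from the nvprof "GPU activities" tables and the
# "Execution time" lines in this report. All values are in seconds.
RUNS = {
    ("low", "host"): {
        "wall": 11.252269640,
        "gpu": [368.89e-3, 299.85e-3, 47.353e-3, 31.291e-3, 77.632e-6],
    },
    ("low", "gl"): {
        "wall": 20.584277338,
        "gpu": [79.086e-3, 43.070e-3, 33.308e-3, 78.944e-6],
    },
    ("high", "host"): {
        "wall": 200.462018560,  # 0:03:20.462018560
        "gpu": [46.3980, 36.4568, 1.25313, 1.18157, 81.504e-6],
    },
    ("high", "gl"): {
        "wall": 138.106101429,  # 0:02:18.106101429
        "gpu": [2.63958, 1.31285, 1.27668, 81.536e-6],
    },
}


def gpu_total(res, mem):
    """Sum of all GPU activity times nvprof reported for one run."""
    return sum(RUNS[(res, mem)]["gpu"])


def deltas(res):
    """Return (wall-clock delta, GPU-time delta), GL memory minus host memory."""
    d_wall = RUNS[(res, "gl")]["wall"] - RUNS[(res, "host")]["wall"]
    d_gpu = gpu_total(res, "gl") - gpu_total(res, "host")
    return d_wall, d_gpu


for res in ("low", "high"):
    d_wall, d_gpu = deltas(res)
    print(f"{res} res: GL - host: wall {d_wall:+.2f} s, GPU activities {d_gpu:+.2f} s")
```

At low resolution the GL memory run spends roughly 0.6 s less in GPU activities yet takes roughly 9.3 s longer overall, so the extra time has to come from something these nvprof figures do not show. At high resolution both deltas point the same way (GL memory is faster and does less copying).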