Skip to content

Instantly share code, notes, and snippets.

@colemar
Last active January 22, 2023 23:37
Show Gist options
  • Save colemar/ec9649ee46ea4b0b1dc00b68ea02810e to your computer and use it in GitHub Desktop.
Save colemar/ec9649ee46ea4b0b1dc00b68ea02810e to your computer and use it in GitHub Desktop.
Extract key frames and scenecuts from video

Extract key frames and scenecuts from video

  Usage: videoframes [OPTION] FILE
  Extract key frames and scenecuts from video FILE.

  -k  key frames only
  -s  scenecuts only

  Creates jpg pictures containing a mosaic of thumbnails of the selected frames.
  Each thumbnail is overlayed with a box [T SCORE:TH N TIMESTAMP], where:
  T         frame type: K I P B (every K frame is also a I frame)
  SCORE     scene cut detection score: d.ddd
  TH        scene cut detection score threshold (currently: 7)
  N         frame sequence number
  TIMESTAMP frame timestamp: h:mm:ss.sss
  
  Also outputs a vertical list of frames with header:
  T   SCORE:TH     N HMS
  Where:
  T     frame type: K I P B, with S appended when the frame starts a scenecut
  SCORE scene cut detection score
  TH    scene cut detection score threshold
  N     frame sequence number
  HMS   frame timestamp: h:mm:ss.sss

Examples

  • list key frames and scenecuts: videoframes somevideo.mkv
  • list scenecuts: videoframes -s somevideo.mkv
  • dump scenecuts into cuts.txt: videoframes -s somevideo.mkv | tee cuts.txt

Screenshot

Screenshot

_sc001.jpg

Screenshot

_sc002.jpg

Screenshot

#!/bin/bash
# Scenecut detection threshold: range [0,100].
# When 2 consecutive frames attain a lavfi.scd.score greater than THRESHOLD, then the second frame is selected as scene start.
THRESHOLD=7
# Your display width expressed in pixel
DISPLAY_W=2560
# Number of columns for the contact-sheet thumbnails grid
COLUMNS=4
ROWS=$COLUMNS # recommended
#
FONT=/usr/share/fonts/truetype/freefont/FreeMono.ttf
# Assume a contact-sheet width equal to about 92% of the display width.
# Compute the width W of each frame thumbnail (before tiling into a mosaic).
W=$((DISPLAY_W*92/100/COLUMNS))
FILTKS="
select='if(key,1,2)':outputs=2 [k1][k0],
[k1] drawtext=text='K':font=monospace:fontsize=20:box=1:boxborderw=1:y=1 [a],
[k0] metadata=mode=select:key=lavfi.scd.score:value=${THRESHOLD}:function=greater,
drawtext=text='%{pict_type}':font=monospace:fontsize=20:box=1:boxborderw=1:y=1 [b],
[a][b] interleave,
"
FILTK="
select='key',
drawtext=text='K':font=monospace:fontsize=20:box=1:boxborderw=1:y=1,
"
FILTS="
metadata=mode=select:key=lavfi.scd.score:value=${THRESHOLD}:function=greater,
select='if(key,1,2)':outputs=2 [k1][k0],
[k1] drawtext=text='K': font=monospace:fontsize=20:box=1:boxborderw=1:y=1 [a],
[k0] drawtext=text='%{pict_type}':font=monospace:fontsize=20:box=1:boxborderw=1:y=1 [b],
[a][b] interleave,\
"
case $1 in
-k )
FILT=$FILTK; shift ;;
-s )
FILT=$FILTS; shift ;;
* )
FILT=$FILTKS ;;
esac
[[ -r $1 ]] || {
echo "
Usage: $(basename $0) [OPTION] FILE
Extract key frames and scenecuts from video FILE.
-k key frames only
-s scenecuts only
Creates jpg pictures containing a mosaic of thumbnails of the selected frames.
Each thumbnail is overlayed with a box [T SCORE:TH N HMS], where:
T frame type: K I P B (every K frame is also a I frame)
SCORE scene cut detection score: d.ddd
TH scene cut detection score threshold (currently: ${THRESHOLD})
N frame sequence number
HMS frame timestamp: h:mm:ss.sss
Also outputs a vertical list of frames with header:
T SCORE:TH N HMS
Where:
T frame type: K I P B, with S appended when the frame starts a scenecut
SCORE scene cut detection score
TH scene cut detection score threshold
N frame sequence number
HMS frame timestamp: h:mm:ss.sss
"
exit
}
cd $(dirname "$1")
inputfile=$(basename "$1")
ffmpeg -nostdin -nostats -i "$inputfile" -map 0:v:0 -filter:v "
showinfo=checksum=0,
scale=${W}:-1,
scdet=threshold=0:sc_pass=0,
drawtext=text='Z %{metadata\:lavfi.scd.score\:noscore}\:${THRESHOLD} %{n} %{pts\:hms}':
font=monospace:fontsize=20:box=1:boxborderw=1:y=1:boxcolor=yellow,
${FILT}
metadata=mode=print,
showinfo=checksum=0,
tile=${COLUMNS}x${ROWS}" -vsync passthrough -qscale:v 2 _sc%03d.jpg 2>&1 |
#pv -l > videoframes.log; exit
gawk -F' +[a-z_]+:' -v th=${THRESHOLD} '
BEGIN { print "T SCORE:TH N HMS" }
fr==0 && /Stream #0:.+Video:/ { patsplit($0,a,"[^ ]+ fps"); fr = a[1]+0 } # Get frame rate
/^\[Parsed_showinfo_0 [^\]]+\] config in time_base:/ { # Get precise frame rate
patsplit($0,a,"[0-9]+/[0-9]+"); split(a[2],a,"/"); fr=a[1]/a[2] }
/^\[Parsed_showinfo_[^0][^\]]+\] config in time_base:/ { # Get precise time base after interleave
patsplit($0,a,"[0-9]+/[0-9]+"); split(a[1],a,"/"); tb=a[1]/a[2] }
/^\[Parsed_metadata_[^\]]+\] lavfi\.scd\.score/ { split($0,a,"="); score = a[2] }
/^\[Parsed_showinfo_[^0][^\]]+\] n:/ {
t=$3*tb # $3 is PTS in time base units
h=int(t/3600); m=int(t%3600/60); s=t%60 # t is TIME in sec.millisec
type = $10+0 ? "K" : substr($11,1,1)
scene = score > th ? "S" : " "
printf "%s%s %6.3f:%s %6.f %02d:%02d:%06.3f\n",type,scene,score,th,t*fr,h,m,s }'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment