@simonw

simonw/bbox-2.md Secret

Created November 12, 2024 16:35
Parsing Command Line Arguments for Bounding Boxes

human (Nov 12, 2024, 08:24 AM)

#!/bin/zsh

# Initialize array for storing bounding boxes
typeset -a bounding_boxes

# Parse command line arguments
while [[ $# -gt 0 ]]; do
    case $1 in
        --box)
            if [[ $2 =~ ^[0-9]+,[0-9]+,[0-9]+,[0-9]+$ ]]; then
                # Validate coordinates are within 0-100 range
                local coords=(${(s:,:)2})
                local valid=true
                for coord in $coords; do
                    if (( coord < 0 || coord > 100 )); then
                        valid=false
                        break
                    fi
                done
                if $valid; then
                    bounding_boxes+=($2)
                else
                    echo "Error: Bounding box coordinates must be between 0 and 100"
                    exit 1
                fi
            else
                echo "Error: Invalid bounding box format. Use: --box 'x1,y1,x2,y2' where coordinates are percentages (0-100)"
                exit 1
            fi
            shift 2
            ;;
        *)
            video_filename=$1
            shift
            ;;
    esac
done

# Check if the video filename is provided
if [[ -z $video_filename ]]; then
    echo "Please provide the video filename as an argument."
    exit 1
fi

# Function to convert time in seconds to HH:MM:SS format
function seconds_to_hms {
    local total_seconds=$1
    local hours=$((total_seconds / 3600))
    local minutes=$(((total_seconds % 3600) / 60))
    local seconds=$((total_seconds % 60))
    printf "%02d:%02d:%02d\n" $hours $minutes $seconds
}

# Function to capture frame with optional bounding box
function capture_frame {
    local current_time=$1
    local bbox=$2
    local formatted_time=$(seconds_to_hms $current_time)
    
    # Base filename without bbox suffix
    local base_filename="/tmp/frame_${formatted_time//:/}"
    
    if [[ -z $bbox ]]; then
        # Capture full frame
        local filename="${base_filename}.jpg"
        ffmpeg -ss $formatted_time -i "$video_filename" -frames:v 1 "$filename" -y
        echo "Saved full frame to $filename"
    else
        # Capture cropped frame using the bounding box
        local coords=(${(s:,:)bbox})
        local x1=$coords[1]
        local y1=$coords[2]
        local x2=$coords[3]
        local y2=$coords[4]
        
        # Calculate width and height as percentages
        local width=$((x2 - x1))
        local height=$((y2 - y1))
        
        local filename="${base_filename}_bbox${x1}_${y1}_${x2}_${y2}.jpg"
        ffmpeg -ss $formatted_time -i "$video_filename" -frames:v 1 \
            -filter:v "crop=iw*${width}/100:ih*${height}/100:iw*${x1}/100:ih*${y1}/100" \
            "$filename" -y
        echo "Saved cropped frame to $filename"
    fi
}

# Main loop
while true; do
    echo "Press any key to capture the current video frame..."
    read -sk 1 key

    # Use osascript to get the current time from QuickTime Player
    current_time=$(osascript -e 'tell application "QuickTime Player" to tell document 1 to get current time')

    # Capture the full frame first
    capture_frame $current_time
    
    # Then capture each bounding box if any were specified
    for bbox in $bounding_boxes; do
        capture_frame $current_time $bbox
    done
done

When I run this I get this warning:

[image2 @ 0x142604fd0] The specified filename '/tmp/frame_003342_bbox31_17_100_87.jpg' does not contain an image sequence pattern or a pattern is invalid.
[image2 @ 0x142604fd0] Use a pattern such as %03d for an image sequence or use the -update option (with -frames:v 1 if needed) to write a single image.

It does work though

Also I'd like it to write files to the current directory and not to /tmp

assistant (Nov 12, 2024, 08:24 AM)

I can help fix the warning and change the output directory. The warning comes from ffmpeg's image2 muxer, which expects an image sequence pattern in the output filename unless it is told it is writing a single image. We can silence it by adding the -update flag, as the message suggests. The changes needed:

  1. Remove the /tmp/ prefix from the output filename
  2. Add the -update flag to the ffmpeg commands

Both changes go in the capture_frame function:
# Function to capture frame with optional bounding box
function capture_frame {
    local current_time=$1
    local bbox=$2
    local formatted_time=$(seconds_to_hms $current_time)
    
    # Base filename without bbox suffix (removed /tmp/)
    local base_filename="frame_${formatted_time//:/}"
    
    if [[ -z $bbox ]]; then
        # Capture full frame
        local filename="${base_filename}.jpg"
        ffmpeg -ss $formatted_time -i "$video_filename" -frames:v 1 -update "$filename" -y
        echo "Saved full frame to $filename"
    else
        # Capture cropped frame using the bounding box
        local coords=(${(s:,:)bbox})
        local x1=$coords[1]
        local y1=$coords[2]
        local x2=$coords[3]
        local y2=$coords[4]
        
        # Calculate width and height as percentages
        local width=$((x2 - x1))
        local height=$((y2 - y1))
        
        local filename="${base_filename}_bbox${x1}_${y1}_${x2}_${y2}.jpg"
        ffmpeg -ss $formatted_time -i "$video_filename" -frames:v 1 -update \
            -filter:v "crop=iw*${width}/100:ih*${height}/100:iw*${x1}/100:ih*${y1}/100" \
            "$filename" -y
        echo "Saved cropped frame to $filename"
    fi
}

The main changes are:

  1. Removed /tmp/ from base_filename
  2. Added -update flag to both ffmpeg commands
  3. Kept all other functionality the same

This should:

  • Eliminate the warning message
  • Save files to the current directory instead of /tmp
  • Maintain all the existing functionality

human (Nov 12, 2024, 08:24 AM)

output the whole script

assistant (Nov 12, 2024, 08:24 AM)

Here's the complete updated script:

#!/bin/zsh

# Initialize array for storing bounding boxes
typeset -a bounding_boxes

# Parse command line arguments
while [[ $# -gt 0 ]]; do
    case $1 in
        --box)
            if [[ $2 =~ ^[0-9]+,[0-9]+,[0-9]+,[0-9]+$ ]]; then
                # Validate coordinates are within 0-100 range
                local coords=(${(s:,:)2})
                local valid=true
                for coord in $coords; do
                    if (( coord < 0 || coord > 100 )); then
                        valid=false
                        break
                    fi
                done
                if $valid; then
                    bounding_boxes+=($2)
                else
                    echo "Error: Bounding box coordinates must be between 0 and 100"
                    exit 1
                fi
            else
                echo "Error: Invalid bounding box format. Use: --box 'x1,y1,x2,y2' where coordinates are percentages (0-100)"
                exit 1
            fi
            shift 2
            ;;
        *)
            video_filename=$1
            shift
            ;;
    esac
done

# Check if the video filename is provided
if [[ -z $video_filename ]]; then
    echo "Please provide the video filename as an argument."
    exit 1
fi

# Function to convert time in seconds to HH:MM:SS format
function seconds_to_hms {
    local total_seconds=$1
    local hours=$((total_seconds / 3600))
    local minutes=$(((total_seconds % 3600) / 60))
    local seconds=$((total_seconds % 60))
    printf "%02d:%02d:%02d\n" $hours $minutes $seconds
}

# Function to capture frame with optional bounding box
function capture_frame {
    local current_time=$1
    local bbox=$2
    local formatted_time=$(seconds_to_hms $current_time)
    
    # Base filename without bbox suffix
    local base_filename="frame_${formatted_time//:/}"
    
    if [[ -z $bbox ]]; then
        # Capture full frame
        local filename="${base_filename}.jpg"
        ffmpeg -ss $formatted_time -i "$video_filename" -frames:v 1 -update "$filename" -y
        echo "Saved full frame to $filename"
    else
        # Capture cropped frame using the bounding box
        local coords=(${(s:,:)bbox})
        local x1=$coords[1]
        local y1=$coords[2]
        local x2=$coords[3]
        local y2=$coords[4]
        
        # Calculate width and height as percentages
        local width=$((x2 - x1))
        local height=$((y2 - y1))
        
        local filename="${base_filename}_bbox${x1}_${y1}_${x2}_${y2}.jpg"
        ffmpeg -ss $formatted_time -i "$video_filename" -frames:v 1 -update \
            -filter:v "crop=iw*${width}/100:ih*${height}/100:iw*${x1}/100:ih*${y1}/100" \
            "$filename" -y
        echo "Saved cropped frame to $filename"
    fi
}

# Main loop
while true; do
    echo "Press any key to capture the current video frame..."
    read -sk 1 key

    # Use osascript to get the current time from QuickTime Player
    current_time=$(osascript -e 'tell application "QuickTime Player" to tell document 1 to get current time')

    # Capture the full frame first
    capture_frame $current_time
    
    # Then capture each bounding box if any were specified
    for bbox in $bounding_boxes; do
        capture_frame $current_time $bbox
    done
done

The script now:

  1. Saves files to the current directory instead of /tmp
  2. Includes the -update flag to suppress the ffmpeg warning
  3. Maintains all the original functionality including:
    • Bounding box validation
    • Multiple bounding box support
    • Time formatting
    • QuickTime Player integration
    • Interactive frame capture

You can run it the same way as before, e.g.:

./script.sh video.mp4 --box "31,17,100,87"
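
To make the percentage arithmetic concrete, here is a minimal sketch of what one capture with that box would produce. The helper name preview_capture is hypothetical and not part of the script, the frame size is passed in by hand (the script itself leaves that to ffmpeg via iw and ih), and it assumes the playback time is a whole number of seconds. ffmpeg evaluates the crop expressions itself and may round the results further, so treat the pixel values as approximate.

```shell
# Hypothetical helper (not part of the script above): preview the filename and
# crop region for one capture. Arguments: seconds, "x1,y1,x2,y2", width, height.
preview_capture() {
    local secs="$1" box="$2" iw="$3" ih="$4"
    local x1 y1 x2 y2 stamp w h x y
    # Split "x1,y1,x2,y2" on commas (works in sh, bash and zsh)
    IFS=, read -r x1 y1 x2 y2 <<EOF
$box
EOF
    # Same arithmetic as seconds_to_hms, with the colons already removed
    stamp=$(printf '%02d%02d%02d' $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60)))
    # Width and height are the differences between the percentage corners
    w=$(( iw * (x2 - x1) / 100 ))
    h=$(( ih * (y2 - y1) / 100 ))
    # The top-left corner is the first pair of percentages
    x=$(( iw * x1 / 100 ))
    y=$(( ih * y1 / 100 ))
    printf '%s\n' "frame_${stamp}_bbox${x1}_${y1}_${x2}_${y2}.jpg ${w}x${h}+${x}+${y}"
}

# 2022 seconds into a 1920x1080 video
preview_capture 2022 "31,17,100,87" 1920 1080
```

With integer division, the box covers roughly 1324x756 pixels starting at (595, 183), and the file is named frame_003342_bbox31_17_100_87.jpg.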

human (Nov 12, 2024, 08:27 AM)


paste.txt

(base) simon@Mac frames % capture-bbox.sh ../output.mp4  --box '31,17,100,87'
Press any key to capture the current video frame...
ffmpeg version 7.1 Copyright (c) 2000-2024 the FFmpeg developers
  built with Apple clang version 16.0.0 (clang-1600.0.26.4)
  configuration: --prefix=/opt/homebrew/Cellar/ffmpeg/7.1_1 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags='-Wl,-ld_classic' --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libaribb24 --enable-libbluray --enable-libdav1d --enable-libharfbuzz --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox --enable-audiotoolbox --enable-neon
  libavutil      59. 39.100 / 59. 39.100
  libavcodec     61. 19.100 / 61. 19.100
  libavformat    61.  7.100 / 61.  7.100
  libavdevice    61.  3.100 / 61.  3.100
  libavfilter    10.  4.100 / 10.  4.100
  libswscale      8.  3.100 /  8.  3.100
  libswresample   5.  3.100 /  5.  3.100
  libpostproc    58.  3.100 / 58.  3.100
Trailing option(s) found in the command: may be ignored.
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '../output.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf61.7.100
  Duration: 00:44:18.57, start: 0.000000, bitrate: 778 kb/s
  Stream #0:0[0x1](eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 642 kb/s, 29.97 fps, 29.97 tbr, 30k tbn (default)
      Metadata:
        handler_name    : VideoHandler
        vendor_id       : [0][0][0][0]
        encoder         : Lavc61.19.100 libx264
  Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 127 kb/s (default)
      Metadata:
        handler_name    : SoundHandler
        vendor_id       : [0][0][0][0]
At least one output file must be specified
Saved full frame to frame_003342.jpg
ffmpeg version 7.1 Copyright (c) 2000-2024 the FFmpeg developers
  built with Apple clang version 16.0.0 (clang-1600.0.26.4)
  configuration: --prefix=/opt/homebrew/Cellar/ffmpeg/7.1_1 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags='-Wl,-ld_classic' --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libaribb24 --enable-libbluray --enable-libdav1d --enable-libharfbuzz --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox --enable-audiotoolbox --enable-neon
  libavutil      59. 39.100 / 59. 39.100
  libavcodec     61. 19.100 / 61. 19.100
  libavformat    61.  7.100 / 61.  7.100
  libavdevice    61.  3.100 / 61.  3.100
  libavfilter    10.  4.100 / 10.  4.100
  libswscale      8.  3.100 /  8.  3.100
  libswresample   5.  3.100 /  5.  3.100
  libpostproc    58.  3.100 / 58.  3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '../output.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf61.7.100
  Duration: 00:44:18.57, start: 0.000000, bitrate: 778 kb/s
  Stream #0:0[0x1](eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 642 kb/s, 29.97 fps, 29.97 tbr, 30k tbn (default)
      Metadata:
        handler_name    : VideoHandler
        vendor_id       : [0][0][0][0]
        encoder         : Lavc61.19.100 libx264
  Stream #0:1[0x2](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 127 kb/s (default)
      Metadata:
        handler_name    : SoundHandler
        vendor_id       : [0][0][0][0]
[AVFormatContext @ 0x158605780] Unable to choose an output format for 'crop=iw*69/100:ih*70/100:iw*31/100:ih*17/100'; use a standard extension for the filename or specify the format manually.
[out#0 @ 0x60000254c3c0] Error initializing the muxer for crop=iw*69/100:ih*70/100:iw*31/100:ih*17/100: Invalid argument
Error opening output file crop=iw*69/100:ih*70/100:iw*31/100:ih*17/100.
Error opening output files: Invalid argument
Saved cropped frame to frame_003342_bbox31_17_100_87.jpg


assistant (Nov 12, 2024, 08:27 AM)

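Looking at the error output, both failures trace back to the -update flag I added:

  1. -update expects a value (for example -update 1). Written bare, it consumed the next argument: the output filename in the first command, which is why ffmpeg reported "At least one output file must be specified", and -filter:v in the second, which left the crop expression to be treated as the output file
  2. The "Trailing option(s)" warning follows from the same problem: with the filename consumed, the output options had no output file to attach to

Here's the corrected version of the script with the bare flag removed and proper ffmpeg command ordering: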

#!/bin/zsh

# Initialize array for storing bounding boxes
typeset -a bounding_boxes

# Parse command line arguments
while [[ $# -gt 0 ]]; do
    case $1 in
        --box)
            if [[ $2 =~ ^[0-9]+,[0-9]+,[0-9]+,[0-9]+$ ]]; then
                # Validate coordinates are within 0-100 range
                local coords=(${(s:,:)2})
                local valid=true
                for coord in $coords; do
                    if (( coord < 0 || coord > 100 )); then
                        valid=false
                        break
                    fi
                done
                if $valid; then
                    bounding_boxes+=($2)
                else
                    echo "Error: Bounding box coordinates must be between 0 and 100"
                    exit 1
                fi
            else
                echo "Error: Invalid bounding box format. Use: --box 'x1,y1,x2,y2' where coordinates are percentages (0-100)"
                exit 1
            fi
            shift 2
            ;;
        *)
            video_filename=$1
            shift
            ;;
    esac
done

# Check if the video filename is provided
if [[ -z $video_filename ]]; then
    echo "Please provide the video filename as an argument."
    exit 1
fi

# Function to convert time in seconds to HH:MM:SS format
function seconds_to_hms {
    local total_seconds=$1
    local hours=$((total_seconds / 3600))
    local minutes=$(((total_seconds % 3600) / 60))
    local seconds=$((total_seconds % 60))
    printf "%02d:%02d:%02d\n" $hours $minutes $seconds
}

# Function to capture frame with optional bounding box
function capture_frame {
    local current_time=$1
    local bbox=$2
    local formatted_time=$(seconds_to_hms $current_time)
    
    # Base filename without bbox suffix
    local base_filename="frame_${formatted_time//:/}"
    
    if [[ -z $bbox ]]; then
        # Capture full frame
        local filename="${base_filename}.jpg"
        ffmpeg -y -ss $formatted_time -i "$video_filename" -frames:v 1 "$filename"
        echo "Saved full frame to $filename"
    else
        # Capture cropped frame using the bounding box
        local coords=(${(s:,:)bbox})
        local x1=$coords[1]
        local y1=$coords[2]
        local x2=$coords[3]
        local y2=$coords[4]
        
        # Calculate width and height as percentages
        local width=$((x2 - x1))
        local height=$((y2 - y1))
        
        local filename="${base_filename}_bbox${x1}_${y1}_${x2}_${y2}.jpg"
        ffmpeg -y -ss $formatted_time -i "$video_filename" \
            -vf "crop=iw*${width}/100:ih*${height}/100:iw*${x1}/100:ih*${y1}/100" \
            -frames:v 1 "$filename"
        echo "Saved cropped frame to $filename"
    fi
}

# Main loop
while true; do
    echo "Press any key to capture the current video frame..."
    read -sk 1 key

    # Use osascript to get the current time from QuickTime Player
    current_time=$(osascript -e 'tell application "QuickTime Player" to tell document 1 to get current time')

    # Capture the full frame first
    capture_frame $current_time
    
    # Then capture each bounding box if any were specified
    for bbox in $bounding_boxes; do
        capture_frame $current_time $bbox
    done
done

Key changes made:

  1. Moved -y flag to the beginning of ffmpeg commands
  2. Changed -filter:v to -vf for clearer syntax
  3. Removed the bare -update flag: it expects a value (for example -update 1), so it was consuming the argument that followed it
  4. Ensured output filename comes after all filters and options
  5. Used proper ordering of ffmpeg arguments: input options before input, output options before output

This should resolve both the warning and the errors in the logs while maintaining the same functionality. The script will still save files to the current directory and handle both full frames and cropped bounding boxes.
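
The ordering rule in the last item can be sketched as a small helper that assembles the arguments in the order ffmpeg reads them: global options, input options, the input, output options, then the output file. The helper name ffmpeg_args is hypothetical, and it only prints the command rather than running it.

```shell
# Hypothetical helper: print an ffmpeg command line with arguments in the
# order ffmpeg reads them. Nothing is executed.
# Arguments: timestamp, input file, output file, optional video filter.
ffmpeg_args() {
    local at="$1" input="$2" output="$3" filter="$4"
    # Global options first, then input options, then the input itself
    set -- -y -ss "$at" -i "$input"
    # Output options apply to the next output file, so they go right before it
    if [ -n "$filter" ]; then
        set -- "$@" -vf "$filter"
    fi
    set -- "$@" -frames:v 1 "$output"
    printf '%s\n' "ffmpeg $*"
}

ffmpeg_args 00:33:42 ../output.mp4 frame_003342.jpg
ffmpeg_args 00:33:42 ../output.mp4 crop.jpg "crop=iw*69/100:ih*70/100:iw*31/100:ih*17/100"
```

Every option that takes a value is immediately followed by that value, and the output filename is always last, so no option can consume it by accident.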

human (Nov 12, 2024, 08:28 AM)

That worked but there was still a warning

[image2 @ 0x154f04a90] The specified filename 'frame_003342_bbox31_17_100_87.jpg' does not contain an image sequence pattern or a pattern is invalid.
[image2 @ 0x154f04a90] Use a pattern such as %03d for an image sequence or use the -update option (with -frames:v 1 if needed) to write a single image.

assistant (Nov 12, 2024, 08:28 AM)

Let's modify the ffmpeg commands to properly handle single image output and remove that warning. We need to adjust the output format flags:

#!/bin/zsh

# Initialize array for storing bounding boxes
typeset -a bounding_boxes

# Parse command line arguments
while [[ $# -gt 0 ]]; do
    case $1 in
        --box)
            if [[ $2 =~ ^[0-9]+,[0-9]+,[0-9]+,[0-9]+$ ]]; then
                # Validate coordinates are within 0-100 range
                local coords=(${(s:,:)2})
                local valid=true
                for coord in $coords; do
                    if (( coord < 0 || coord > 100 )); then
                        valid=false
                        break
                    fi
                done
                if $valid; then
                    bounding_boxes+=($2)
                else
                    echo "Error: Bounding box coordinates must be between 0 and 100"
                    exit 1
                fi
            else
                echo "Error: Invalid bounding box format. Use: --box 'x1,y1,x2,y2' where coordinates are percentages (0-100)"
                exit 1
            fi
            shift 2
            ;;
        *)
            video_filename=$1
            shift
            ;;
    esac
done

# Check if the video filename is provided
if [[ -z $video_filename ]]; then
    echo "Please provide the video filename as an argument."
    exit 1
fi

# Function to convert time in seconds to HH:MM:SS format
function seconds_to_hms {
    local total_seconds=$1
    local hours=$((total_seconds / 3600))
    local minutes=$(((total_seconds % 3600) / 60))
    local seconds=$((total_seconds % 60))
    printf "%02d:%02d:%02d\n" $hours $minutes $seconds
}

# Function to capture frame with optional bounding box
function capture_frame {
    local current_time=$1
    local bbox=$2
    local formatted_time=$(seconds_to_hms $current_time)
    
    # Base filename without bbox suffix
    local base_filename="frame_${formatted_time//:/}"
    
    if [[ -z $bbox ]]; then
        # Capture full frame
        local filename="${base_filename}.jpg"
        ffmpeg -y -ss $formatted_time -i "$video_filename" -frames:v 1 -update 1 "$filename"
        echo "Saved full frame to $filename"
    else
        # Capture cropped frame using the bounding box
        local coords=(${(s:,:)bbox})
        local x1=$coords[1]
        local y1=$coords[2]
        local x2=$coords[3]
        local y2=$coords[4]
        
        # Calculate width and height as percentages
        local width=$((x2 - x1))
        local height=$((y2 - y1))
        
        local filename="${base_filename}_bbox${x1}_${y1}_${x2}_${y2}.jpg"
        ffmpeg -y -ss $formatted_time -i "$video_filename" \
            -vf "crop=iw*${width}/100:ih*${height}/100:iw*${x1}/100:ih*${y1}/100" \
            -frames:v 1 -update 1 "$filename"
        echo "Saved cropped frame to $filename"
    fi
}

# Main loop
while true; do
    echo "Press any key to capture the current video frame..."
    read -sk 1 key

    # Use osascript to get the current time from QuickTime Player
    current_time=$(osascript -e 'tell application "QuickTime Player" to tell document 1 to get current time')

    # Capture the full frame first
    capture_frame $current_time
    
    # Then capture each bounding box if any were specified
    for bbox in $bounding_boxes; do
        capture_frame $current_time $bbox
    done
done

The key change is adding -update 1 to both ffmpeg commands. The -update option takes a value, which is why the bare -update in the earlier version consumed the argument that followed it. With -update 1, the image2 muxer treats the output as a single image to be overwritten rather than as an image sequence, which should eliminate the warning while maintaining all functionality.
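
Separately from that warning, every run in the log above prints ffmpeg's full banner and stream details. A minimal sketch, assuming that noise is unwanted: -hide_banner drops the version and configuration block, and -loglevel error limits output to errors. This hides warnings rather than fixing their cause, so it is a complement to the change above and not a replacement for it. The wrapper name quiet_ffmpeg and the FFMPEG_BIN variable are hypothetical.

```shell
# Hypothetical wrapper: run ffmpeg with the banner hidden and only errors shown.
# FFMPEG_BIN is an assumption added here so the wrapper can be pointed at a
# different binary; it defaults to the ffmpeg found on PATH.
quiet_ffmpeg() {
    "${FFMPEG_BIN:-ffmpeg}" -hide_banner -loglevel error "$@"
}

# Example call (not run here):
#   quiet_ffmpeg -y -ss 00:33:42 -i ../output.mp4 -frames:v 1 frame_003342.jpg
```

Inside capture_frame, the two ffmpeg calls could be swapped for quiet_ffmpeg without changing any other argument.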
