@psorensen
Last active August 15, 2019 03:32

Generate Search & Replace Script For All Applicable http-to-https Image Assets.

  1. Run wp --allow-root db query "select post_content from wp_posts where post_content like '%<img src%';" --skip-column-names --silent | grep -oP "<img\s+.*?src=['\"](.*?)['\"].*?>" | grep -oP "https?://[^\"]*" | sort | uniq
  2. Using parse_url(), create a PHP function* that finds the unique hosts and adds the first occurrence of each to an array of links.
  3. Iterate through the new list of links and run each one through a cURL function** to determine its HTTP status.
  4. Add links with a 200-300 status to an array.
  5. Produce the Search & Replace script based on the returned values.
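
Steps 4 and 5 are not spelled out in the function reference, so here is a minimal sketch of how they might look. The helper name build_search_replace_commands(), the inclusive 200 to 300 bounds, and the choice to emit `wp search-replace` commands with `--skip-columns=guid` are all assumptions, not part of the original gist; the input is assumed to be a map of link => status code.

```php
<?php
/**
 * Hypothetical helper (not from the original gist): turn checked links into
 * WP-CLI search-replace commands, one per host that passed the status check.
 *
 * @param array $statuses Map of link => HTTP status code.
 * @param int   $min      Lowest status treated as usable (assumed: 200).
 * @param int   $max      Highest status treated as usable (assumed: 300).
 * @return string[] Command lines, ready to be written to a script file.
 */
function build_search_replace_commands( array $statuses, $min = 200, $max = 300 ) {
    $commands = [];

    foreach ( $statuses as $link => $status ) {
        $host = parse_url( $link, PHP_URL_HOST );

        // Skip malformed links and hosts whose status is outside the range.
        if ( empty( $host ) || $status < $min || $status > $max ) {
            continue;
        }

        // Keyed by host so each host produces a single command.
        $commands[ $host ] = sprintf(
            'wp search-replace %s %s --skip-columns=guid',
            escapeshellarg( 'http://' . $host ),
            escapeshellarg( 'https://' . $host )
        );
    }

    return array_values( $commands );
}

// Example input: one host that passed the check, one that did not.
$commands = build_search_replace_commands( [
    'http://example.com/a.jpg'     => 200,
    'http://cdn.example.org/b.png' => 404,
] );

echo implode( PHP_EOL, $commands ), PHP_EOL;
```

Replacing only the scheme and host, rather than each full image URL, keeps the script short; adding `--dry-run` to each command is a reasonable way to preview the changes first.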

Function Reference

* Gather assets by unique host name

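function get_assets_by_unique_host( $links = [] ) {
    $seen_hosts   = []; // Hosts already represented in the result.
    $unique_links = []; // First link found for each host.

    foreach ( $links as $link ) {
        $host = parse_url( $link, PHP_URL_HOST );

        // Skip malformed links that have no host component.
        if ( empty( $host ) ) {
            continue;
        }

        if ( ! in_array( $host, $seen_hosts, true ) ) {
            $seen_hosts[]   = $host;
            $unique_links[] = $link;
        }
    }

    return $unique_links;
}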
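
The function above expects an array of links, while step 1 prints one URL per line. A small sketch of the glue between the two, assuming the output of step 1 has been captured as a string; parse_link_list() is a hypothetical name, not part of the original gist.

```php
<?php
/**
 * Hypothetical helper: split newline-separated URLs (the output of step 1)
 * into a de-duplicated array, dropping blank and malformed lines.
 *
 * @param string $raw Raw text, one URL per line.
 * @return string[]
 */
function parse_link_list( $raw ) {
    $links = [];

    // \R matches any newline sequence (\n, \r\n, \r).
    foreach ( preg_split( '/\R/', $raw ) as $line ) {
        $line = trim( $line );

        // Keep only lines that PHP's URL filter accepts.
        if ( '' !== $line && false !== filter_var( $line, FILTER_VALIDATE_URL ) ) {
            $links[] = $line;
        }
    }

    return array_values( array_unique( $links ) );
}

// Example input with a duplicate, a blank line, and a non-URL line.
$raw   = "http://example.com/a.jpg\nhttp://example.com/a.jpg\n\nnot a url\nhttps://cdn.example.org/b.png\n";
$links = parse_link_list( $raw );
```

If the pipeline output were redirected to a file (say links.txt, a made-up name), the same helper could be fed with file_get_contents( 'links.txt' ).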

** Get Status via CURL

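function get_status( $url ) {
    $c = curl_init();
    curl_setopt( $c, CURLOPT_CONNECTTIMEOUT, 5 ); // connection timeout in seconds
    curl_setopt( $c, CURLOPT_TIMEOUT, 30 ); // total timeout in seconds
    curl_setopt( $c, CURLOPT_HEADER, true );
    curl_setopt( $c, CURLOPT_NOBODY, true ); // HEAD request: headers only
    curl_setopt( $c, CURLOPT_RETURNTRANSFER, true ); // keep curl_exec() from printing the headers
    curl_setopt( $c, CURLOPT_SSL_VERIFYPEER, false ); // note: certificate verification is disabled
    curl_setopt( $c, CURLOPT_SSL_VERIFYHOST, 2 ); // 2 is the supported "verify host name" value
    curl_setopt( $c, CURLOPT_URL, $url );
    curl_exec( $c );
    $status = curl_getinfo( $c, CURLINFO_HTTP_CODE ); // 0 if the request failed
    curl_close( $c );
    return $status;
}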
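
Steps 3 and 4 could be wired together roughly as follows. This is a sketch under two assumptions that the gist does not state explicitly: that the https:// version of each link is the one being requested, and that "200-300" means the inclusive range 200 to 300. The names to_https() and filter_https_ready() are hypothetical, and the checker is passed in as a callable so the example runs without network access.

```php
<?php
/**
 * Hypothetical helper: rewrite a leading http:// scheme to https://.
 * Links that are already https:// are returned unchanged.
 */
function to_https( $url ) {
    return preg_replace( '#^http://#i', 'https://', $url );
}

/**
 * Hypothetical helper: keep the links whose https:// version answers with
 * a status between $min and $max (inclusive).
 *
 * @param string[] $links Original links, one per host.
 * @param callable $check Takes a URL and returns an HTTP status code.
 * @param int      $min   Assumed lower bound (200).
 * @param int      $max   Assumed upper bound (300).
 * @return string[] Links that passed the check.
 */
function filter_https_ready( array $links, callable $check, $min = 200, $max = 300 ) {
    $ready = [];

    foreach ( $links as $link ) {
        $status = (int) $check( to_https( $link ) );

        if ( $status >= $min && $status <= $max ) {
            $ready[] = $link;
        }
    }

    return $ready;
}

// Stub checker for demonstration only: pretends one host supports HTTPS.
$fake_check = function ( $url ) {
    return 'https://example.com/a.jpg' === $url ? 200 : 0;
};

$ready = filter_https_ready(
    [ 'http://example.com/a.jpg', 'http://insecure.example.net/b.png' ],
    $fake_check
);
```

In real use the stub would be replaced by the cURL function, for example filter_https_ready( $links, 'get_status' ). Raising $max to 399 would also accept redirects such as 301, if those should count as usable.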
@jasondewitt
because I'm a crazy person

wp --allow-root db query "select post_content from wp_posts where post_content like '%<img src%';" --skip-column-names --silent | grep -oP "<img\s+.*?src=['\"](.*?)['\"].*?>" | grep -oP "https?://[^\"]*" | sort | uniq
