Skip to content

Instantly share code, notes, and snippets.

@requaos
Created March 12, 2019 21:36
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save requaos/a36a335808c678c6301dcc4cb238c6a2 to your computer and use it in GitHub Desktop.
Save requaos/a36a335808c678c6301dcc4cb238c6a2 to your computer and use it in GitHub Desktop.
Fix relative image paths from crappy rss feeds, after the fact.
DO $$
declare
urlprefix text;
thing RECORD;
BEGIN
FOR thing in (SELECT * from stories where image like '/%') LOOP
urlprefix = substring(thing.url, '^.+?\/\/.+?\/');
UPDATE stories SET image = LEFT(urlprefix, LENGTH(urlprefix)-1) || thing.image WHERE id = thing.id;
END LOOP;
END; $$
/*
In this scenario, we found that some image refs were relative paths and not complete urls.
So we query for all rows where the image column begins with "/"
We apply a regular expression to the url, effectively extracting the schema and host
this includes a trailing slash which we trim off using LEFT()
we then update the record with this string concatenated with the relative url
We updated the codebase to check for this and perform this magic before insert first.
Then we ran this script to update the existing records
*/
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment