@brianseeders
Forked from RichardBronosky/URL_transforms.sh
Last active August 29, 2015 13:56
http://m. ########## normal URLs ########## ?embed=copyright
http://m.wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m. ########## URLs with www in them ########## ?embed=copyright
http://m.wsbtv.com/news/news/www.minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/www.minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/www.minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/www.minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/www.minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/www.minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/www.minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m. ########## URLs with http:// in them ########## ?embed=copyright
http://m.wsbtv.com/news/news/http://www.minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/http://www.minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/http://www.minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/http://www.minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/http://www.minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/http://www.minute-minute-updates-winter-storm/nc6BK/?embed=copyright
http://m.wsbtv.com/news/news/http://www.minute-minute-updates-winter-storm/nc6BK/?embed=copyright
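Within each group in the listing above, all seven lines are identical: the seven spellings of a URL (bare, `m.`, `www.`, and each with `http://` or `https://`) normalize to one mobile URL. A minimal sketch of a way to check that property, reusing the regex from the script below; the helper name `count_distinct` is hypothetical and not part of the gist:

```shell
# Same substitution as in the script below, repeated so this sketch runs on its own.
regex='s/([^:]{4,5}:\/\/)?((www|m)\.)?(.*)/http:\/\/m.\4?embed=copyright/'

# count_distinct (hypothetical helper): reads URLs on stdin, applies the
# transform, and prints how many distinct results come out.
count_distinct() {
  sed -E -e "$regex" | sort -u | wc -l | tr -d ' '
}

path='wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/'

# Feed the seven input spellings used in the "normal URLs" group.
distinct=$(printf '%s\n' \
  "$path" "m.$path" "www.$path" \
  "http://m.$path" "http://www.$path" \
  "https://m.$path" "https://www.$path" | count_distinct)

echo "distinct outputs: $distinct"   # prints "distinct outputs: 1"
```

A count other than 1 would mean at least one spelling is transformed differently from the rest.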
#!/bin/bash
## The sed substitution to use for all detail page transforms is:
regex='s/([^:]{4,5}:\/\/)?((www|m)\.)?(.*)/http:\/\/m.\4?embed=copyright/'
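## It tolerates an optional 4 or 5 character protocol (http:// or https://)
## and an optional www. or m. subdomain, so every variant maps to the same http://m. URL.
## It does not account for preexisting query strings: a URL that already has one
## would end up with a second "?".
## Handling that would require 2 separate regexes, and it is not likely to occur in our feed.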
## Here is the test suite.
sed -E -e "$regex" << 'EOF'
########## normal URLs ##########
wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/
m.wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/
www.wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/
http://m.wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/
http://www.wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/
https://m.wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/
https://www.wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/
########## URLs with www in them ##########
wsbtv.com/news/news/www.minute-minute-updates-winter-storm/nc6BK/
m.wsbtv.com/news/news/www.minute-minute-updates-winter-storm/nc6BK/
www.wsbtv.com/news/news/www.minute-minute-updates-winter-storm/nc6BK/
http://m.wsbtv.com/news/news/www.minute-minute-updates-winter-storm/nc6BK/
http://www.wsbtv.com/news/news/www.minute-minute-updates-winter-storm/nc6BK/
https://m.wsbtv.com/news/news/www.minute-minute-updates-winter-storm/nc6BK/
https://www.wsbtv.com/news/news/www.minute-minute-updates-winter-storm/nc6BK/
########## URLs with http:// in them ##########
wsbtv.com/news/news/http://www.minute-minute-updates-winter-storm/nc6BK/
m.wsbtv.com/news/news/http://www.minute-minute-updates-winter-storm/nc6BK/
www.wsbtv.com/news/news/http://www.minute-minute-updates-winter-storm/nc6BK/
http://m.wsbtv.com/news/news/http://www.minute-minute-updates-winter-storm/nc6BK/
http://www.wsbtv.com/news/news/http://www.minute-minute-updates-winter-storm/nc6BK/
https://m.wsbtv.com/news/news/http://www.minute-minute-updates-winter-storm/nc6BK/
https://www.wsbtv.com/news/news/http://www.minute-minute-updates-winter-storm/nc6BK/
EOF
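
The comments in the script note that preexisting query strings are not handled and would need separate regexes. A minimal sketch of what that could look like, assuming the embed parameter should be joined with `&` when the URL already contains a `?`; the function name `to_mobile_embed` and the `?page=2` input are hypothetical, and URL fragments (`#...`) are not considered:

```shell
# to_mobile_embed (hypothetical name): reads URLs on stdin, one per line.
to_mobile_embed() {
  # 1st expression: normalize protocol and subdomain, as in the original regex,
  #                 but without appending the query string yet.
  # 2nd expression: lines that already contain "?" get "&embed=copyright".
  # 3rd expression: lines with no "?" get "?embed=copyright".
  # The order matters: after the 2nd expression runs, the line still contains
  # "?", so the 3rd expression skips it.
  sed -E \
    -e 's/([^:]{4,5}:\/\/)?((www|m)\.)?(.*)/http:\/\/m.\4/' \
    -e '/\?/s/$/\&embed=copyright/' \
    -e '/\?/!s/$/?embed=copyright/'
}

base='www.wsbtv.com/news/news/minute-minute-updates-winter-storm/nc6BK/'
plain=$(echo "$base" | to_mobile_embed)
query=$(echo "https://$base?page=2" | to_mobile_embed)

echo "$plain"
echo "$query"
```

For input without a query string this should produce the same result as the original single regex; only URLs that already carry a `?` are treated differently.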