@dmonagha
Last active March 28, 2018 14:46
Search and rescue
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Lorem Ipsum</title>
<link rel="stylesheet" href="css/style.css">
<!--[if IE]>
<script src="http://html5shiv.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
</head>
<body id="home">
<h1>Lorem Ipsum</h1>
<p>Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum tortor quam, feugiat vitae, +18004384357 eget, tempor sit amet, ante. 18004384357 eu libero sit amet quam egestas semper. Aenean ultricies mi vitae est. Mauris placerat eleifend leo. Quisque sit amet est et sapien ullamcorper pharetra. Vestibulum erat wisi, condimentum sed, commodo vitae, ornare 8004384357 amet, wisi. Aenean fermentum, elit eget +1.800.438.4357 condimentum, eros ipsum rutrum orci, sagittis tempus lacus enim ac dui. Donec non enim in turpis pulvinar facilisis. Ut felis. Praesent dapibus, neque id cursus faucibus, tortor neque egestas augue, eu vulputate magna eros eu erat. Aliquam erat volutpat. Nam dui mi, tincidunt quis, accumsan porttitor, facilisis luctus, metus</p>
<p>Pellentesque habitant morbi tristique senectus et 1.800.438.4357 et malesuada fames ac turpis egestas. 800.438.4357 tortor quam, feugiat vitae, ultricies eget, tempor sit amet, ante. Donec eu libero sit amet quam egestas semper. Aenean ultricies mi vitae est. Mauris placerat eleifend leo. +1-800-438-4357 sit amet est et sapien ullamcorper pharetra. 1-800-438-4357 erat wisi, condimentum sed, commodo vitae, ornare sit amet, wisi. Aenean fermentum, elit eget tincidunt condimentum, eros ipsum rutrum orci, sagittis tempus lacus enim ac dui. Donec non enim in turpis pulvinar 800-438-4357. Ut felis. Praesent dapibus, neque +1(800)4384357 cursus faucibus, tortor neque egestas augue, eu vulputate magna eros eu erat. Aliquam erat volutpat. Nam dui mi, tincidunt quis, accumsan porttitor, facilisis luctus, metus</p>
<p>Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum tortor quam, feugiat vitae, 1(800)4384357 eget, tempor sit amet, ante. (800)4384357 eu libero +1 (800) 438 4357 amet 1 (800) 438 4357 egestas semper. Aenean ultricies (800) 438 4357 vitae est. Mauris +1800GETHELP eleifend leo. Quisque sit amet est et sapien ullamcorper pharetra. Vestibulum erat wisi, condimentum sed, commodo vitae, ornare sit amet, wisi. Aenean fermentum, elit eget tincidunt condimentum, eros ipsum rutrum orci, sagittis tempus lacus enim ac dui. Donec non enim in 1800GETHELP pulvinar 800GETHELP. Ut felis. Praesent dapibus, neque id cursus faucibus, tortor neque egestas augue, eu vulputate magna eros eu erat. Aliquam erat volutpat. Nam dui mi, tincidunt quis, accumsan porttitor, facilisis luctus, metus</p>
</body>
</html>
#!/usr/bin/env bash
# This is a working example that relies on GNU sed. On OS X:
# $ brew install gnu-sed --with-default-names
# and update your PATH so that GNU sed is found first.
# Generate some sample HTML files.
lowerlimit=0
upperlimit=100
for i in $(seq $lowerlimit $upperlimit)
do
cp "/tmp/sample.txt" "/var/www/html/sample$i.htm"
done
# Find every .html/.htm file and replace each listed form of the number in place.
find /var/www/html -type f \( -iname \*.html -o -iname \*.htm \) -exec sed -Ei "s#(\+18004384357|\
18004384357|8004384357|\
\+1\.800\.438\.4357|1\.800\.438\.4357|800\.438\.4357|\
\+1-800-438-4357|1-800-438-4357|800-438-4357|\
\+1\(800\)4384357|1\(800\)4384357|\(800\)4384357|\
\+1 \(800\) 438 4357|1 \(800\) 438 4357|\(800\) 438 4357|\
\+1800GETHELP|1800GETHELP|800GETHELP)#202-456-1414#g" {} +
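
The long alternation above can also be written as one pattern built from an optional country code, an area code with or without parentheses, and an optional separator between groups. The following is a minimal sketch of that idea, not the author's method: the function name `normalize_number` and the variable names are hypothetical, and the pattern is slightly broader than the original list because it also accepts mixed separators (for example `1-800.438 4357`). It reads stdin rather than editing files in place.

```bash
#!/usr/bin/env bash
# Hypothetical helper: rewrite the listed forms of 1-800-438-4357 read from stdin.
normalize_number() {
  local sep='[-. ]?'                      # at most one dash, dot, or space
  local prefix="(\\+?1${sep})?"           # optional "+1" or "1" country code
  local area="(\\(800\\)|800)"            # area code with or without parentheses
  local tail="(438${sep}4357|GETHELP)"    # digits, or the uppercase vanity form
  sed -E "s#${prefix}${area}${sep}${tail}#202-456-1414#g"
}

# Example: each input line is reduced to the replacement number.
printf '%s\n' '+1 (800) 438 4357' '800.438.4357' '1800GETHELP' | normalize_number
```

To apply the same expression to files, the `sed -E` command can be passed to `find ... -exec` with `-i`, as in the script above.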
@murphstein

Tedious regex work, but missed some of my test cases:

Line 1: 1-800-GET-HELP blah -- uppercase and dashes
Line 2: 1.800.get.help blah -- lowercase and dots
Line 3: 1 800 GET HELP blah -- uppercase and spaces
Line 4: 1-800-Get-Help blah -- camelcase and dashes
Line 5: 1-800 GetHelp blah -- camelcase and mixed
Line 6: 202-456-1414 blah -- digits and dashes
Line 7: 202-456-1414 blah -- digits and dots
Line 8: 202-456-1414 blah -- digits and spaces
Line 9: 202-456-1414 blah blah -- no separators
Line A: pattern at line end -- 202-456-1414
Neg 1: as a directory name -- /var/www/202-456-1414_files blah
Neg 2: Social Security # -- 800-45-4357 blah
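
One way to cover the cases listed above is sketched below. It assumes GNU sed (the `I` flag for case-insensitive matching is a GNU extension), and it assumes the Neg 1 input was a path such as `/var/www/800-438-4357_files`, since the listing shows only text after replacement. The function name `replace_vanity` and the guard patterns are hypothetical. The guards require a non-word, non-slash character (or line start) before the number and a non-word character (or line end) after it.

```bash
#!/usr/bin/env bash
# Hypothetical helper: case-insensitive replacement with boundary guards (GNU sed).
replace_vanity() {
  local sep='[-. ]?'                           # at most one dash, dot, or space
  local prefix="(\\+?1${sep})?"                # optional "+1" or "1" country code
  local area="(\\(800\\)|800)"                 # area code with or without parentheses
  local tail="(438${sep}4357|get${sep}help)"   # digits or vanity form, any case
  local before='(^|[^[:alnum:]_/])'            # group 1: not inside a word or path
  local after='($|[^[:alnum:]_])'              # group 5: not followed by a word char
  # \1 and \5 put the guard characters back around the replacement.
  sed -E "s#${before}${prefix}${area}${sep}${tail}${after}#\\1202-456-1414\\5#gI"
}

# Example run over a few of the cases from the list.
printf '%s\n' '1-800-GET-HELP blah' '1-800 GetHelp blah' \
  '/var/www/800-438-4357_files blah' '800-45-4357 blah' | replace_vanity
```

Because the trailing guard consumes one character, two numbers separated by a single character would have only the first replaced; a tool with lookaround support (for example `perl -pe`) avoids that limitation.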

@dmonagha
Author

dmonagha commented Mar 27, 2018

I totally missed this comment and am only seeing it now. You'll likely never see this reply either, @murphstein, because of this issue: isaacs/github#21.

My apologies for my performance on the code review; it was my first.