regular expressions to find and replace text on web pages.
This example finds instances of "New Haven" and replaces them with "Ancient Sanctuary".
New Haven => Ancient Sanctuary
Write your own string and do find-and-replace there
var mystring = "New Haven is a great city.";
mystring.replace("New Haven", "Ancient Sanctuary");
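One detail worth noting here: replace does not change the original string, it returns a new one. A minimal sketch of that behavior, where the variable name `replaced` is only an illustrative choice:

```javascript
// JavaScript strings are immutable, so replace() returns a new string
// and leaves the original untouched.
var mystring = "New Haven is a great city.";
var replaced = mystring.replace("New Haven", "Ancient Sanctuary"); // `replaced` is an illustrative name

console.log(mystring); // the original sentence, unchanged
console.log(replaced); // the sentence with the replacement applied
```

In the browser console the returned string is printed automatically, which is why the one-off call works without storing the result.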
Get the entire page into a string
On the New Haven Wikipedia page, find all instances of "New Haven" and replace them with "Ancient Sanctuary". Then replace the current HTML of the page with the result.
document.body.innerHTML = document.body.innerHTML.replace("New Haven", "Ancient Sanctuary");
Hint: assigning to document.body.innerHTML rebuilds the whole page from the new string, and it doesn't like being replaced more than once. You may want to refresh the page between attempts.
Whoops! Looks like it only replaced ONE of them. When the first argument to replace is a plain string, only the first match is replaced.
You can use a Regular Expression instead to find and replace ALL instances of "New Haven". In a regular expression, g is a really common flag that stands for 'global'. For example, /New Haven/g will find all instances of "New Haven".
document.body.innerHTML = document.body.innerHTML.replace(/New Haven/g, "Ancient Sanctuary");
Congratulations, you've learned a very powerful trick! :D
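To see the difference between the string version and the /g version without touching a page, here is a small sketch on a made-up sample string. The names `sample`, `firstOnly` and `everyMatch` are assumptions for illustration only:

```javascript
// Made-up sample text containing "New Haven" twice.
var sample = "New Haven is in Connecticut. New Haven has a harbor.";

// A plain string pattern replaces only the first match.
var firstOnly = sample.replace("New Haven", "Ancient Sanctuary");

// A regular expression with the g (global) flag replaces every match.
var everyMatch = sample.replace(/New Haven/g, "Ancient Sanctuary");

console.log(firstOnly);
console.log(everyMatch);
```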
New goal: go to the Cloud Computing Wikipedia page, find every "cloud" and replace it with "butt".
document.body.innerHTML = document.body.innerHTML.replace(/cloud/g, "butt");
Whoops! That only gets lowercase clouds. Let's catch the capitalized ones too. (We can be lazy and just do two find-and-replaces.)
document.body.innerHTML = document.body.innerHTML.replace(/cloud/g, "butt").replace(/Cloud/g, "Butt");
Success! teehee :)
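The two chained calls could also be written as a single pass. This is only one possible sketch, not the approach used in the session: it relies on the fact that replace accepts a function as its second argument, and the names `swaps` and `swapClouds` are hypothetical.

```javascript
// Hypothetical lookup table: each spelling we search for and what it becomes.
var swaps = { cloud: "butt", Cloud: "Butt" };

function swapClouds(text) {
  // The pattern matches either spelling; the callback receives the matched
  // text and returns its replacement from the table.
  return text.replace(/cloud|Cloud/g, function (match) {
    return swaps[match];
  });
}

console.log(swapClouds("Cloud computing uses the cloud."));
```

On a page this would be used as document.body.innerHTML = swapClouds(document.body.innerHTML); like the chained version, it still leaves "CLOUD" alone.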
More robust find-and-replace
For this session I'm satisfied with our find-and-replace. If we wanted to, what could we do to make this find-and-replace more robust? Here are just some ideas:
- Leave capitalization the same as we found it (notice we didn't catch "CLOUD" in our example above)
- Only find-and-replace text that displays to the user (not links)
- Make our find-and-replace more specific (should we replace "cloudy" with "butty"?)
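As a rough sketch of how these three ideas might look in code (one of several possible approaches): the i flag ignores case, \b marks a word boundary so "cloudy" is skipped, and a TreeWalker visits only text nodes so attributes such as link URLs stay untouched. The function names `matchCase`, `replaceWord` and `replaceInTextNodes` are hypothetical, and the code assumes the search word contains no regex special characters.

```javascript
// Give the replacement the same capitalization style as the text we found.
function matchCase(found, replacement) {
  if (found === found.toUpperCase()) {
    return replacement.toUpperCase(); // CLOUD -> BUTT
  }
  if (found[0] === found[0].toUpperCase()) {
    return replacement[0].toUpperCase() + replacement.slice(1); // Cloud -> Butt
  }
  return replacement; // cloud -> butt
}

// Replace whole-word matches only, ignoring case.
// Assumption: `word` has no regex special characters.
function replaceWord(text, word, replacement) {
  var pattern = new RegExp("\\b" + word + "\\b", "gi");
  return text.replace(pattern, function (found) {
    return matchCase(found, replacement);
  });
}

// Browser only: walk the text nodes under `root` and rewrite their contents,
// leaving tags and attributes (for example href values) as they are.
function replaceInTextNodes(root, word, replacement) {
  var walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);
  var node;
  while ((node = walker.nextNode())) {
    node.nodeValue = replaceWord(node.nodeValue, word, replacement);
  }
}

console.log(replaceWord("Cloud, cloud, CLOUD, cloudy", "cloud", "butt"));
```

In the browser console this could be run as replaceInTextNodes(document.body, "cloud", "butt"). Two trade-offs to keep in mind: the word boundary also skips plurals like "clouds", and text nodes include the contents of script and style elements, so a more careful version would skip those.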