@jacurtis jacurtis/regex.md
Last active May 13, 2019

Most Useful Regex's

Useful Regular Expressions

These are the most useful regular expressions that I find myself using on a regular basis.


URLs

Test to see if a string is a valid website address or not.

All URLs

This matches example.com, www.example.com, http://example.com, and http://www.example.com, and it also accepts a path or query string at the end of any of these combinations, such as example.com/?page=1 or www.example.com/cool-post/writing.html/?some=thing&cool=stuff. It works with any TLD that is 2 or 3 characters long, so it does NOT work with longer TLDs like .info, .today, or .ninja.

This is the regex I use most often because it lets users type whatever they want into a website form field. Some users copy and paste from the browser, which brings over http:// (or even https://); some people prepend www. while others do not. So this allows flexibility while still checking for a valid URL. Of course, you will then need to normalize the URLs on the back end of your server so they are consistent before you store them. But that is an issue for another day and another gist.

/^(http(s?):\/\/)?(www\.)?[a-zA-Z0-9\.\-\_]+(\.[a-zA-Z]{2,3})+(\/[a-zA-Z0-9\_\-\s\.\/\?\%\#\&\=]*)?$/
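As a rough illustration of how this check and the follow-up clean-up might look, here is a minimal Python sketch. It assumes Python's built-in `re` module and writes the optional `www.` group with a plain `?`. The names `URL_RE`, `is_url`, and `normalize_url` are made up for this example, and defaulting to http:// is only one possible back-end convention, not something this gist prescribes.

```python
import re

# Python port of the URL pattern above (delimiters dropped, needless escapes removed).
URL_RE = re.compile(
    r"^(http(s?)://)?(www\.)?[a-zA-Z0-9._-]+(\.[a-zA-Z]{2,3})+"
    r"(/[a-zA-Z0-9_\-\s./?%#&=]*)?$"
)


def is_url(text: str) -> bool:
    """Return True if the whole string matches the URL pattern."""
    return URL_RE.fullmatch(text) is not None


def normalize_url(text: str) -> str:
    """Hypothetical clean-up step: prepend http:// when the scheme was left off."""
    match = URL_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a URL this pattern accepts: {text!r}")
    # Group 1 is the optional scheme; it is None when no http(s):// was typed.
    return text if match.group(1) else "http://" + text


for candidate in ["example.com", "http://www.example.com", "example.ninja"]:
    print(candidate, is_url(candidate))
```

Storing only the output of something like `normalize_url` means every saved URL carries a scheme, whichever way the user typed it.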

Phone Numbers

Test to see if a string is a valid phone number format. This gets tricky because there are so many different phone number formats around the world. In my experience, it is best to narrow a regex down to the smallest region you can. Occasionally I have to use a worldwide phone number regex to allow users to enter any number. The problem is that as you open up the regex to so many formats, it becomes sloppier: it might accept an invalid American number, for example, because it is trying to be open enough to allow an Indian phone number as well. Whenever possible, I highly suggest that you narrow your regexes down to the country you are targeting for the highest level of accuracy.

All Phone Numbers

As warned above, this is the most universal of all the regexes, but in trying to please everybody it might let invalid numbers slip through the cracks. So use this only when no other option is available. Regexes written for a specific country are far more reliable for that country. If you can get away with a single country, I suggest you scroll down, find that country, and use its regex instead.

/\+?([\d|\(][\h|\(\d{3}\)|\.|\-|\d]{4,}\d)/
  • The above expression will allow almost any number combination through, for example +80 9800 98 91, (801) 112-1123, or +12 43 42 68 34 83, or even lazy number typers who just use digits like 8011121122, 801-112-1122, or 801.112.1122.
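To show how permissive this pattern is in practice, here is a small Python sketch. It is an adaptation rather than an exact copy: Python's `re` module has no `\h`, so horizontal whitespace is approximated with a space and a tab (an assumption), and `ANY_PHONE_RE` and `find_phone` are names invented for this example.

```python
import re

# Python port of the worldwide pattern above. Assumption: PCRE's \h is
# approximated with a literal space and tab. Inside [...] the characters
# | { } are plain literals, exactly as they are in the original pattern.
ANY_PHONE_RE = re.compile(r"\+?([\d|(][ \t|(\d{3}).\-]{4,}\d)")


def find_phone(text: str):
    """Return the first phone-like run found anywhere in text, or None."""
    match = ANY_PHONE_RE.search(text)
    return match.group(0) if match else None


print(find_phone("+80 9800 98 91"))
print(find_phone("call (801) 112-1123 today"))
print(find_phone("123456"))  # any run of six digits is enough to match
```

Because the pattern is not anchored, it finds a number inside a longer string, and anything that looks like six or more digits and separators gets through.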

American Phone Numbers

Probably my most frequently used regex of all time. Phone numbers are inherently complicated to check because there are so many formats. Also, because I am American, I find myself most commonly checking American phone numbers. Like everything in America, we are probably trying to be as different from every other country as possible. The formal format is (801) 123-4567, and this regex will verify that. It will also verify other formats such as 801-123-4567 or 801.123.4567. The important thing is that the number has an area code (meaning a 10 digit number as opposed to a 7 digit number). I suggest keeping that requirement because area codes are all but required in America these days. Yes, I remember the days when we used to just give our neighbors the last 4 digits of our phone numbers and only had to type 7 digits into the phone. But those days are past, and a 10 digit number is required for basically all phone calls. This regex will even verify a lazy user (we have to truly embrace these users in America) who just types 10 digits without formatting, like 8011234567. It even matches when people use weird combinations and leave out spaces, like 801-123.1253. This regex has you covered.

/^\W?\d*?\W*?(?<area>\d{3})\W*?(?<group1>\d{3})\W*?(?<group2>\d{4})\W*?$/
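Since the pattern already names its three groups, it can double as a reformatter. The following is a minimal Python sketch under a couple of assumptions: Python spells named groups `(?P<name>...)` rather than `(?<name>...)`, and `US_PHONE_RE` and `format_us_phone` are hypothetical names chosen for this example.

```python
import re

# Python port: named groups are written (?P<name>...) instead of (?<name>...).
US_PHONE_RE = re.compile(
    r"^\W?\d*?\W*?(?P<area>\d{3})\W*?(?P<group1>\d{3})\W*?(?P<group2>\d{4})\W*?$"
)


def format_us_phone(text: str):
    """Return the number as (AAA) BBB-CCCC, or None if it does not match."""
    match = US_PHONE_RE.match(text)
    if match is None:
        return None
    return "({area}) {group1}-{group2}".format(**match.groupdict())


for raw in ["801-123-4567", "801.123.4567", "8011234567", "801-123.1253", "123-4567"]:
    print(raw, "->", format_us_phone(raw))
```

Every accepted input comes out in the same formal shape, while the 7 digit number is rejected because it has no area code.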

American Phone Numbers (With Capture Groups)

This regex follows a similar pattern to the one above, but it is incredibly useful if you want to capture each part of the phone number for parsing or reformatting, including an optional country code. It uses named groups written in the (?'name'...) form, which PCRE and .NET accept; if your language does not support named groups, you can remove the names. It will even capture an extension at the end if one is added.

/^(?'countryCode'\+?1)?[-.\s]?\(?(?'areaCode'\d{3})[)\-.\s]{0,2}(?'phone1'\d{3})[-.\s]?(?'phone2'\d{4})\s*(?'extension'(?:x|ex|ext|extension|\s)\s*\d+)?$/
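Here is a hedged Python sketch of pulling the parts out of a number with this pattern. It assumes Python's `(?P<name>...)` spelling for named groups, and it writes the class after the area code as `[)\-.\s]` so the hyphen is a literal rather than the middle of a range. `PHONE_PARTS_RE` and `parse_us_phone` are names invented for this example.

```python
import re

# Python port: (?'name'...) becomes (?P<name>...). The hyphen in the class
# after the area code is escaped so that ")-." is not read as a range.
PHONE_PARTS_RE = re.compile(
    r"^(?P<countryCode>\+?1)?[-.\s]?\(?(?P<areaCode>\d{3})[)\-.\s]{0,2}"
    r"(?P<phone1>\d{3})[-.\s]?(?P<phone2>\d{4})\s*"
    r"(?P<extension>(?:x|ex|ext|extension|\s)\s*\d+)?$"
)


def parse_us_phone(text: str):
    """Return a dict of the named parts, or None if the number does not match."""
    match = PHONE_PARTS_RE.match(text)
    return match.groupdict() if match else None


print(parse_us_phone("+1 (801) 123-4567 ext 89"))
print(parse_us_phone("801.123.4567"))
```

Parts the user left out, such as the country code or extension, come back as `None`, so the caller can decide how to fill them in.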

Email Addresses

This one is easier to test against. All emails follow the same pattern. At a simple level it is username@domain.tld, so we can test for that easily, and you will find that as the first option here. Unfortunately, it does tend to get more complicated than that, thanks to our friends over at the Googleplex. While many people do not know this, there are a lot of additional things you can do with your email address if you have Gmail. You can use + to append basically anything to your username (helpful when filtering emails later or when you need a new unique address). You can also add . characters to your username for much the same effect, since Gmail ignores them. If you suspect users will do this, or you want to allow it, use the advanced email regex further down.

Basic Emails

This is your run of the mill email address.

/^([a-z0-9_\.-]+\@[\da-z-]+\.[a-z\.]{2,6})$/

Advanced Emails

This matches advanced emails that use + (as well as .) in the username, and it also matches when someone uses a subdomain as the domain of their email address. Allowing these emails can open you up to spam, but it makes us nerds happy.

/^([a-z0-9_\.\+-]+\@[\da-z\.-]+\.[a-z\.]{2,6})$/
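To make the difference between the two email patterns concrete, here is a short Python comparison. It is only a sketch: both patterns list lowercase letters only, so passing `re.IGNORECASE` is an assumption about what you want, and `BASIC_EMAIL_RE`, `ADVANCED_EMAIL_RE`, and the sample addresses are made up for this example.

```python
import re

# Python ports of the two email patterns (needless escapes removed).
# re.IGNORECASE is an assumption: the originals only list lowercase letters.
BASIC_EMAIL_RE = re.compile(r"^([a-z0-9_.-]+@[\da-z-]+\.[a-z.]{2,6})$", re.IGNORECASE)
ADVANCED_EMAIL_RE = re.compile(r"^([a-z0-9_.+-]+@[\da-z.-]+\.[a-z.]{2,6})$", re.IGNORECASE)

samples = [
    "user@example.com",
    "first.last@example.com",
    "user+news@example.com",    # plus-tagged username
    "user@mail.example.com",    # subdomain
    "user@example.technology",  # ending longer than 6 characters
]

for address in samples:
    basic = bool(BASIC_EMAIL_RE.match(address))
    advanced = bool(ADVANCED_EMAIL_RE.match(address))
    print(f"{address:26} basic={basic} advanced={advanced}")
```

The plus-tagged and subdomain addresses pass only the advanced pattern, and both patterns reject an ending longer than 6 characters.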