@andycwilliams
Last active September 8, 2021 21:34
Regex tutorial on how to match a URL

How to Match a URL - Andy's Mode of Code

This regular expression (regex) tutorial walks through the building blocks of a pattern that matches a URL. Regular expressions are powerful, flexible tools for finding, or finding and replacing, patterns in strings.

Summary

The following regular expression matches a URL: /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/

Using this pattern, we can check whether a string has the general shape of a website address. It checks format only, not whether the address actually exists. This lets the programmer validate URLs entered into practically any app or program.
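As a minimal sketch of how the pattern might be used for validation, the snippet below wraps it in a small helper. The names urlPattern and isUrl are hypothetical and chosen only for illustration; the sample inputs are assumptions, not cases from the original tutorial.

```javascript
// The pattern from the Summary, stored as a JavaScript regex literal.
const urlPattern = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/;

// Hypothetical helper: returns true when the whole string looks like a URL.
function isUrl(text) {
  return urlPattern.test(text);
}

console.log(isUrl("https://github.com/andycwilliams")); // true
console.log(isUrl("github.com"));                       // true (protocol is optional)
console.log(isUrl("not a url"));                        // false
console.log(isUrl("http://localhost"));                 // false (no dot before a top-level domain)
```

The last case shows that the pattern expects a name, a dot, and a top-level domain, so some addresses that work in a browser are still rejected.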

Regex Components

The sequence of characters may look random and meaningless at first glance. However, by breaking it down piece by piece, we can follow the underlying logic and see what each part contributes.

Anchors

The URL-matching pattern is bracketed by two anchors: ^ at the start and $ at the end. The rest of the pattern sits between them. Using both tells the regex engine that the entire string, from first character to last, must match the pattern. Each anchor can also be used on its own: with only ^, the string must start with the pattern; with only $, the string must end with it.
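To see the anchors in isolation, here is a short sketch using much simpler patterns than the URL regex. The variable names and sample strings are made up for illustration.

```javascript
const startsWithHttp = /^http/;  // only ^ : the string must start with "http"
const endsWithCom = /com$/;      // only $ : the string must end with "com"
const exactlyHttp = /^http$/;    // both  : the whole string must be "http"

console.log(startsWithHttp.test("http://example.com"));     // true
console.log(startsWithHttp.test("see http://example.com")); // false
console.log(endsWithCom.test("example.com"));               // true
console.log(endsWithCom.test("example.com/about"));         // false
console.log(exactlyHttp.test("http"));                      // true
console.log(exactlyHttp.test("https"));                     // false
```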

Quantifiers

Quantifiers tell the regex engine how many times the preceding character, group, or bracket expression may repeat. A ? makes it optional (zero or one time), so https? matches both http and https. A * matches zero or more times, and a + matches one or more times. Curly brackets {} set explicit bounds: {2,6} matches between two and six times.
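The quantifiers that appear in the URL regex can be tried one at a time. This is a sketch with simplified patterns; the variable names and inputs are assumptions for illustration.

```javascript
const optionalS = /^https?$/;     // ? : the "s" may appear zero or one time
const tldLength = /^[a-z]{2,6}$/; // {2,6} : between two and six lowercase letters
const oneOrMore = /^\d+$/;        // + : at least one digit
const zeroOrMore = /^\d*$/;       // * : any number of digits, including none

console.log(optionalS.test("http"));     // true
console.log(optionalS.test("https"));    // true
console.log(optionalS.test("httpss"));   // false
console.log(tldLength.test("io"));       // true
console.log(tldLength.test("museum"));   // true
console.log(tldLength.test("website"));  // false (seven letters)
console.log(oneOrMore.test(""));         // false
console.log(zeroOrMore.test(""));        // true
```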

Grouping Constructs

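Grouping constructs are written with regular parentheses (). They bundle part of the pattern into a single unit, so that a quantifier can apply to the whole unit, and they capture the text that part matched. Our regex uses four groups: an optional protocol (https?:\/\/), the domain name ([\da-z\.-]+), the top-level domain ([a-z\.]{2,6}), and the path ([\/\w \.-]*).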
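Because each pair of parentheses captures what it matched, the pieces of a URL can be read back after a match. The sketch below uses exec and array destructuring; the variable names are illustrative and the sample URL is an assumption.

```javascript
const urlPattern = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/;

// exec returns an array: index 0 is the full match, 1..n are the groups.
const match = urlPattern.exec("https://github.com/andycwilliams");
const [, protocol, domain, tld] = match;

console.log(protocol); // "https://"
console.log(domain);   // "github"
console.log(tld);      // "com"

// An optional group that did not take part in the match is undefined.
const bare = urlPattern.exec("github.com");
console.log(bare[1]);  // undefined
```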

Bracket Expressions

Bracket expressions match any single character listed within []. For example, [abc] matches a, b, or c. A hyphen between two characters defines a range: [0-9]% matches a digit followed by a percent sign, and [a-zA-Z] matches any letter regardless of case. Our regex uses three bracket expressions: [\da-z\.-] for the domain name, [a-z\.] for the top-level domain, and [\/\w \.-] for the path.
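The three bracket expressions mentioned in this section can be checked directly. This is a small sketch; the test strings are invented for illustration.

```javascript
const anyOfAbc = /[abc]/;          // a, b, or c anywhere in the string
const digitThenPercent = /[0-9]%/; // a digit immediately followed by "%"
const lettersOnly = /^[a-zA-Z]+$/; // only letters, upper or lower case

console.log(anyOfAbc.test("cup"));             // true  (contains "c")
console.log(anyOfAbc.test("dog"));             // false
console.log(digitThenPercent.test("50% off")); // true
console.log(digitThenPercent.test("% off"));   // false
console.log(lettersOnly.test("GitHub"));       // true
console.log(lettersOnly.test("Git Hub"));      // false (space is not a letter)
```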

Character Classes

This regex uses two character classes. The first is \d, which matches any single digit (0 through 9). The second is \w, which matches any single word character: a letter, a digit, or an underscore.
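A brief sketch of what \d and \w accept and reject; the variable names and inputs are illustrative assumptions.

```javascript
const singleDigit = /^\d$/; // exactly one digit
const wordChars = /^\w+$/;  // one or more letters, digits, or underscores

console.log(singleDigit.test("7"));     // true
console.log(singleDigit.test("x"));     // false
console.log(wordChars.test("andy_w1")); // true
console.log(wordChars.test("andy-w"));  // false (a hyphen is not a word character)
```

This is why the URL regex lists the hyphen separately inside its bracket expressions: \w alone would not allow it.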

The OR Operator

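This regex does not use the OR operator, |, which matches either the pattern on its left or the pattern on its right. Instead, it relies on square brackets [], which behave like an OR over single characters: the regex matches any one of the characters placed between the brackets.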
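For comparison, the sketch below places a bracket expression next to the alternation operator |, which the URL regex itself does not use. The patterns and inputs are made up for illustration.

```javascript
const bracketOr = /^gr[ae]y$/;     // OR over single characters: "a" or "e"
const pipeOr = /^(gray|grey)$/;    // OR over whole alternatives
const protocolOr = /^(http|ftp)$/; // | can choose between multi-character options

console.log(bracketOr.test("gray"));  // true
console.log(bracketOr.test("grey"));  // true
console.log(bracketOr.test("groy"));  // false
console.log(pipeOr.test("grey"));     // true
console.log(protocolOr.test("ftp"));  // true
console.log(protocolOr.test("ssh"));  // false
```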

Flags

Flags are optional and are placed after the closing slash to change how the search is conducted; this regex uses none. For instance, g (global) finds every match rather than stopping at the first, and i (ignore case) makes the match case-insensitive.
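The effect of g and i can be sketched as follows. The sample sentence is invented, and rebuilding the URL pattern with the i flag is an assumption added for illustration, not something the original regex does.

```javascript
const text = "Visit GitHub.com or github.com";

// g alone finds every lowercase match; adding i also accepts other casing.
console.log(text.match(/github/g));  // [ 'github' ]
console.log(text.match(/github/gi)); // [ 'GitHub', 'github' ]

// The URL pattern has no flags, so uppercase letters in the domain fail.
const urlPattern = /^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/;
const urlPatternIgnoreCase = new RegExp(urlPattern.source, "i");

console.log(urlPattern.test("GitHub.com"));           // false
console.log(urlPatternIgnoreCase.test("GitHub.com")); // true
```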

Character Escapes

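Character escapes use the backslash \ to turn a special character into a literal. For example, \. matches an actual period rather than any character, and \/ matches a forward slash without ending a regex literal that is delimited by slashes. A double backslash \\ escapes the backslash itself, so it matches a literal backslash.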
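The difference an escape makes is easiest to see side by side. This is a minimal sketch; the patterns and strings are illustrative assumptions.

```javascript
const anyChar = /^a.c$/;         // unescaped . matches any single character
const literalDot = /^a\.c$/;     // \. matches only a period
const literalSlash = /^\/docs$/; // \/ matches a forward slash
const literalBackslash = /^C:\\temp$/; // \\ matches one backslash

console.log(anyChar.test("abc"));     // true
console.log(literalDot.test("abc"));  // false
console.log(literalDot.test("a.c"));  // true
console.log(literalSlash.test("/docs")); // true

// In a JavaScript string, "\\" is also how a single backslash is written.
console.log(literalBackslash.test("C:\\temp")); // true
```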

Author

Andy Williams is a future graduate of the University of Oregon Full-Stack Development Boot Camp. GitHub: https://github.com/andycwilliams
