# urlRegexTutorial-ByMakWils

`/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/`

This gist describes the components of a regular expression, or regex for short, that matches a URL. A regex is a sequence of characters that defines a specific search pattern.

## Summary

The following regex matches common web addresses: an optional `http://` or `https://` scheme, a domain name, a top-level domain, and an optional path. It is not a complete URL validator. As written, it rejects URLs that contain a port, a query string, or uppercase letters in the domain.

`/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/`
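
As a rough check of what the pattern accepts, here is a minimal sketch in Python. It assumes the standard `re` module and drops the surrounding `/` delimiters, which Python does not use. The helper name `looks_like_url` and the sample inputs are made up for illustration.

```python
import re

# The tutorial's regex without the / delimiters; Python takes the
# pattern as a raw string instead of a regex literal.
URL_PATTERN = re.compile(
    r"^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$"
)


def looks_like_url(text: str) -> bool:
    """Return True when the whole string matches the tutorial's pattern."""
    return URL_PATTERN.match(text) is not None


# Hypothetical sample inputs, chosen only for illustration.
samples = [
    "https://example.com/docs/index.html",  # scheme, domain, TLD, path
    "example.com",                          # the scheme is optional
    "https://example.com/search?q=regex",   # "?" and "=" are not in the path set
    "Example.com",                          # uppercase is not in the domain set
]

for candidate in samples:
    print(candidate, looks_like_url(candidate))
```

The first two samples match and the last two do not, which is why the pattern is better treated as a teaching example than as a strict validator.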


## Table of Contents

- [Anchors](#anchors)
- [Quantifiers](#quantifiers)
- [Character Classes](#character-classes)
- [Grouping and Capturing](#grouping-and-capturing)
- [Bracket Expressions / OR Operator](#bracket-expressions--or-operator)
- [Greedy and Lazy Match](#greedy-and-lazy-match)

## Regex Components
The expression may seem obscure at first, so this tutorial breaks it down piece by piece.

### Anchors
Anchors match a position within a string, not a character.

- `^`: This anchor matches the beginning of a string.
- `$`: This anchor matches the end of a string.
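
To see what the anchors contribute, the sketch below compares the pattern with and without `^` and `$`. It assumes Python's `re` module, and the sample sentence is hypothetical.

```python
import re

# Same pattern body, once with the ^ and $ anchors and once without them.
anchored = re.compile(
    r"^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$"
)
unanchored = re.compile(
    r"(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?"
)

text = "see https://example.com, thanks"  # hypothetical sample sentence

# With anchors, the URL must span the entire string, so this finds nothing.
anchored_hit = anchored.search(text)

# Without anchors, the pattern may match a URL embedded in other text.
unanchored_hit = unanchored.search(text)

print(anchored_hit)
print(unanchored_hit.group(0))
```

With the anchors in place the regex answers "is this whole string a URL?", while the unanchored variant answers "does this string contain something URL-like?".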

### Quantifiers
Quantifiers specify how many times the preceding token must match. By default, quantifiers are greedy, meaning they match as many characters as possible.

This regex uses four kinds of quantifiers, each placed after the token it repeats:

- `?`: This quantifier matches 0 or 1 of the preceding token.
- `*`: This quantifier matches 0 or more of the preceding token.
- `{2,6}`: This quantifier matches between 2 and 6 of the preceding token.
- `+`: This quantifier matches 1 or more of the preceding token.
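
The quantifiers can be tried in isolation on fragments of the URL regex. This is a small sketch assuming Python's `re.fullmatch`. The helper `full` and the test strings are illustrative only.

```python
import re


def full(pattern: str, text: str) -> bool:
    """Return True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


cases = [
    (r"https?", "http"),               # ? : "s" appears 0 times
    (r"https?", "https"),              # ? : "s" appears 1 time
    (r"[\da-z\.-]+", ""),              # + : needs at least 1 character
    (r"[\da-z\.-]+", "example"),       # + : 1 or more characters
    (r"[a-z\.]{2,6}", "c"),            # {2,6} : 1 character is too few
    (r"[a-z\.]{2,6}", "com"),          # {2,6} : 3 characters is in range
    (r"[a-z\.]{2,6}", "company"),      # {2,6} : 7 characters is too many
    (r"[\/\w \.-]*", ""),              # * : 0 characters is allowed
    (r"[\/\w \.-]*", "/docs/a.html"),  # * : any number of characters
]

outcomes = [full(pattern, text) for pattern, text in cases]

for (pattern, text), ok in zip(cases, outcomes):
    print(f"{pattern!r} on {text!r}: {ok}")
```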

### Character Classes
- `\d`: This is a digit token and matches any digit character (0-9).
- `\w`: This is a word token and matches any word character: letters, digits, and the underscore.
- `a-z`: This is a range and matches any single character from "a" to "z". It is case sensitive, so uppercase letters do not match.
- `\.`: This is an escaped character and matches a literal "." character.
- `\/`: This is an escaped character and matches a literal "/" character.
- `-`: This is a literal character. Placed at the end of a bracket expression, as in this regex, it matches a "-" character instead of forming a range.
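
The tokens above can be checked one at a time against a short probe string. This sketch assumes Python's `re.findall`. The probe string and the dictionary labels are made up for illustration. Note that Python's `\d` and `\w` also accept non-ASCII digits and letters unless `re.ASCII` is set; the probe here is plain ASCII.

```python
import re

sample = "Ab3_-./"  # hypothetical probe string with one of each kind

tokens = {
    "digit": r"\d",
    "word": r"\w",
    "lowercase range": r"[a-z]",
    "escaped dot": r"\.",
    "escaped slash": r"\/",
    "hyphen": r"-",
}

# Map each token name to the characters of the sample that it matches.
matched = {name: re.findall(pattern, sample) for name, pattern in tokens.items()}

for name, hits in matched.items():
    print(name, hits)
```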

### Grouping and Capturing
Groups allow you to combine a sequence of tokens to handle them together.
'()' - Parentheses group multiple tokens together and create a capture group for extracting a substring.

This regex has 4 capturing groups:

1. `(https?:\/\/)` - the scheme. It matches the literal characters "http", an optional "s", and then "://". The `?` after the group makes the whole scheme optional.
2. `([\da-z\.-]+)` - the domain name: one or more digits, lowercase letters, dots, or hyphens.
3. `([a-z\.]{2,6})` - the top-level domain, e.g. .com, .gov, etc.
4. `([\/\w \.-]*)` - the file path. The `*` after the group allows it to repeat.

Each part of these capturing groups is described further in the other sections of this gist.
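
The capture groups can be read back after a match. The sketch below assumes Python's `re` module and a hypothetical sample URL. Because the fourth group is itself repeated by the outer `*`, it keeps only its last repetition, so the sketch reads the first three groups and slices the path out of the input instead.

```python
import re

URL_PATTERN = re.compile(
    r"^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$"
)

url = "https://www.example.com/docs/index.html"  # hypothetical sample input
match = URL_PATTERN.match(url)

scheme = match.group(1)  # group 1: the optional scheme
domain = match.group(2)  # group 2: everything before the last dot of the host
tld = match.group(3)     # group 3: the top-level domain

# Group 4 sits under an outer *, so it only remembers its last repetition.
# Slicing after the end of group 3 recovers everything the path part consumed.
path = url[match.end(3):]

print(scheme, domain, tld, path)
```

Note that the subdomain "www" lands in group 2 together with the domain, since the regex splits the host only at its last dot.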

### Bracket Expressions / OR Operator
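`[]` - Brackets define a character set, which matches any single character listed in the set, whether it is given literally, as a range, or as a character class. In that sense a set works like an OR between its members. This regex uses three sets: `[\da-z\.-]`, `[a-z\.]`, and `[\/\w \.-]`.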
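
To illustrate the OR-like behavior, this sketch compares the set `[a-z\.]` with the same choice written out using `|`. It assumes Python's `re` module, and the probe string is hypothetical.

```python
import re

char_set = re.compile(r"[a-z\.]")          # set form used in the URL regex
alternation = re.compile(r"(?:[a-z]|\.)")  # the same choice spelled out with |

probe = "c.O-9"  # hypothetical probe: lowercase, dot, uppercase, hyphen, digit

set_hits = char_set.findall(probe)
alt_hits = alternation.findall(probe)

# A set consumes exactly one character, so two characters need a quantifier.
one_char_only = char_set.fullmatch("co") is None

print(set_hits, alt_hits, one_char_only)
```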

### Greedy and Lazy Match
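The quantifiers `*`, `+`, and `{}` are greedy, so they expand the match as far as they can through the provided text.

Appending `?` to a quantifier (for example `*?` or `+?`) makes it lazy, so it matches as few characters as possible. This regex contains no lazy quantifiers: every `?` in it is the 0-or-1 quantifier, which is itself greedy.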
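
The difference shows up when the same set is repeated greedily and lazily before a closing slash. This is a sketch assuming Python's `re` module. The sample path and the two small patterns are illustrative and are not part of the URL regex itself.

```python
import re

path = "/docs/guide/index.html"  # hypothetical sample path

# Greedy: the set expands as far as it can, then backs off to the last "/".
greedy = re.match(r"[\/\w \.-]*\/", path).group(0)

# Lazy: the set takes as little as possible, so the first "/" ends the match.
lazy = re.match(r"[\/\w \.-]*?\/", path).group(0)

print(greedy)
print(lazy)
```

The greedy version consumes "/docs/guide/", while the lazy version stops after the very first "/".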

## Author

Makayla Wilson
https://github.com/kaylaanngrace