@InnaFedorenko
Last active July 19, 2023 18:45
This is a regex tutorial from IF.

Regex Tutorial

As a web developer, you'll frequently encounter situations where you need to search, extract, or manipulate specific patterns in text data. Regular expressions (regex) provide a powerful and flexible solution for tackling such tasks. Regex allows you to define search patterns using a combination of symbols, characters, and operators, giving you the ability to match complex patterns with ease.

Summary

A regular expression (regex) is a pattern that describes a set of strings. It gives you a concise way to search, match, and replace text. Regex features vary slightly between programming languages and tools, but the ones below are common to most of them.

Anchors

Anchors are used to match positions in the input string rather than matching characters. They specify the location of a pattern in the text.

Examples of anchors:

  • ^ - The start anchor, matches the beginning of the string.
  • $ - The end anchor, matches the end of the string.
  • \b - The word boundary anchor, matches the position between a word character and a non-word character, or between a word character and the start or end of the string.
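
The three anchors can be tried out with a short JavaScript sketch. The sample strings and variable names below are made up for illustration.

```javascript
// Hypothetical patterns built around the literal text "cat".
const startsWithCat = /^cat/;   // "cat" must begin the string
const endsWithCat = /cat$/;     // "cat" must end the string
const wholeWordCat = /\bcat\b/; // "cat" must stand alone as a word

const anchorResults = {
  start: startsWithCat.test("catalog"),       // true
  startMiss: startsWithCat.test("bobcat"),    // false
  end: endsWithCat.test("bobcat"),            // true
  word: wholeWordCat.test("the cat sat"),     // true
  wordMiss: wholeWordCat.test("concatenate"), // false: letters on both sides
};

console.log(anchorResults);
```

None of these patterns consumes extra characters: an anchor only asserts where the match may begin or end.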

Quantifiers

Quantifiers control the number of times a character or a group can appear in the input.

Examples of quantifiers:

  • * - Matches zero or more occurrences of the preceding character or group.
  • + - Matches one or more occurrences of the preceding character or group.
  • ? - Matches zero or one occurrence of the preceding character or group.
  • {n} - Matches exactly 'n' occurrences of the preceding character or group.
  • {n,} - Matches 'n' or more occurrences of the preceding character or group.
  • {n,m} - Matches between 'n' and 'm' occurrences of the preceding character or group.
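
As a rough illustration, the sketch below applies each quantifier to the letter "a" inside a made-up pattern and lists which sample strings match. The `samples` array and the `matching` helper are hypothetical names used only here.

```javascript
// Sample strings with zero, one, two, and three "a" characters.
const samples = ["ct", "cat", "caat", "caaat"];

// Return the samples that a given pattern matches in full.
const matching = (re) => samples.filter((s) => re.test(s));

const quantifierResults = {
  star: matching(/^ca*t$/),         // ["ct", "cat", "caat", "caaat"]
  plus: matching(/^ca+t$/),         // ["cat", "caat", "caaat"]
  optional: matching(/^ca?t$/),     // ["ct", "cat"]
  exactlyTwo: matching(/^ca{2}t$/), // ["caat"]
  twoOrMore: matching(/^ca{2,}t$/), // ["caat", "caaat"]
  oneToTwo: matching(/^ca{1,2}t$/), // ["cat", "caat"]
};

console.log(quantifierResults);
```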

Grouping Constructs

Grouping constructs are used to create subexpressions within a regex pattern. They are enclosed within parentheses.

Examples of grouping constructs:

  • () - Creates a capturing group to extract the matched substring.
  • (?:) - Creates a non-capturing group for grouping without extraction.
  • (?<name>) - Creates a named capturing group for easy reference.
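
A small JavaScript sketch of the three group types follows. The date and version formats are assumptions picked for the example, not part of any standard this tutorial depends on.

```javascript
// Named capturing groups: each part of the date can be read by name.
const isoDate = /^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})$/;
const dateMatch = "2023-07-19".match(isoDate);
console.log(dateMatch.groups.year);  // "2023"
console.log(dateMatch.groups.month); // "07"

// One capturing group (major version) and one non-capturing group
// that only bundles "\.\d+" so the * quantifier can repeat it.
const version = /^v(\d+)(?:\.\d+)*$/;
const versionMatch = "v12.4.1".match(version);
console.log(versionMatch[1]);     // "12"
console.log(versionMatch.length); // 2: the full match plus one captured group
```

The non-capturing group keeps the result array short: only the groups you actually want to extract take up a slot.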

Bracket Expressions

Bracket expressions, also known as character classes, are used to match any single character from a set of characters.

Examples of bracket expressions:

  • [abc] - Matches any single character 'a', 'b', or 'c'.
  • [0-9] - Matches any single digit.
  • [^a-z] - Negated character class, matches any single character that is not a lowercase letter from a to z.
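
To see the difference between the three bracket expressions, the sketch below runs each one over the same made-up string with the g flag so that every matching character is collected.

```javascript
// Hypothetical input string for illustration.
const label = "Gate C7b";

const abcChars = label.match(/[abc]/g);     // ["a", "b"]  ("C" is uppercase, so it is skipped)
const digits = label.match(/[0-9]/g);       // ["7"]
const notLowercase = label.match(/[^a-z]/g) // ["G", " ", "C", "7"]
  .join("");                                // "G C7"

console.log(abcChars, digits, notLowercase);
```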

Character Classes

Character classes are shortcuts for commonly used sets of characters.

Examples of character classes:

  • \d - Matches any digit (equivalent to [0-9]).
  • \w - Matches any word character (alphanumeric character plus underscore).
  • \s - Matches any whitespace character.
  • \D - Matches any non-digit character.
  • \W - Matches any non-word character.
  • \S - Matches any non-whitespace character.
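
Each shorthand class and its uppercase negation can be compared side by side. The input string in this sketch is invented for the example.

```javascript
// Hypothetical input: an identifier, a space, and a word.
const entry = "id_42 ok";

const classResults = {
  digit: entry.match(/\d/g),     // ["4", "2"]
  word: entry.match(/\w+/g),     // ["id_42", "ok"]  (underscore counts as a word character)
  space: entry.match(/\s/g),     // [" "]
  nonDigit: entry.match(/\D+/g), // ["id_", " ok"]
  nonWord: entry.match(/\W/g),   // [" "]
  nonSpace: entry.match(/\S+/g), // ["id_42", "ok"]
};

console.log(classResults);
```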

The OR Operator

The OR operator allows you to match one expression or another.

Example:

  • apple|orange - Matches either "apple" or "orange".
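
One detail worth a sketch: | has the lowest precedence of any regex operator, so anchors written next to it apply to one alternative only unless the alternatives are grouped. The variable names and test strings below are made up.

```javascript
const fruit = /apple|orange/;
console.log(fruit.test("an orange a day")); // true
console.log(fruit.test("banana"));          // false

// Without a group this reads as: "^apple" OR "orange$".
const looseFruit = /^apple|orange$/;
// With a non-capturing group both anchors apply to both alternatives.
const strictFruit = /^(?:apple|orange)$/;

console.log(looseFruit.test("apple pie"));  // true: the string starts with "apple"
console.log(strictFruit.test("apple pie")); // false: the whole string must be one fruit
console.log(strictFruit.test("orange"));    // true
```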

Flags

Flags are used to modify the behavior of a regex pattern.

Examples of flags:

  • g - Global flag, matches all occurrences.
  • i - Case-insensitive flag, ignores letter case.
  • m - Multi-line flag, changes behavior of ^ and $ to match the start and end of lines.
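
The effect of each flag is easiest to see by running the same pattern with different flag combinations. This sketch uses JavaScript's String.prototype.match and an invented three-line string.

```javascript
// Hypothetical three-line input.
const poem = "Rose\nrose\nROSE bush";

const exactCase = poem.match(/rose/g);       // ["rose"]
const anyCase = poem.match(/rose/gi);        // ["Rose", "rose", "ROSE"]
const wholeLines = poem.match(/^rose$/gim);  // ["Rose", "rose"]
const withoutM = poem.match(/^rose$/gi);     // null: ^ and $ only see the whole string

console.log(exactCase, anyCase, wholeLines, withoutM);
```

The third line, "ROSE bush", is not in `wholeLines` because `$` requires the line to end right after "rose".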

Character Escapes

Character escapes are used to match characters with special meanings in regex.

Examples of character escapes:

  • \. - Matches a literal period (dot).
  • \\ - Matches a literal backslash.
  • \t - Matches a tab character.
  • \n - Matches a newline character.
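
The sketch below contrasts an escaped dot with an unescaped one and shows the other escapes in use. The file name and path are hypothetical.

```javascript
// An unescaped dot matches any character; an escaped dot matches only ".".
const anyChar = /^file.txt$/;
const literalDot = /^file\.txt$/;
console.log(anyChar.test("file-txt"));    // true
console.log(literalDot.test("file-txt")); // false
console.log(literalDot.test("file.txt")); // true

// The string literal "C:\\temp" holds a single backslash character.
const hasBackslash = /\\/.test("C:\\temp");
console.log(hasBackslash); // true

// Split on tab or newline characters.
const fields = "a\tb\nc".split(/[\t\n]/);
console.log(fields); // ["a", "b", "c"]
```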

Usage

This section applies the features above to four common tasks: matching a hex color value, an email address, a URL, and an HTML tag. Each pattern is followed by example strings and whether the pattern matches them.

  • Matching a Hex Value: /^#?([a-f0-9]{6}|[a-f0-9]{3})$/ This regex pattern matches both 6-digit and 3-digit hexadecimal color values, with or without the leading "#" symbol. Here are some examples:
    "#1a2b3c" - Valid hex value with leading "#".
    "1a2b3c" - Valid hex value without leading "#".
    "#abc" - Valid short-form hex value with leading "#".
    "abc" - Valid short-form hex value without leading "#".
    "#GHIJKL" - Invalid hex value, as it contains characters other than a-f and 0-9.
    "#12345" - Invalid hex value, as it does not have the correct number of digits.
  • Matching an Email: /^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})$/ This regex pattern checks that an email address follows the typical name@domain.tld format. It is a loose check and accepts some addresses that are not actually valid. Some examples:
    "example@example.com" - Valid email address.
    "name.lastname@example.co.uk" - Valid email address with a multi-level domain.
    "user@domain" - Invalid email address, as it does not have a top-level domain.
    "user..name@example.com" - Accepted by this pattern, although consecutive dots in the username make the address invalid; the pattern does not check for them.
    "user@-domain.com" - Accepted by this pattern, although a domain may not start with a hyphen; the pattern does not check for this either.
  • Matching a URL: /^(https?:\/\/)?([\da-z.-]+)\.([a-z.]{2,6})([\/\w .-]*)\/?$/ This regex pattern matches URLs with an HTTP or HTTPS protocol, and also URLs with no protocol at all, since the protocol group is optional. Examples:
    "https://www.example.com" - Valid URL with HTTPS protocol.
    "http://sub.example.co.uk/page" - Valid URL with HTTP and multi-level domain.
    "www.example.com" - Valid URL without a protocol, as the protocol is optional.
    "http://domain.123" - Invalid URL, as the top-level domain must consist of two to six letters.
  • Matching an HTML Tag: /^<([a-z]+)([^<]*)(?:>(.*)<\/\1>|\s+\/>)$/ This regex pattern matches HTML tags, both self-closing and paired. It checks only the shape of the tag, not whether the element name exists in HTML. Examples:
    "<div>Some text here</div>" - Valid paired HTML tag with content.
    "<br />" - Valid self-closing HTML tag.
    "<p class='text'>Paragraph</p>" - Valid paired HTML tag with attributes.
    "<invalid>" - Invalid HTML tag, as it has no closing tag and is not self-closing.
    "<p>Unclosed paragraph tag" - Invalid HTML tag, as it's not properly closed.

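The four patterns above can be checked against their example strings with a short JavaScript sketch. It assumes the forward slashes and literal dots in the patterns are escaped and that the path and attribute groups carry a * quantifier; the names `patterns` and `check` are made up for the example.

```javascript
// Patterns from this section, written as JavaScript regex literals.
// Assumption: "/" and the literal "." before the top-level domain are
// escaped, and the path/attribute groups use "*" so they may be empty.
const patterns = {
  hex: /^#?([a-f0-9]{6}|[a-f0-9]{3})$/,
  email: /^([a-z0-9_.-]+)@([\da-z.-]+)\.([a-z.]{2,6})$/,
  url: /^(https?:\/\/)?([\da-z.-]+)\.([a-z.]{2,6})([\/\w .-]*)\/?$/,
  htmlTag: /^<([a-z]+)([^<]*)(?:>(.*)<\/\1>|\s+\/>)$/,
};

// Returns true when the whole input fits the named pattern.
const check = (name, input) => patterns[name].test(input);

console.log(check("hex", "#1a2b3c"));  // true
console.log(check("hex", "#12345"));   // false: five digits
console.log(check("hex", "#1A2B3C"));  // false: uppercase needs the i flag

console.log(check("email", "name.lastname@example.co.uk")); // true
console.log(check("email", "user@domain"));                 // false
console.log(check("email", "user..name@example.com"));      // true: the pattern is loose

console.log(check("url", "http://sub.example.co.uk/page")); // true
console.log(check("url", "www.example.com"));               // true: protocol is optional
console.log(check("url", "http://domain.123"));             // false

console.log(check("htmlTag", "<br />"));                    // true
console.log(check("htmlTag", "<p>Unclosed paragraph tag")); // false

// Capturing groups expose the tag name, attributes, and content.
const tagParts = "<p class='text'>Paragraph</p>".match(patterns.htmlTag);
console.log(tagParts[1], tagParts[3]); // "p" "Paragraph"
```

Patterns like these are convenient for quick checks, but they are not a substitute for a real email validator, URL parser, or HTML parser.
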
Author

This tutorial was written by Inna Fedorenko, a passionate web development student. You can find more of their work on GitHub.
