@christiangella
Last active January 23, 2023 19:51
Regex Validation of Hex Codes

This GitHub Gist covers how a regular expression (regex) validates hexadecimal color codes.

A regex is a sequence of characters that defines a search pattern. Regexes are commonly used to find and modify specific substrings, to condense otherwise lengthy string-checking code into a single expression, and to validate data against a set of rules. In the case of hex codes, the regex confirms that a string has the shape of a valid code: an optional pound symbol followed by three or six hexadecimal digits.

Hexadecimal is a numeral system with a base of sixteen (base-16). Its digits are 0-9 (0,1,2,3,4,5,6,7,8,9), which represent the values 0-9, and A-F (A,B,C,D,E,F), which represent the values 10-15. One application of hexadecimal values is the hex color code: a six-digit hexadecimal value that maps to Red-Green-Blue (hereafter: RGB) values. A hex color code, therefore, is a hexadecimal expression of the amounts of red, green, and blue in a color. The following are examples of hex codes:

#ffe240
#ffffff, or #fff
#000000, or #000

The hex code #ffe240 is a hexadecimal expression of the RGB values (255,226,64). Each hex code can be thought of as a concatenation of three separate two-digit values, one for each of the red, green, and blue channels.

The values #000000 and #ffffff express the two ends of the RGB spectrum as black and white respectively, representing the lowest and highest values. Because every two-digit value in these codes repeats the same digit, the codes can be shortened to #000 and #fff.
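The shorthand rule can be sketched in JavaScript, the language whose regex flags this Gist discusses later. The function name expandShorthand is hypothetical and not part of the original Gist, and the sketch assumes its input is already a well-formed three- or six-digit code.

```javascript
// Hypothetical helper: expands a three-digit shorthand into its six-digit form.
// Assumes the input is already a well-formed hex code.
function expandShorthand(hex) {
  const digits = hex.replace(/^#/, ""); // drop the optional leading #
  if (digits.length !== 3) {
    return "#" + digits; // already in the long form
  }
  // Double each digit: "f" becomes "ff", "0" becomes "00", and so on.
  return "#" + [...digits].map((d) => d + d).join("");
}

console.log(expandShorthand("#fff")); // #ffffff
console.log(expandShorthand("#000")); // #000000
```

Only codes whose three channels each repeat a digit can be written in the short form, which is why #ffe240 has no three-digit equivalent.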

The values mapped to each RGB value are calculated for each two-digit value: ({valueOne}×16)+({valueTwo}×1). As such, in examining the hex code #ffe240:

  • ff is a consolidation of (15×16)+(15×1), resulting in the red value of 255.
  • e2 is a consolidation of (14×16)+(2×1), resulting in the green value of 226.
  • 40 is a consolidation of (4×16)+(0×1), resulting in the blue value of 64.
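The arithmetic above can be checked with a short JavaScript sketch. The function name hexToRgb is an assumption made for illustration, and the code assumes a well-formed six-digit input rather than validating it.

```javascript
// Hypothetical helper: converts a six-digit hex code into [red, green, blue].
// Assumes the input is a well-formed six-digit code.
function hexToRgb(hex) {
  const digits = hex.replace(/^#/, ""); // drop the optional leading #
  const channels = [];
  for (let i = 0; i < 6; i += 2) {
    const valueOne = parseInt(digits[i], 16);     // first digit, multiplied by 16
    const valueTwo = parseInt(digits[i + 1], 16); // second digit, multiplied by 1
    channels.push(valueOne * 16 + valueTwo * 1);
  }
  return channels;
}

console.log(hexToRgb("#ffe240")); // [ 255, 226, 64 ]
```

In practice parseInt("ff", 16) converts a whole two-digit pair in one call; the digits are split here only to mirror the formula.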

Summary

The regular expression that validates hexadecimal codes is as follows: /^#?([a-f0-9]{6}|[a-f0-9]{3})$/

When this regex takes in a string, it determines whether the string meets the requirements (an optional leading #, followed by three or six digits drawn from 0-9 and a-f) to validate as a hex code.
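As a minimal sketch of how the regex might be used in JavaScript, the pattern can be applied with RegExp.prototype.test, which returns true or false. The names hexPattern and isHexCode are hypothetical.

```javascript
const hexPattern = /^#?([a-f0-9]{6}|[a-f0-9]{3})$/;

// Hypothetical wrapper: returns true when the string validates as a hex code.
function isHexCode(value) {
  return hexPattern.test(value);
}

console.log(isHexCode("#ffe240")); // true
console.log(isHexCode("#fff"));    // true
console.log(isHexCode("#ffe24"));  // false (five digits)
```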

The rest of this GitHub Gist dissects the regex into its individual parts to show how it validates the hexadecimal notation used to store colors.

Regex Components

The regular expression /^#?([a-f0-9]{6}|[a-f0-9]{3})$/ is composed of the following components:

Anchors

The caret (^) and the dollar sign ($) are anchors. They do not match any characters themselves; instead, they pin the pattern to the start and the end of the string respectively.

Individually, the caret (^) and dollar sign ($) serve the following functions:

  • The caret ^ requires that whatever follows it appear at the beginning of the string. For example, ^dog matches only if the string starts with dog, and finds no match otherwise.
  • Conversely, the dollar sign $ requires that whatever precedes it appear at the end of the string. For example, cat$ matches only if the string ends with cat, and finds no match otherwise.

By using both the caret and the dollar sign, the regex ensures that the entire string, from its first character to its last, must meet the criteria of /^#?([a-f0-9]{6}|[a-f0-9]{3})$/. No extra characters are allowed before or after the hex code.
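The effect of the anchors can be seen in a short JavaScript sketch. The variable names anchored and unanchored are hypothetical, and the unanchored variant is shown only for comparison.

```javascript
// Anchors on their own.
console.log(/^dog/.test("doghouse")); // true
console.log(/^dog/.test("hotdog"));   // false: dog is not at the start
console.log(/cat$/.test("bobcat"));   // true
console.log(/cat$/.test("catnip"));   // false: cat is not at the end

// The hex regex with and without its anchors.
const anchored = /^#?([a-f0-9]{6}|[a-f0-9]{3})$/;
const unanchored = /#?([a-f0-9]{6}|[a-f0-9]{3})/;

console.log(unanchored.test("color: #ffe240;")); // true: a hex code appears somewhere inside
console.log(anchored.test("color: #ffe240;"));   // false: the whole string is not a hex code
console.log(anchored.test("#ffe240"));           // true
```

Without the anchors the pattern would only find a hex code inside a string; with them it validates the string as a whole.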

Quantifiers

There are three quantifiers in this regular expression. A quantifier specifies how many times the item before it must occur. The quantifiers here are the question mark (?) and the numeric quantifier ({n}), which appears twice as {6} and {3}.

  • The question mark (?) is the optional quantifier. It follows a character or group and matches it zero or one time, so the item may be present but is not required. For example, fulfill? matches both fulfil and fulfill, because the final l is optional. Likewise, O?kiku, used with the case-insensitive flag i, accepts a Japanese name with or without the honorific o- prefix (i.e. Okiku and Kiku).
  • The numeric quantifier {n} matches the preceding item exactly n times. Like the question mark, it applies to the character, bracket expression, or grouping construct that immediately precedes it. For example, when applied to the digit metacharacter as \d{5}, the quantifier matches exactly five consecutive digits, such as 11111.

In /^#?([a-f0-9]{6}|[a-f0-9]{3})$/, the question mark (?) makes the pound symbol (#) before the hex code optional, meaning that both #ffffff and ffffff will be accepted. The quantifiers {6} and {3} ensure that the code itself consists of exactly six or exactly three characters.
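The two kinds of quantifier can be tried out in isolation. This is a small JavaScript sketch; the variable names are hypothetical, and the patterns are anchored with ^ and $ so that each test checks the whole string.

```javascript
// The optional quantifier: the last l may appear zero or one time.
const fulfil = /^fulfill?$/;
console.log(fulfil.test("fulfil"));   // true
console.log(fulfil.test("fulfill"));  // true
console.log(fulfil.test("fulfilll")); // false: two extra l characters

// The numeric quantifier: exactly five digits.
const fiveDigits = /^\d{5}$/;
console.log(fiveDigits.test("11111"));  // true
console.log(fiveDigits.test("1111"));   // false
console.log(fiveDigits.test("111111")); // false

// The optional pound symbol, using the three-digit form only.
const shortHex = /^#?[a-f0-9]{3}$/;
console.log(shortHex.test("#fff"));  // true
console.log(shortHex.test("fff"));   // true
console.log(shortHex.test("##fff")); // false: at most one pound symbol
```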

Grouping Constructs

There is one grouping construct, indicated by the single set of parentheses (()), that establishes a subexpression within the regex. A grouping construct has multiple functions: it treats everything inside it as a single unit, it limits the reach of operators such as the OR operator (|), and it captures the text it matched for later use. Because there are two acceptable formats for a hex code, the grouping construct houses the logic for both.

Grouping constructs also keep separate parts of an expression from interfering with one another. Should the regex drop its parentheses and be written as /^#?[a-f0-9]{6}|[a-f0-9]{3}$/, the OR operator would split the entire pattern in two: ^#?[a-f0-9]{6} or [a-f0-9]{3}$. A value such as #ffff would then be accepted, because its last three characters satisfy the second half. The grouping construct confines the OR operator so that the anchors and the optional pound symbol apply to both alternatives.

The parentheses in /^#?([a-f0-9]{6}|[a-f0-9]{3})$/ house the two acceptable subexpressions, which are joined by an OR operator (|): ([a-f0-9]{6}|[a-f0-9]{3}). This is also where both numeric quantifiers are used.
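The consequence of dropping the parentheses can be checked directly. In this JavaScript sketch the names grouped and ungrouped are hypothetical, and the ungrouped pattern is the broken variant, shown only for comparison.

```javascript
const grouped = /^#?([a-f0-9]{6}|[a-f0-9]{3})$/;
const ungrouped = /^#?[a-f0-9]{6}|[a-f0-9]{3}$/; // broken: | splits the whole pattern

console.log(grouped.test("#ffff"));        // false: four digits is neither 3 nor 6
console.log(ungrouped.test("#ffff"));      // true: the last three characters satisfy [a-f0-9]{3}$
console.log(ungrouped.test("#ffffffzzz")); // true: the left alternative has no end anchor

// The group also captures the digits it matched, without the pound symbol.
console.log(grouped.exec("#ffe240")[1]);   // ffe240
```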

Bracket Expressions and Character Classes

The bracket expression ([]) defines the set of characters that is accepted at a single position in the string. Unlike a grouping construct, which wraps a whole subexpression, a bracket expression matches exactly one character, and a quantifier after it controls how many such characters are required. Inside the brackets, a hyphen between two characters denotes a range.

  • A bracket expression of [x-z] matches one character that is x, y, or z. Bracket expressions can also include non-alphabetical characters: [.!*] matches one character that is a period, an exclamation mark, or an asterisk.

The regular expression /^#?([a-f0-9]{6}|[a-f0-9]{3})$/ uses the same bracket expression twice: [a-f0-9]. It combines the ranges a-f and 0-9, so the accepted characters are a, b, c, d, e, f, 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. Any other character in the string causes the match to fail.
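A brief JavaScript sketch of these bracket expressions follows. The variable names are hypothetical, and each pattern is anchored so that it tests a single character.

```javascript
const xyz = /^[x-z]$/;          // a range: x, y, or z
const punctuation = /^[.!*]$/;  // . and * are literal inside brackets
const hexDigit = /^[a-f0-9]$/;  // two ranges combined

console.log(xyz.test("y"));         // true
console.log(xyz.test("w"));         // false
console.log(punctuation.test("!")); // true
console.log(punctuation.test("?")); // false
console.log(hexDigit.test("e"));    // true
console.log(hexDigit.test("g"));    // false: g is outside a-f
```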

The OR Operator

An OR operator (|) separates alternatives: the regex matches if any one of them matches. This enables the regular expression to accept more than one valid form.

  • An OR operator can be used to validate multiple types of data. For example, (http|https) matches both the unsecured http and the secure https.

In this regex, the OR operator separates the two valid forms of a hex code. As stated earlier, a hex code has a shorthand form when each of the three two-digit RGB values repeats the same digit. Both [a-f0-9]{6} and [a-f0-9]{3}, therefore, are acceptable matches, and the OR operator allows either one to satisfy the regular expression.
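The alternation can be illustrated with a short JavaScript sketch. The names scheme and hexBody are hypothetical, and hexBody leaves out the optional pound symbol to focus on the two alternatives.

```javascript
const scheme = /^(http|https)$/;
console.log(scheme.test("http"));  // true
console.log(scheme.test("https")); // true
console.log(scheme.test("ftp"));   // false

// Either six or three hex digits, nothing in between.
const hexBody = /^([a-f0-9]{6}|[a-f0-9]{3})$/;
console.log(hexBody.test("ffe240")); // true: matches the six-digit alternative
console.log(hexBody.test("fff"));    // true: matches the three-digit alternative
console.log(hexBody.test("ffff"));   // false: matches neither
```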

Flags

JavaScript regular expressions accept optional flags, including i, g, m, s, u, and y (recent versions of the language add d and v as well). Each flag changes how the pattern is applied. Most notably, i makes the regular expression ignore the distinction between uppercase and lowercase letters.

The regex /^#?([a-f0-9]{6}|[a-f0-9]{3})$/ does not use any flags. Hex codes, however, may be written with capital letters, and as written this regex will only accept lowercase hex codes. There are two ways around this. Adding the i flag to the end of the regex makes it accept capital letters. Alternatively, the capital letters can be added to the bracket expressions themselves, as [a-fA-F0-9].

Flags enable specific behaviors, but a hex code uses nothing beyond its set of sixteen characters. The regex /^#?([a-f0-9]{6}|[a-f0-9]{3})$/ is therefore economical in that it needs no flags, at the cost of rejecting uppercase codes.
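The two ways of accepting capital letters can be compared side by side. In this JavaScript sketch the variable names are hypothetical, and the two modified patterns are variations on the Gist's regex rather than part of it.

```javascript
const lowercaseOnly = /^#?([a-f0-9]{6}|[a-f0-9]{3})$/;      // the regex as written
const withFlag = /^#?([a-f0-9]{6}|[a-f0-9]{3})$/i;          // adds the i flag
const withClass = /^#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})$/;    // widens the bracket expression

console.log(lowercaseOnly.test("#FFF")); // false
console.log(withFlag.test("#FFF"));      // true
console.log(withClass.test("#FFF"));     // true
console.log(withClass.test("#FfE240"));  // true: mixed case is accepted too
```

Both variants accept the same set of strings; the flag is shorter to write, while the widened bracket expression keeps the accepted characters visible in the pattern itself.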

Regex Methodology

The regex's components together define its ability to validate hex codes. To break down the methodology of /^#?([a-f0-9]{6}|[a-f0-9]{3})$/:

  • ^ and $ anchor the pattern to the beginning and end of the string.
  • The question mark (?) makes the pound symbol (#) at the beginning of the string optional. The string may start with a pound symbol, but with no more than one.
  • The grouping construct (()) houses the two subexpressions.
    • The bracket expressions ([]) list the accepted characters, [a-f0-9].
    • Each bracket expression is followed by a numeric quantifier ({n}), either {6} or {3}. This means that the code after the optional pound symbol must be exactly six or exactly three characters long.
    • The OR operator (|) allows either of the two subexpressions, the six-character form or the three-character form, to validate.

The following codes will be passed through the regex:

  • String: ffffff : Accepted.
    • The string passes all the criteria. It does not possess a pound symbol, which is optional, and it consists of six characters that are within the accepted character classes.
  • String: #fff : Accepted.
    • The string passes all the criteria. It does possess a pound symbol, and it consists of three characters that are within the accepted character classes.
  • String: #FFF : Rejected.
    • The string does not meet the criteria. While it has a pound symbol and three characters, those characters are capital letters, which are not accepted by this particular regex.
  • String: @#ggggg : Rejected.
    • The string does not meet the criteria. It starts with @, which is not allowed before the pound symbol; the five characters after the pound symbol number neither three nor six; and g is outside the accepted character classes.
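The walkthrough above can be reproduced with a small JavaScript sketch. The explain function is a hypothetical helper written for this illustration; it reports only the first problem it finds, so a string with several problems, such as @#ggggg, yields a single reason.

```javascript
const hexPattern = /^#?([a-f0-9]{6}|[a-f0-9]{3})$/;

// Hypothetical helper: says whether a string is accepted and, if not, why.
function explain(value) {
  if (hexPattern.test(value)) {
    return "accepted";
  }
  const body = value.replace(/^#/, ""); // strip the one optional pound symbol
  if (/[^a-f0-9]/.test(body)) {
    return "rejected: character outside [a-f0-9]";
  }
  // Every character is valid, so the only remaining problem is the length.
  return "rejected: length is not 3 or 6";
}

for (const candidate of ["ffffff", "#fff", "#FFF", "@#ggggg"]) {
  console.log(candidate, "->", explain(candidate));
}
```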

Conclusion

This GitHub Gist explains the expression /^#?([a-f0-9]{6}|[a-f0-9]{3})$/ by dissecting its individual parts. The expression ensures that the strings passed through it meet the specific criteria of a hex code.

Author

Manong Chris (they/he) is a web developer and digital illustrator. Click here to access their GitHub.
