@cjaoude
Last active April 4, 2024 18:17
Test list of Valid and Invalid Email addresses
Use: for testing against email regex
ref: http://codefool.tumblr.com/post/15288874550/list-of-valid-and-invalid-email-addresses
List of Valid Email Addresses
email@example.com
firstname.lastname@example.com
email@subdomain.example.com
firstname+lastname@example.com
email@123.123.123.123
email@[123.123.123.123]
"email"@example.com
1234567890@example.com
email@example-one.com
_______@example.com
email@example.name
email@example.museum
email@example.co.jp
firstname-lastname@example.com
List of Strange Valid Email Addresses
much."more\ unusual"@example.com
very.unusual."@".unusual.com@example.com
very."(),:;<>[]".VERY."very@\\ "very".unusual@strange.example.com
List of Invalid Email Addresses
plainaddress
#@%^%#$@#$@#.com
@example.com
Joe Smith <email@example.com>
email.example.com
email@example@example.com
.email@example.com
email.@example.com
email..email@example.com
あいうえお@example.com
email@example.com (Joe Smith)
email@example
email@-example.com
email@example.web
email@111.222.333.44444
email@example..com
Abc..123@example.com
List of Strange Invalid Email Addresses
"(),:;<>[\]@example.com
just"not"right@example.com
this\ is"really"not\allowed@example.com
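
A minimal sketch of how a list like this might be run against a validator. The `isValidEmail` regex here is a deliberately naive placeholder (an assumption for illustration, not something the gist recommends), and only a handful of addresses from the lists are included. `findMismatches` is a hypothetical helper name.

```javascript
// Placeholder validator: "something@something.something" with no spaces.
// This is an assumption for the demo; swap in the regex you want to test.
const isValidEmail = (s) => /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(s);

// A small sample taken from the valid and invalid lists above.
const cases = [
  { address: "email@example.com", expected: true },
  { address: "firstname+lastname@example.com", expected: true },
  { address: "plainaddress", expected: false },
  { address: "email@example@example.com", expected: false },
  { address: "email..email@example.com", expected: false },
];

// Returns the cases where the validator disagrees with the list.
function findMismatches(validator, testCases) {
  return testCases.filter(
    ({ address, expected }) => validator(address) !== expected
  );
}

const mismatches = findMismatches(isValidEmail, cases);
for (const { address, expected } of mismatches) {
  console.log(`${address}: list says ${expected ? "valid" : "invalid"}`);
}
```

With this placeholder regex the only disagreement in the sample is `email..email@example.com`, which the naive pattern accepts even though the list marks it invalid.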
@ReedRodgers
Copy link

ok

@IBUPltAppIconfont
Copy link

image

@allanvobraun
Copy link

ok

@TokerX
Copy link

TokerX commented Oct 12, 2021

ok

But
much.”more\ unusual”@example.com and very.”(),:;<>[]”.VERY.”very@\ "very”.unusual@strange.example.com can't be valid. Within a quoted string backslashes and quotes still need to be escaped with a backslash. I just noticed that they will be removed when posting though lol, so they probably were typed correctly. Anyway, so the first one needs 2 backslashes after more and the second one needs to have two backslashes after very@ and a backslash before "very".unusual.
For the same reason just”not”right@example.com probably turned from an invalid one into a valid one lol.
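
A rough sketch of the escaping rule described here, assuming the RFC 5321 `Quoted-string` grammar: inside the quotes, a backslash or double quote may only appear as part of a backslash-escaped pair. `isQuotedString` is a hypothetical helper name, and it only looks at the local part.

```javascript
// qtextSMTP: printable ASCII except '"' (0x22) and '\' (0x5C).
// quoted-pairSMTP: a backslash followed by any printable ASCII character.
const QUOTED_STRING = /^"(?:[\x20\x21\x23-\x5B\x5D-\x7E]|\\[\x20-\x7E])*"$/;

// Checks whether a local part is a well-formed quoted string.
function isQuotedString(localPart) {
  return QUOTED_STRING.test(localPart);
}

console.log(isQuotedString('"email"'));       // plain quoted string
console.log(isQuotedString('"very\\"odd"'));  // quote escaped with a backslash
console.log(isQuotedString('"very"odd"'));    // unescaped quote inside
```

The first two calls print `true` and the third prints `false`, since the unescaped inner quote ends the string early.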

@kyleishie
Copy link

ok

@juliantejera
Copy link

ok

@jcf-dev
Copy link

jcf-dev commented Nov 17, 2021

ok

@CoolBeans-Dev
Copy link

ok

@Crashingberries
Copy link

ok

@pynner
Copy link

pynner commented Dec 1, 2021

ok

@jlalmes
Copy link

jlalmes commented Dec 3, 2021

ok

@guilhermerodrigues680
Copy link

ok

@Heroco
Copy link

Heroco commented Dec 7, 2021

ok

@mweiss9676
Copy link

ok

@william-lively
Copy link

ok

@GuiRosaAlves
Copy link

ok

@ekscrypto
Copy link

ekscrypto commented Jan 22, 2022

I'll argue that if you list email@example.web as an invalid email address then you should also list email@123.123.123.123 as invalid. As per RFC5321 Section 4.1.2, address literals can only be used if they start with "[" and end with "]", and .123 is not a valid TLD as per the Public Suffix List.

Also, based on that same RFC5321, I would argue that your list of very unusual email addresses are actually all invalid. The Local-part definition clearly indicates that you have either a Dot-string or a Quoted-string, not both and not a combination of both. None of your very unusual emails start with DQUOTE, so they would fall under the Dot-string validation rule, which only allows Atom *("." Atom), with Atom being only 1*atext.
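
The Dot-string rule can be sketched as a regex, assuming the `atext` character set from RFC 5322 (letters, digits and ``!#$%&'*+-/=?^_`{|}~``). `isDotString` is a hypothetical helper name, and the check covers the local part only.

```javascript
// Character class for atext (assumed from RFC 5322 section 3.2.3).
const ATEXT_CLASS = "[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]";

// Dot-string = Atom *("." Atom), where Atom = 1*atext.
const DOT_STRING = new RegExp(`^${ATEXT_CLASS}+(?:\\.${ATEXT_CLASS}+)*$`);

// True when the local part is one or more atoms separated by single dots.
function isDotString(localPart) {
  return DOT_STRING.test(localPart);
}

console.log(isDotString("firstname.lastname"));        // true
console.log(isDotString("email..email"));              // false: empty atom
console.log(isDotString('much."more\\ unusual"'));     // false: quote is not atext
```

Under this rule a local part that mixes dots and quoted pieces fails, because `"` is not in `atext`.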

@bw-varun
Copy link

ok

@julien-amblard
Copy link

ok

@Arifdamar
Copy link

ok

@DogeMastr
Copy link

ok

@danielwoerner
Copy link

ok

@FlorianFlatscher
Copy link

ok

@sunethshehan
Copy link

nice-click-nice

@thurst28
Copy link

thurst28 commented Jul 1, 2022

ok

@AndyJPhillips-lively
Copy link

ok

@romarioschneider
Copy link

ok

@MetaMmodern
Copy link

ok

@micahbf
Copy link

micahbf commented Jul 26, 2022

ok

@KthProg
Copy link

KthProg commented Jul 27, 2022

ok

@jmadelaine
Copy link

ok

@johnd3v
Copy link

johnd3v commented Aug 17, 2022

ok

@ooyendyk
Copy link

ok

@maxpushka
Copy link

ok

@manucabral
Copy link

ok

@JulianBissekkou
Copy link

ok

@stefanschaller
Copy link

ok

@ckpanchal
Copy link

ok

@ndelanou
Copy link

ok

@AlyanQ
Copy link

AlyanQ commented Oct 14, 2022

ok

@Diaz-adrianz
Copy link

ok

@KevzPeter
Copy link

ok

@randrasek-eb
Copy link

ok

@zackpi
Copy link

zackpi commented Nov 16, 2022

ok

@lucastrajtenberg
Copy link

ok

@cizordj
Copy link

cizordj commented Nov 17, 2022

ok

@Manuel-Zapp-Studio
Copy link

ok

@stefanschaller
Copy link

ok

@dhoepfl
Copy link

dhoepfl commented Nov 17, 2022

email@123.123.123.123
email@[123.123.123.123]
these two mail ids are Invalid.

According to RFC5322:

Parsing "email" local-part

local-part = dot-atom
dot-atom = dot-atom-text
dot-atom-text = 1*atext
atext = ALPHA

"email" is a valid local-part

Parsing "123.123.123.123" domain

domain = dot-atom
dot-atom = dot-atom-text
dot-atom-text = 1*atext *("." 1*atext)
atext = DIGIT

"123" is a valid 1*atext, so "123.123.123.123" is a valid domain.

Parsing "[123.123.123.123]" domain

domain = domain-literal
domain-literal = "[" *(dtext) "]"
dtext = %d49 / %d50 / %d51

"123" is a valid *(dtext), so "[123.123.123.123]" is a valid domain.

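The two domain forms walked through above can be sketched as follows. This is a simplification that assumes the RFC 5322 `dot-atom` and `domain-literal` rules and ignores comments and folding whitespace (CFWS/FWS) as well as the obsolete forms. `classifyDomain` is a hypothetical helper name.

```javascript
// atext character class (assumed from RFC 5322 section 3.2.3).
const ATEXT = "[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]";

// dot-atom-text = 1*atext *("." 1*atext)
const DOT_ATOM = new RegExp(`^${ATEXT}+(?:\\.${ATEXT}+)*$`);

// domain-literal = "[" *dtext "]", with dtext being printable ASCII
// except "[", "]" and "\" (whitespace handling omitted in this sketch).
const DOMAIN_LITERAL = /^\[[\x21-\x5A\x5E-\x7E]*\]$/;

// Returns which RFC 5322 domain form matches, or null if neither does.
function classifyDomain(domain) {
  if (DOT_ATOM.test(domain)) return "dot-atom";
  if (DOMAIN_LITERAL.test(domain)) return "domain-literal";
  return null;
}

console.log(classifyDomain("123.123.123.123"));   // "dot-atom"
console.log(classifyDomain("[123.123.123.123]")); // "domain-literal"
console.log(classifyDomain("example..com"));      // null
```

Note that this only checks the syntax of the grammar. It says nothing about whether the domain resolves or whether a mail server would accept it.
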
@dhoepfl
Copy link

dhoepfl commented Nov 17, 2022

"email@example.com (Joe Smith)" is valid according to RFC5322:

(local-part see previous comment)

domain = obs-domain
obs-domain = atom *("." atom)
atom = 1*atext [CFWS]
atext = ALPHA
CFWS = 1*([FWS] comment)
comment = "(" *([FWS] ccontent) ")"
ccontent = ctext
ctext = ... ; US-ASCII

"example" matches as atom
"com (Joe Smith)" matches as atext+CFWS
("com" is 1*atext, " " is FWS, "(Joe Smith)" is comment)

@dhoepfl
Copy link

dhoepfl commented Nov 17, 2022

That being said, I think that "email@123.123.123.123" and "email@example.com (Joe Smith)" are invalid according to RFC5321.
(Sorry, the parser I implemented currently only checks for RFC5322 compliance.)

@jsaxon-cars
Copy link

Where's the regex to validate this list???

@micahbf
Copy link

micahbf commented Nov 23, 2022

@RohanNagar
Copy link

I tested this list with JMail and other Java email address validation libraries: https://www.rohannagar.com/jmail

@cizordj
Copy link

cizordj commented Dec 6, 2022

How many RFCs exist to validate emails? There is another one used by the PHP language to validate emails → rfc822

@ekscrypto
Copy link

ekscrypto commented Dec 10, 2022 via email

@cizordj
Copy link

cizordj commented Dec 11, 2022

@ekscrypto Thanks for answering, this is a really interesting topic for me.

@blueblakk
Copy link

ok

@Neilord
Copy link

Neilord commented Jan 25, 2023

ok

@SuperPauly
Copy link

ok

@PenguinDan
Copy link

ok

@FlorianFlatscher
Copy link

ok

@shavezagicent
Copy link

ok

@Nekromateion
Copy link

ok

@Sighyu
Copy link

Sighyu commented Mar 11, 2023

ok

@RobKenis
Copy link

ok

@martinsotirov
Copy link

hmmm.. ok I guess?

@frattaro
Copy link

frattaro commented Apr 4, 2023

ok

IMO:

Regarding the local part, the original rfc822 says:

... The domain-dependent string is uninterpreted,
except by the final sub-domain; the rest of the mail
service merely transmits it as a literal string.

And that's the way the world works, RFCs don't send emails. A "valid" local part is completely up to the email server hosting the address. The only limitations on that are the arbitrary ones placed there by your email-sending pipeline.

Max-length is interesting: since the internet supports utf8 now, a "character" can be up to 4 bytes. With a limit of 255 bytes in the domain part, and without parsing to punycode, the safest bet is to limit it to 63 characters. That way you don't have to worry about subdomains being more than 63 characters either, since they need to fit the whole thing in there.
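
To make the character-versus-byte point concrete, here is a small sketch that measures the UTF-8 byte length of a string using the standard `TextEncoder` API. `utf8ByteLength` is a hypothetical helper name.

```javascript
const encoder = new TextEncoder(); // always encodes to UTF-8

// Number of bytes the string occupies when encoded as UTF-8.
function utf8ByteLength(s) {
  return encoder.encode(s).length;
}

console.log(utf8ByteLength("example.com")); // 11: ASCII is 1 byte per character
console.log(utf8ByteLength("あいうえお"));   // 15: 3 bytes per character
console.log(utf8ByteLength("😀"));          // 4: the 4-byte worst case
```

At 4 bytes per character, 63 characters come to at most 252 bytes.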

So your final validation should:

  1. split at the final @-symbol
  2. local part between 1 and 16000 characters (DB varchar(max) is 65535 bytes, assume all unicode and leave room for domain)
  3. domain part between 4 and 63 characters
  4. subdomains can't start or end with "-"
  5. domain part matches one of: DNS, ipv4 or ipv6 regex
// These are the regexes I landed on for javascript:

// The following letter sets are added because wikipedia insists
// they're valid email addresses, so they should be included in \p{L} but aren't:
// Hindi character set: \u0900-\u097F
// Kannada character set: \u0C80-\u0CFF
const domainRegex =
  /^((?!-)[\p{L}\p{N}\u0900-\u097F\u0C80-\u0CFF-]+(?<!-)\.)+[\p{L}\u0900-\u097F\u0C80-\u0CFF]{2,}$/iu;
const ipv4Regex =
  /^\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\]$/;
const ipv6Regex =
  /^\[ipv6:(([0-9a-fA-F]{1,4}:){7,7}[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,7}:|([0-9a-fA-F]{1,4}:){1,6}:[0-9a-fA-F]{1,4}|([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}|([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}|([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}|([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}|[0-9a-fA-F]{1,4}:((:[0-9a-fA-F]{1,4}){1,6})|:((:[0-9a-fA-F]{1,4}){1,7}|:)|fe80:(:[0-9a-fA-F]{0,4}){0,4}%[0-9a-zA-Z]{1,}|::(ffff(:0{1,4}){0,1}:){0,1}((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])|([0-9a-fA-F]{1,4}:){1,4}:((25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9])\.){3,3}(25[0-5]|(2[0-4]|1{0,1}[0-9]){0,1}[0-9]))\]$/i;
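
A minimal sketch of how the five steps might be wired together. `domainRegex` and `ipv4Regex` are copied from the comment; `simpleIpv6Regex` is a shortened stand-in (an assumption, accepting only the full eight-group form) so the example stays readable, and `isPlausibleEmail` is a hypothetical function name. Lengths are counted in code points, which is one possible reading of "characters".

```javascript
// Copied from the comment above.
const domainRegex =
  /^((?!-)[\p{L}\p{N}\u0900-\u097F\u0C80-\u0CFF-]+(?<!-)\.)+[\p{L}\u0900-\u097F\u0C80-\u0CFF]{2,}$/iu;
const ipv4Regex =
  /^\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\]$/;
// Assumption: simplified placeholder, not the full IPv6 regex from the comment.
const simpleIpv6Regex = /^\[ipv6:(?:[0-9a-f]{1,4}:){7}[0-9a-f]{1,4}\]$/i;

function isPlausibleEmail(address) {
  // Step 1: split at the final @-symbol.
  const at = address.lastIndexOf("@");
  if (at === -1) return false;
  const local = address.slice(0, at);
  const domain = address.slice(at + 1);

  // Steps 2 and 3: length limits, counted in code points.
  const localLength = [...local].length;
  const domainLength = [...domain].length;
  if (localLength < 1 || localLength > 16000) return false;
  if (domainLength < 4 || domainLength > 63) return false;

  // Steps 4 and 5: the hyphen rule is enforced by the lookarounds in
  // domainRegex; otherwise the domain must be a bracketed IP literal.
  return (
    domainRegex.test(domain) ||
    ipv4Regex.test(domain) ||
    simpleIpv6Regex.test(domain)
  );
}

console.log(isPlausibleEmail("email@example.com"));         // true
console.log(isPlausibleEmail("email@[123.123.123.123]"));   // true
console.log(isPlausibleEmail("email@-example.com"));        // false
console.log(isPlausibleEmail("email@example@example.com")); // true
```

Because the local part is left uninterpreted, an address such as `email@example@example.com` passes here even though the list at the top marks it invalid. That follows from the approach rather than from a bug in the sketch.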

@WilliamAnderson-at
Copy link

ok

@thurst28
Copy link

thurst28 commented Apr 4, 2023

ok

@kinduff
Copy link

kinduff commented Apr 5, 2023

ok

@Szarlej4
Copy link

Szarlej4 commented May 4, 2023

ok

@carlosmoran
Copy link

ok

@RubenMateus
Copy link

ok

@jamesmacfie
Copy link

ok

@qugu2427
Copy link

ok

@jackmahoney
Copy link

Don't forget names and utf8:

=?utf-8?B?R0FSQU5UIEJDIERNQ0M=?= <2342431@mycompany.com>
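
A rough sketch of pulling the address out of a header value like this one and decoding the display name. It assumes Node.js (for `Buffer`) and handles only a single base64 ("B") encoded-word in UTF-8; `decodeEncodedWord` and `parseMailbox` are hypothetical helper names, and real headers can contain other charsets, "Q" encoding, or several encoded-words.

```javascript
// Decodes one RFC 2047 encoded-word of the form =?utf-8?B?...?=
// Returns null when the input is not in that form.
function decodeEncodedWord(word) {
  const m = /^=\?utf-8\?B\?([A-Za-z0-9+\/=]+)\?=$/i.exec(word);
  if (!m) return null;
  return Buffer.from(m[1], "base64").toString("utf8");
}

// Splits "display-name <address>" into its two parts.
// Input without angle brackets is treated as a bare address.
function parseMailbox(input) {
  const trimmed = input.trim();
  const m = /^(.*?)\s*<([^<>]+)>$/.exec(trimmed);
  if (!m) return { name: null, address: trimmed };
  const decoded = decodeEncodedWord(m[1]);
  return { name: decoded !== null ? decoded : m[1], address: m[2] };
}

const parsed = parseMailbox(
  "=?utf-8?B?R0FSQU5UIEJDIERNQ0M=?= <2342431@mycompany.com>"
);
console.log(parsed.name);    // GARANT BC DMCC
console.log(parsed.address); // 2342431@mycompany.com
```

Only the `address` part would then be passed to whatever email validation is in use.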

@WilliamCJacobsen
Copy link

ok

@FilipFeldberg
Copy link

ok

@Andorr
Copy link

Andorr commented Sep 14, 2023

ok

@SveinungOverland
Copy link

ok

@vdsingh
Copy link

vdsingh commented Oct 22, 2023

ok

@cizordj
Copy link

cizordj commented Oct 31, 2023

@frattaro If this is a 'humble' opinion, I don't want to know the 'supreme' opinion

@RobKenis
Copy link

RobKenis commented Nov 7, 2023

@cizordj It was never mentioned that the opinion was humble. I think it was meant to be supreme from the beginning.

@cizordj
Copy link

cizordj commented Nov 7, 2023

@RobKenis he literally said IMO 🐱

@micahbf
Copy link

micahbf commented Nov 7, 2023

@cizordj And what does IMO stand for

@KthProg
Copy link

KthProg commented Nov 7, 2023

@cizordj And what does IMO stand for

"In My Office"

@OmarJalil
Copy link

OmarJalil commented Nov 7, 2023

@cizordj And what does IMO stand for

"In My Office"

"Iguanas Making Omelets"

@kinduff
Copy link

kinduff commented Nov 7, 2023

@cizordj And what does IMO stand for

"In My Office"

"Iguanas Making Omelets"

"Invisible Martian Orchestras"

@frattaro
Copy link

frattaro commented Nov 7, 2023

ok

I added "IMO" - "In my opinion" - lastly after writing that, because I finished and thought "you know, it wouldn't be that much more work to measure the byte lengths of the strings... and I'd bet you $5 that regex is missing some unicode ranges. I should probably package this up. I'm not gonna do that. You know what? I'll just say it's an opinion because it's good enough and I'm done with it."

@micahbf
Copy link

micahbf commented Nov 7, 2023

ok

@kinduff
Copy link

kinduff commented Nov 7, 2023

@frattaro tldr

@frattaro
Copy link

frattaro commented Nov 7, 2023

@isthisstackoverflow
Copy link

@cizordj And what does IMO stand for

"In My Office"

"Iguanas Making Omelets"

"Invisible Martian Orchestras"

Interpolate my onions.

@jcf-dev
Copy link

jcf-dev commented Nov 7, 2023

ok

I added "IMO" - "In my opinion" - lastly after writing that, because I finished and thought "you know, it wouldn't be that much more work to measure the byte lengths of the strings... and I'd bet you $5 that regex is missing some unicode ranges. I should probably package this up. I'm not gonna do that. You know what? I'll just say it's an opinion because it's good enough and I'm done with it."

ok

@johnd3v
Copy link

johnd3v commented Nov 7, 2023

@cizordj And what does IMO stand for

"In My Office"

"Iguanas Making Omelets"

"Invisible Martian Orchestras"

Interpolate my onions.

Iced Mocha Overload

@jcf-dev
Copy link

jcf-dev commented Nov 7, 2023

@cizordj And what does IMO stand for

Intellectuals Meeting Ogres

@chiperific
Copy link

This answers all the regex questions about email: https://youtu.be/mrGfahzt-4Q?feature=shared&t=992

@KthProg
Copy link

KthProg commented Nov 20, 2023

This answers all the regex questions about email: https://youtu.be/mrGfahzt-4Q?feature=shared&t=992

We don't want answers about email, we want answers about what IMO stands for

@jcf-dev
Copy link

jcf-dev commented Nov 20, 2023

This answers all the regex questions about email: https://youtu.be/mrGfahzt-4Q?feature=shared&t=992

We don't want answers about email, we want answers about what IMO stands for

It's Means Ok

@cizordj
Copy link

cizordj commented Nov 23, 2023

This is why English can be confusing sometimes, especially for foreigners.

@igokom
Copy link

igokom commented Nov 29, 2023

really ok

@brycehazen
Copy link

ok
