@bhurt
Created May 15, 2012 21:06
Comment explaining a regex
;; Some documentation of this regular expression from hell, so I have
;; some hope of debugging it later.
;;
;; Start with the core "inner" regex:
;;
;; [\w\-]([\.\w])+[\w]+@([\w\-]+\.)+[A-Za-z]{2,4}
;;
;; This matches a "bare" email address, like bhurt@spnz.org. Then we
;; decorate it: we want to match the email address even if it has a
;; name attached, like: "Brian Hurt" <bhurt@spnz.org>
;;
;; \"([^\"]|(\\.))*\"\s*<$bare$>
;;
;; where $bare$ is replaced with the bare-email matching regex above.
;; Except we want to match either a bare or decorated email address,
;; so it's really:
;;
;; ($bare$)|(\"([^\"]|(\\.))*\"\s*<$bare$>)
;;
;; Note that $bare$ now appears twice. Next we want to match
;; a comma-separated sequence of decorated (including bare) email
;; addresses, so we do:
;;
;; ($deco$\s*,\s*)*$deco$
;;
;; where $deco$ is the decorated or bare email address matcher above
;; (wrapped in a group of its own, so that its | does not split the
;; surrounding pattern). Now we want to match a line that starts with
;; "from:" followed by a comma-separated list of email addresses, and
;; nothing else:
;;
;; ^\s*from\s*:\s*($addrs$)\s*$
;;
;; where $addrs$ is our comma-separated email address list matcher
;; above. Lastly, we want to set the following flags:
;;   i = case-insensitive matching
;;   d = Unix lines (only \n is treated as a line terminator)
;;   m = multiline matching (^ and $ match at every line boundary)
;;
;; so we prepend (?idm) to the regular expression.
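;;
;; A minimal sketch of assembling the pattern from the pieces described
;; above, assuming this is Clojure on the JVM (java.util.regex). The
;; names bare, quoted-name, deco, addrs and from-line are illustrative
;; only, and the non-capturing (?: ) wrappers are an assumption added
;; here so that the | in $deco$ stays contained; the original may have
;; used plain capturing groups instead.

;; The "bare" address matcher, kept as a string so it can be spliced in.
(def bare
  (str #"[\w\-]([\.\w])+[\w]+@([\w\-]+\.)+[A-Za-z]{2,4}"))

;; A double-quoted display name (with backslash escapes) and trailing space.
(def quoted-name
  (str #"\"([^\"]|(\\.))*\"\s*"))

;; Either a bare address or "Name" <bare>, wrapped in one group.
(def deco
  (str "(?:(?:" bare ")|(?:" quoted-name "<" bare ">))"))

;; Zero or more "address," prefixes followed by a final address.
(def addrs
  (str "(?:" deco #"\s*,\s*" ")*" deco))

;; The full "from:" line matcher with the i, d and m flags prepended.
(def from-line
  (re-pattern (str "(?idm)^\\s*from\\s*:\\s*(" addrs ")\\s*$")))

;; Example use (re-find returns a vector of groups on a match, nil otherwise):
;;
;; (boolean (re-find from-line
;;                   "From: \"Brian Hurt\" <bhurt@spnz.org>, foo.bar@example.com"))
;; (boolean (re-find from-line "From: not-an-address"))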