~
(?:(smile\.|www\.))? # optionally starts with smile. or www. (this is capture group $1)
ama?zo?n\. # also allow shortened amzn.com URLs
(?:
com # match the supported Amazon domains: .com, .ca, .co.uk, .co.jp, .de, .fr
|
ca
|
co\.uk
|
co\.jp
|
de
|
fr
)
/
(?: # here comes the stuff before the ASIN
exec/obidos/ASIN/ # the possible components of an Amazon URL
|
o/
|
gp/product/
|
(?: # the dp/ format may contain a title
(?:[^"\'/]*)/ # anything but a slash or quote
)? # optional
dp/
| # if amzn.com format, nothing before the ASIN
)
([A-Z0-9]{10}) # capture group $2 will contain the ASIN
(?: # everything after the ASIN
(?:/|\?|\#) # starting with a slash, question mark, or hash
(?:[^"\'\s]*) # everything up to a quote or white space
)? # optional
~isx
A PHP regular expression to match Amazon links and extract the ASIN identifier
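
A minimal usage sketch, assuming the pattern is stored in a single-quoted PHP string and passed to `preg_match`. The helper name `extractAsin` is hypothetical and not part of the gist. Because the optional `smile.`/`www.` prefix occupies capture group 1, the ASIN is read from index 2 of the match array.

```php
<?php
// Hypothetical helper (name chosen for illustration only).
// Returns the ASIN found in $url, or null if the pattern does not match.
function extractAsin(string $url): ?string
{
    // Same pattern as above, condensed. In a single-quoted PHP string,
    // \' becomes a literal quote and the other backslashes reach PCRE unchanged.
    $pattern = '~
        (?:(smile\.|www\.))?             # group 1: optional subdomain
        ama?zo?n\.                       # amazon. or the shortened amzn.
        (?:com|ca|co\.uk|co\.jp|de|fr)   # supported domains
        /
        (?:
            exec/obidos/ASIN/
          | o/
          | gp/product/
          | (?:(?:[^"\'/]*)/)?dp/        # optional title segment before dp/
          |                              # amzn.com format: nothing before the ASIN
        )
        ([A-Z0-9]{10})                   # group 2: the ASIN
        (?:(?:/|\?|\#)(?:[^"\'\s]*))?    # optional trailing path, query, or fragment
    ~isx';

    return preg_match($pattern, $url, $m) === 1 ? $m[2] : null;
}

var_dump(extractAsin('https://www.amazon.com/dp/B08N5WRWNW'));                   // string "B08N5WRWNW"
var_dump(extractAsin('https://www.amazon.co.uk/Some-Title/dp/0131103628?ref=x')); // string "0131103628"
var_dump(extractAsin('http://amzn.com/B00005N5PF'));                             // string "B00005N5PF"
var_dump(extractAsin('https://example.com/dp/B08N5WRWNW'));                      // NULL
```

Note that the `i` modifier also lets lowercase letters match in the ASIN position, so callers who need a canonical form may want to apply `strtoupper()` to the result.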