@GreenFootballs
Last active August 31, 2022 11:12
A PHP regular expression to match Amazon links and extract the ASIN identifier
~
    (?:(smile\.|www\.))?    # optionally starts with smile. or www. (this is capture group $1)
    ama?zo?n\.              # also allow shortened amzn.com URLs
    (?:
        com                 # match only the Amazon domains listed here
        |
        ca
        |
        co\.uk
        |
        co\.jp
        |
        de
        |
        fr
    )
    /
    (?:                     # here comes the stuff before the ASIN
        exec/obidos/ASIN/   # the possible components of an Amazon URL
        |
        o/
        |
        gp/product/
        |
        (?:                 # the dp/ format may contain a title
            (?:[^"\'/]*)/   # anything but a slash or quote
        )?                  # optional
        dp/
        |                   # if amzn.com format, nothing before the ASIN
    )
    ([A-Z0-9]{10})          # capture group $2 will contain the ASIN
    (?:                     # everything after the ASIN
        (?:/|\?|\#)         # starting with a slash, question mark, or hash
        (?:[^"\'\s]*)       # everything up to a quote or white space
    )?                      # optional
~isx
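
A minimal usage sketch of the pattern above, assuming it is stored in a variable named `$regex` (the gist does not name it). The pattern is condensed onto fewer lines here but is otherwise unchanged; `extractAsin()` is a hypothetical helper and the URLs are made-up examples.

```php
<?php
// The gist's pattern, condensed. It sits in a single-quoted PHP string,
// so \' inside the character classes is just a literal quote.
$regex = '~
    (?:(smile\.|www\.))?        # optional subdomain (capture group 1)
    ama?zo?n\.                  # amazon. or the shortened amzn.
    (?:com|ca|co\.uk|co\.jp|de|fr)
    /
    (?:exec/obidos/ASIN/|o/|gp/product/|(?:(?:[^"\'/]*)/)?dp/|)
    ([A-Z0-9]{10})              # the ASIN (capture group 2)
    (?:(?:/|\?|\#)(?:[^"\'\s]*))?
~isx';

// Hypothetical helper (not part of the gist): returns the ASIN or null.
function extractAsin(string $regex, string $url): ?string
{
    // preg_match returns 1 on a match, 0 on no match, false on error.
    if (preg_match($regex, $url, $match) === 1) {
        return $match[2];
    }
    return null;
}

$asin = extractAsin($regex, 'https://www.amazon.com/Some-Title/dp/B00EXAMPLE/ref=sr_1_1');
```

Because the pattern is not anchored, it finds the Amazon host anywhere in the string, which is why a leading `https://` does not prevent a match.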

KenorFR commented Jun 7, 2018

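@mostmaz:
$asin = preg_replace($regex, '$2', $your_amazon_url);

Note that preg_replace only replaces the part of the string the pattern matches, so anything before the match (such as a leading https://) stays in the result.

Or, with preg_match: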

preg_match($regex, $your_amazon_url, $match);

if (isset($match[2]) && strlen($match[2]) == 10) {
    $asin = $match[2];
}

You can use $1 / $match[1] instead of $2 / $match[2] if you remove the inner capturing parentheses from the first line of the pattern:

(?:(smile\.|www\.))?  becomes  (?:smile\.|www\.)?
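
A sketch of that simplified variant, assuming the rest of the pattern stays as in the gist (condensed here). With the inner parentheses gone, the ASIN becomes capture group 1. The URL is a made-up example, and the `preg_replace` call at the end shows what that function returns for a URL that includes a scheme.

```php
<?php
// Simplified prefix: no inner capturing group, so the ASIN is group 1.
$regex = '~
    (?:smile\.|www\.)?
    ama?zo?n\.
    (?:com|ca|co\.uk|co\.jp|de|fr)
    /
    (?:exec/obidos/ASIN/|o/|gp/product/|(?:[^"\'/]*/)?dp/|)
    ([A-Z0-9]{10})              # the ASIN is now capture group 1
    (?:[/?\#][^"\'\s]*)?
~isx';

$url = 'https://www.amazon.co.uk/gp/product/B00EXAMPLE?psc=1'; // made-up URL

$asin = null;
if (preg_match($regex, $url, $match) === 1) {
    $asin = $match[1];
}

// preg_replace rewrites only the matched part of the string, so the
// unmatched "https://" prefix is still present in $replaced.
$replaced = preg_replace($regex, '$1', $url);
```

For URLs that may carry a scheme or surrounding text, `preg_match` is therefore the safer of the two approaches.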
