@marcgg
Created December 8, 2010 17:26
Regex to get the Facebook Page ID from a given URL
# Matches patterns such as:
# https://www.facebook.com/my_page_id => my_page_id
# http://www.facebook.com/my_page_id => my_page_id
# http://www.facebook.com/#!/my_page_id => my_page_id
# http://www.facebook.com/pages/Paris-France/Vanity-Url/123456?v=app_555 => 123456
# http://www.facebook.com/pages/Vanity-Url/45678 => 45678
# http://www.facebook.com/#!/page_with_1_number => page_with_1_number
# http://www.facebook.com/bounce_page#!/pages/Vanity-Url/45678 => 45678
# http://www.facebook.com/bounce_page#!/my_page_id?v=app_166292090072334 => my_page_id
# http://www.facebook.com/my.page.is.great => my.page.is.great
/(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*([\w\-\.]*)/
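
The examples in the comment block above can be checked mechanically. The following is a minimal sketch that transcribes the pattern into Python's standard `re` module; `PAGE_ID_RE`, `extract_page_id`, and `EXAMPLES` are hypothetical names introduced here and are not part of the gist.

```python
import re

# The gist's pattern, transcribed unchanged into a Python raw string.
PAGE_ID_RE = re.compile(
    r"(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?"
    r"(?:pages\/)?(?:[\w\-]*\/)*([\w\-\.]*)"
)


def extract_page_id(url):
    """Return the captured page ID, or None if the URL does not match."""
    match = PAGE_ID_RE.search(url)
    return match.group(1) if match else None


# URL -> expected ID pairs, copied from the comment block above.
EXAMPLES = {
    "https://www.facebook.com/my_page_id": "my_page_id",
    "http://www.facebook.com/#!/my_page_id": "my_page_id",
    "http://www.facebook.com/pages/Paris-France/Vanity-Url/123456?v=app_555": "123456",
    "http://www.facebook.com/pages/Vanity-Url/45678": "45678",
    "http://www.facebook.com/#!/page_with_1_number": "page_with_1_number",
    "http://www.facebook.com/bounce_page#!/pages/Vanity-Url/45678": "45678",
    "http://www.facebook.com/bounce_page#!/my_page_id?v=app_166292090072334": "my_page_id",
    "http://www.facebook.com/my.page.is.great": "my.page.is.great",
}

for url, expected in EXAMPLES.items():
    assert extract_page_id(url) == expected, url
```

Note that a trailing slash (for example `https://www.facebook.com/my_page_id/`) makes this pattern capture an empty string, because the path segment is consumed by the `(?:[\w\-]*\/)*` part before the capture group is reached.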
@neilsh

neilsh commented Aug 28, 2012

I had to add some escaped periods to deal with URLs like the ones below:

/(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w\.)*#!\/)?(?:pages\/)?(?:[\w\-\.]*\/)*([\w\-\.]*)/

https://www.facebook.com/Babies.Fan.Page
https://www.facebook.com/pages/Babies.Fan.Page/121166161229757

@marcgg
Author

marcgg commented Nov 6, 2012

@neilsh Yours doesn't catch patterns such as http://www.facebook.com/bounce_page#!/pages/Vanity-Url/45678

@marcgg
Author

marcgg commented Nov 6, 2012

@neilsh I updated the regex to handle https://www.facebook.com/Babies.Fan.Page but not https://www.facebook.com/pages/Babies.Fan.Page/121166161229757 yet

@marcgg
Author

marcgg commented Nov 6, 2012

@damusnet Fixed for handling https

@dougc84

dougc84 commented Nov 12, 2012

updated to use this:

/^(https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*([\w\-\.]*)/

it would allow ftp:// without the ^ and the ?:

@dougc84

dougc84 commented Nov 12, 2012

errr... without the ^ and with the ?:

@manolof

manolof commented Dec 12, 2012

This regex works great for English characters, but it doesn't work for non-English ones.

For example, it doesn't work for FB pages such as

http://www.facebook.com/pages/ΤΑ-ΦΡΟΥΤΑ-ΤΟΥ-ΔΑΣΟΥΣ/145298928829093

which when copied and pasted in a textbox from the browser url box returns as

http://www.facebook.com/pages/%CE%A4%CE%91-%CE%A6%CE%A1%CE%9F%CE%A5%CE%A4%CE%91-%CE%A4%CE%9F%CE%A5-%CE%94%CE%91%CE%A3%CE%9F%CE%A5%CE%A3/145298928829093
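
One possible workaround, sketched here under the assumption of Python 3 (where `\w` matches Unicode letters in `str` patterns; several other engines treat `\w` as ASCII-only by default), is to percent-decode the URL before matching. `urllib.parse.unquote` is from the standard library; `PAGE_ID_RE` and `extract_page_id` are hypothetical names used only for this illustration.

```python
import re
from urllib.parse import unquote

# The gist's original pattern, transcribed into a Python raw string.
PAGE_ID_RE = re.compile(
    r"(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?"
    r"(?:pages\/)?(?:[\w\-]*\/)*([\w\-\.]*)"
)

ENCODED_URL = (
    "http://www.facebook.com/pages/"
    "%CE%A4%CE%91-%CE%A6%CE%A1%CE%9F%CE%A5%CE%A4%CE%91-"
    "%CE%A4%CE%9F%CE%A5-%CE%94%CE%91%CE%A3%CE%9F%CE%A5%CE%A3/145298928829093"
)


def extract_page_id(url, decode=True):
    """Optionally percent-decode the URL (UTF-8), then apply the pattern."""
    candidate = unquote(url) if decode else url
    match = PAGE_ID_RE.search(candidate)
    return match.group(1) if match else None


# Without decoding, the '%' characters stop the match and the capture is empty.
raw_result = extract_page_id(ENCODED_URL, decode=False)
# With decoding, the Greek letters are matched by \w and the numeric ID is captured.
decoded_result = extract_page_id(ENCODED_URL)
```

This only helps in engines where `\w` covers non-ASCII letters; elsewhere the character classes themselves would need to change.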

@marcgg
Author

marcgg commented Feb 3, 2013

@manolof Do you know how to update it to match those pages?

@michikono

Doesn't work for paths with a trailing slash. Try adding a check for that (added (\/)?).

/(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*?(\/)?([\w\-\.]*)/

To address the UTF-8 comment: in many regex engines, \w and \b only handle ASCII. To work with UTF-8, you need to define your own character class or word boundaries.

The best solution here is probably to use a negated character class ("anything that is not a slash or question mark") to find the usernames. This works in this situation since we know the only place special characters would appear is in the username.

@michikono

Here's my attempt at handling other languages.

/(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*?(\/)?([^/?]*)/

Tested it against:

http://www.facebook.com/Φc?ref=hl => [2] = Φc
http://www.facebook.com/Φc/?ref=hl => [2] = Φc
http://www.facebook.com/pages/Φc/1234?ref=hl => [2] = Φc
http://www.facebook.com/pages/Φc/1234/?ref=hl => [2] = Φc

This also includes my forward slash escape code in my previous comment.
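
As a rough check of the four test URLs above, here is a minimal Python sketch using the standard `re` module. `NEGATED_CLASS_RE` and `extract_name` are hypothetical names introduced for illustration; the pattern itself is the one from this comment.

```python
import re

# Pattern from the comment above: the final group is a negated character
# class, so it accepts any character except '/' and '?'.
NEGATED_CLASS_RE = re.compile(
    r"(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?"
    r"(?:pages\/)?(?:[\w\-]*\/)*?(\/)?([^/?]*)"
)


def extract_name(url):
    """Return capture group 2 (the page name), or None if nothing matches."""
    match = NEGATED_CLASS_RE.search(url)
    return match.group(2) if match else None


TEST_URLS = [
    "http://www.facebook.com/Φc?ref=hl",
    "http://www.facebook.com/Φc/?ref=hl",
    "http://www.facebook.com/pages/Φc/1234?ref=hl",
    "http://www.facebook.com/pages/Φc/1234/?ref=hl",
]

results = [extract_name(url) for url in TEST_URLS]
```

For the `/pages/` URLs this captures the page name rather than the trailing numeric ID, which matches the results listed above.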

@campbell-codes

Something you can do is use two separate regexes: try to find a numeric ID first, then, on failure, find the vanity ID.
I've used this pretty basic regex to look for numbers of length 10 or greater:
/(\d{10,})/

Take the result if one is found; if not, use the original regex from here. Obviously this might fail if the name of the page is number1234567890, but that is a pretty special case.

I have found this to work pretty well for me, but criticism is welcome.

Example URL:
https://www.facebook.com/pages/GHOST-Caf%C3%A9/627191887397533?fref=ts
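
The two-step idea described in this comment could be sketched in Python as follows. This is only an illustration: `NUMERIC_ID_RE`, `VANITY_RE`, and `extract_page_id` are hypothetical names, and the fallback uses the gist's original pattern.

```python
import re

# Step 1: any run of 10 or more digits is treated as a numeric page ID.
NUMERIC_ID_RE = re.compile(r"(\d{10,})")

# Step 2 (fallback): the gist's original vanity-URL pattern.
VANITY_RE = re.compile(
    r"(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?"
    r"(?:pages\/)?(?:[\w\-]*\/)*([\w\-\.]*)"
)


def extract_page_id(url):
    """Prefer a long numeric ID; otherwise fall back to the vanity name."""
    numeric = NUMERIC_ID_RE.search(url)
    if numeric:
        return numeric.group(1)
    vanity = VANITY_RE.search(url)
    return vanity.group(1) if vanity else None


numeric_result = extract_page_id(
    "https://www.facebook.com/pages/GHOST-Caf%C3%A9/627191887397533?fref=ts"
)
vanity_result = extract_page_id("https://www.facebook.com/my_page_id")
```

Besides the page-name case mentioned above, the numeric check also fires on long digit runs in the query string, such as the `v=app_166292090072334` parameter in one of the gist's example URLs, so the order of the two steps is a trade-off.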

@philippeluickx

Anyone who wants this for Python:

https?://(www\.)?facebook\.com/(\w*#!/)?(pages/)?(([\w-]*/)*)?(?P<page_id>[\w.-]+)
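
A minimal usage sketch for a named group such as `page_id`, with the pattern written out in full using `*` quantifiers and escaped dots. `PAGE_RE` and `page_id_from` are hypothetical names chosen for this example.

```python
import re

# Python-flavoured pattern with a named capture group for the page ID.
PAGE_RE = re.compile(
    r"https?://(www\.)?facebook\.com/(\w*#!/)?(pages/)?"
    r"(([\w-]*/)*)?(?P<page_id>[\w.-]+)"
)


def page_id_from(url):
    """Return the 'page_id' named group, or None if the URL does not match."""
    match = PAGE_RE.search(url)
    return match.group("page_id") if match else None


pages_result = page_id_from("http://www.facebook.com/pages/Vanity-Url/45678")
dotted_result = page_id_from("https://www.facebook.com/Babies.Fan.Page")
trailing_result = page_id_from("https://www.facebook.com/my_page_id/")
```

Because the final group uses `+` rather than `*`, a trailing slash forces the engine to backtrack and still capture the last path segment.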

@nkanaev

nkanaev commented May 22, 2017

Matching fails if the URL contains a trailing slash, like https://www.facebook.com/my_page_id/
The regex below is a simpler working solution:

(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:.+\/)*([\w\.\-]+)
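
A quick Python sketch of how this simpler pattern behaves, assuming the standard `re` module; `SIMPLE_RE` and `last_segment` are hypothetical names introduced here.

```python
import re

# Simpler pattern from the comment above: skip everything up to the last
# slash that is followed by a non-empty run of [\w.-], then capture that run.
SIMPLE_RE = re.compile(
    r"(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:.+\/)*([\w\.\-]+)"
)


def last_segment(url):
    """Return the captured segment, or None if the URL does not match."""
    match = SIMPLE_RE.search(url)
    return match.group(1) if match else None


trailing_slash = last_segment("https://www.facebook.com/my_page_id/")
pages_url = last_segment(
    "http://www.facebook.com/pages/Paris-France/Vanity-Url/123456?v=app_555"
)
# Hypothetical URL: a slash inside the query string shifts the capture.
query_slash = last_segment("https://www.facebook.com/my_page_id?next=/home")
```

The greedy `.+` is what handles the trailing slash, but it also means a slash appearing later in the URL, as in the hypothetical `?next=/home` case, is treated as a path separator.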

@Raphhh

Raphhh commented Sep 18, 2017

@lekiend

lekiend commented Dec 6, 2017

@BastienMottier
Just a little mistake: an unescaped slash at the end.
The one below works better:
^(?:https?:\/\/)?(?:www\.|m\.|touch\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*?(\/)?([^/?\s]*)(?:\/|&|\?)?.*$

@msdinit

msdinit commented Feb 28, 2018

@Raphhh
this one below also works for profile.php
^(?:https?:\/\/)?(?:www\.|m\.|touch\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^/?\s]*)(?:\/|&|\?)?.*$

@arielperez82

@lekiend There was one more missing slash.

Also, this below disallows host URLs ending in a slash with no profile e.g. https://www.facebook.com, http://fb.me, https://m.facebook.com/

^(?:https?:\/\/)?(?:www\.|m\.|touch\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^\/?\s]*)(?:\/|&|\?)?.*$

@musasoftlabx

Where did you guys learn how to write all these?

@ttodua

ttodua commented Sep 16, 2018

Doesn't work for pages with Unicode names, like this:

https://www.facebook.com/საწარმო-SabaDesign-927047470710565/?ref=safrghbeდფწერგ

@hoofdletterj

Props to all contributors!!
Everything incorporated above, with just one more forgotten escape character added, gives me this:

/^(?:https?:\/\/)?(?:www\.|m\.|touch\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?(?:pages\/)?(?:photo\.php\?fbid=)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^\/?\s]*)(?:\/|&|\?)?.*$/

Which works GREAT (yay), except when the url has arguments after the profile.php?id= or fbid= part like these urls:

https://www.facebook.com/profile.php?id=114376375296751&fref=pb&hc_location=friends_tab
returns 114376375296751&fref=pb&hc_location=friends_tab instead of 114376375296751

and

https://www.facebook.com/photo.php?fbid=114376375296751&set=a.114376371963418.13845.114375165296872&type=1&theater
returns 114376375296751&set=a.114376371963418.13845.114375165296872&type=1&theater

Someone care to snip everything off after the first &?

@ayal

ayal commented Jul 31, 2019

/^(?:https?:\/\/)?(?:www\.|m\.|touch\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?(?:pages\/)?(?:photo\.php\?fbid=)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^\/?\&\s]*)(?:\/|&|\?)?.*?$/

this should exclude the & as well

@fabriciopirini

Ayal's alternative didn't work for me. It worked when I took hoofdletterj's answer and added & before \s (Ayal's partial answer):

/^(?:https?:\/\/)?(?:www\.|m\.|touch\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?(?:pages\/)?(?:photo\.php\?fbid=)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^\/?&\s]*)(?:\/|&|\?)?.*$/
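
To illustrate the effect of excluding `&` from the capture group, here is a small Python sketch run against the two problem URLs quoted earlier in the thread. `WITH_AMP_RE` and `extract_id` are hypothetical names; only the standard `re` module is assumed.

```python
import re

# The pattern above, transcribed without the surrounding /.../ delimiters.
WITH_AMP_RE = re.compile(
    r"^(?:https?:\/\/)?(?:www\.|m\.|touch\.)?"
    r"(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)"
    r"(?:(?:\w)*#!\/)?(?:pages\/)?(?:photo\.php\?fbid=)?"
    r"(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?"
    r"([^\/?&\s]*)(?:\/|&|\?)?.*$"
)


def extract_id(url):
    """Return the first capture group, or None if the URL does not match."""
    match = WITH_AMP_RE.match(url)
    return match.group(1) if match else None


profile_id = extract_id(
    "https://www.facebook.com/profile.php?id=114376375296751"
    "&fref=pb&hc_location=friends_tab"
)
photo_id = extract_id(
    "https://www.facebook.com/photo.php?fbid=114376375296751"
    "&set=a.114376371963418.13845.114375165296872&type=1&theater"
)
```

With `&` inside the negated class, the capture stops at the first query-string separator instead of running to the end of the URL.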

@WaqasAli853

I want to display images from Facebook posts on my website by just copying the image address. Is there a regex for that? I have tried the regex above, but it doesn't help me.
preg_match_all('/(https?:\/\/\S+\.(?:jpg|png|gif))+/', $string, $match);
I am using this regex, but it matches all other images except Facebook's images.

@alberto98fx

I updated the regex to match even mbasic:

^(?:https?:\/\/)?(?:www\.|m\.|touch\.|mbasic\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?(?:pages\/)?(?:photo\.php\?fbid=)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^\/?&\s]*)(?:\/|&|\?)?.*$

It works for urls like:

https://mbasic.facebook.com/BMW/?refid=46&__xts__%5B0%5D=12.%7B%22unit_id_click_type%22%3A%22graph_search_results_item_tapped%22%2C%22click_type%22%3A%22result%22%2C%22module_id%22%3A2%2C%22result_id%22%3A22893372268%2C%22session_id%22%3A%22e4709b011e94ec8207a44ffedd1d2901%22%2C%22module_role%22%3A%22ENTITY_PAGES%22%2C%22unit_id%22%3A%22browse_rl%3Ab2718be4-bbd0-4764-9c31-6908c431daa2%22%2C%22browse_result_type%22%3A%22browse_type_page%22%2C%22unit_id_result_id%22%3A22893372268%2C%22module_result_position%22%3A0%7D

@alberto98fx

There's also mobile.facebook.com, so here's the new regex:

^(?:https?:\/\/)?(?:www\.|m\.|mobile\.|touch\.|mbasic\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?(?:pages\/)?(?:photo\.php\?fbid=)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^\/?&\s]*)(?:\/|&|\?)?.*$

Which can match stuff like:

https://mobile.facebook.com/BMW/

@xtvipxtt

@beshoo

beshoo commented Apr 11, 2021

(?:https?:\/\/)?(?:www\.|m\.|mobile\.|touch\.|mbasic\.)?(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)(?:(?:\w)*#!\/)?(?:pages\/|pg\/)?(?:photo\.php\?fbid=)?(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?([^\/?&\s]*)(?:\/|&|\?)?.*

This will match /pg/ URLs as well:
https://m.facebook.com/pg/DwayneTheRockJohnsonFanClub/photos/
...................................................^
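
As a rough regression check of this latest pattern across the subdomains and path prefixes added over the thread, here is a minimal Python sketch. `LATEST_RE` and `extract_page` are hypothetical names, and the shortened `mbasic` URL is an assumption made for brevity.

```python
import re

# The pattern from this comment, transcribed into a Python raw string.
LATEST_RE = re.compile(
    r"(?:https?:\/\/)?(?:www\.|m\.|mobile\.|touch\.|mbasic\.)?"
    r"(?:facebook\.com|fb(?:\.me|\.com))\/(?!$)"
    r"(?:(?:\w)*#!\/)?(?:pages\/|pg\/)?(?:photo\.php\?fbid=)?"
    r"(?:[\w\-]*\/)*?(?:\/)?(?:profile\.php\?id=)?"
    r"([^\/?&\s]*)(?:\/|&|\?)?.*"
)


def extract_page(url):
    """Return the first capture group, or None if the URL does not match."""
    match = LATEST_RE.search(url)
    return match.group(1) if match else None


pg_result = extract_page(
    "https://m.facebook.com/pg/DwayneTheRockJohnsonFanClub/photos/"
)
mobile_result = extract_page("https://mobile.facebook.com/BMW/")
mbasic_result = extract_page("https://mbasic.facebook.com/BMW/?refid=46")
# Host-only URLs are rejected by the (?!$) lookahead after the slash.
host_only_result = extract_page("https://m.facebook.com/")
```

Keeping a small list of such URLs next to the pattern makes it easier to see whether a new tweak breaks one of the earlier cases.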
