@adulau
Last active April 6, 2021 07:00
Facebook 533m leak - analysis

Warning: This analysis is based on the leaked data and is subject to interpretation.

Format

The original leak consists of various archive files, one zip per "country", with typographic and geographic errors. Some of the files are rar or 7z rather than zip.
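Since the archives come in mixed formats, one way to sort them before extraction is to look at the leading magic bytes instead of trusting the file extension. The following is a minimal sketch, not part of the original analysis; the function name `detect_archive_type` is hypothetical, and the signatures are the standard ones for zip (local file header), rar and 7z.

```python
# Leading magic bytes for the three archive formats mentioned above.
# Note: an empty zip archive starts with PK\x05\x06 and is not covered here.
SIGNATURES = {
    b"PK\x03\x04": "zip",
    b"Rar!\x1a\x07": "rar",
    b"7z\xbc\xaf\x27\x1c": "7z",
}


def detect_archive_type(header: bytes) -> str:
    """Return 'zip', 'rar', '7z' or 'unknown' based on the first bytes of a file."""
    for magic, kind in SIGNATURES.items():
        if header.startswith(magic):
            return kind
    return "unknown"
```

In practice the first 8 bytes of each file (for example `open(path, "rb").read(8)`) are enough to pass as `header`.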

CSV headers

There are multiple inconsistencies in field position and size across the various country files (merged from different sources?).

The most common structure is the following.

Name            Position  Notes
Phone number    1         including international code
Facebook ID     2
First Name      3
Last Name       4
Sex             5         male, female
City?           6
Province        7
Marital Status  8
Workplace       9
Creation date   10
Email           11
DoB             12
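The most common layout above can be sketched as a small parser that maps one line to named fields and rejects lines that do not have exactly 12 columns, which is one simple way to surface the inconsistent files. This is an illustration only: the field names in `FIELDS`, the function `parse_line` and the default comma delimiter are assumptions, since the analysis does not state the separator actually used, and the sample record in the usage note is made up.

```python
import csv
import io

# Hypothetical field names, in the positions given by the table above.
FIELDS = [
    "phone", "facebook_id", "first_name", "last_name", "sex", "city",
    "province", "marital_status", "workplace", "creation_date", "email", "dob",
]


def parse_line(line, delimiter=","):
    """Parse one line into a dict, or return None if the column count differs.

    The delimiter is an assumption and should be adjusted per file.
    """
    row = next(csv.reader(io.StringIO(line), delimiter=delimiter), None)
    if row is None or len(row) != len(FIELDS):
        return None  # empty line or a file using a different layout
    return dict(zip(FIELDS, (value.strip() for value in row)))
```

For example, `parse_line("+15550100,1000001,Jane,Doe,female,Springfield,Illinois,Single,Acme,1/1/2018,jane@example.com,01/01/1990")` returns a dict keyed by the names above, while a line with fewer or more columns returns `None` and can be set aside for manual inspection.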

Origin

The origin seems to be a brute-force enumeration of all phone numbers per operator via the Facebook account recovery process (vulnerability disclosed in July 2017). A large number of the mobile phone numbers are contiguous within the pools of some known mobile operators. Nevertheless, some entries (less than 10%) contain unrelated phone numbers and carry less data. It is possible that the leak is composed of mixed sources obtained by different means.
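The contiguity observation can be checked with a simple run detection over the phone numbers of one file. The sketch below is not the method used for this analysis; the function names and the `min_len` parameter are hypothetical, and phone numbers are assumed to be already converted to integers.

```python
def contiguous_runs(numbers, min_len=2):
    """Group integer phone numbers into runs of consecutive values."""
    runs = []
    current = []
    for n in sorted(set(numbers)):
        if current and n == current[-1] + 1:
            current.append(n)
        else:
            # The run is broken: keep it only if it is long enough.
            if len(current) >= min_len:
                runs.append(current)
            current = [n]
    if len(current) >= min_len:
        runs.append(current)
    return runs


def contiguous_fraction(numbers, min_len=2):
    """Share of unique numbers that belong to a run of at least min_len."""
    unique = set(numbers)
    if not unique:
        return 0.0
    in_runs = sum(len(run) for run in contiguous_runs(unique, min_len))
    return in_runs / len(unique)
```

A high fraction for a given operator prefix would be consistent with sequential enumeration, while numbers outside any run are candidates for the unrelated entries mentioned above. Strict consecutiveness is a deliberately simple criterion; a real check might tolerate small gaps.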
