Parler Data & Tools
Data & Tools:
Many contributors. Thanks to all.
ParlerAnalysis@protonmail.com - Do not expect timely replies.
Channel: #parlerparsers at https://webirc.hackint.org/
#parlerparsers-video for video IDing
FBI Tips: https://tips.fbi.gov/digitalmedia/aad18481a3e8f02
Want to help but don't know how?
Download copies of the data and scripts, rehost them elsewhere, and seed torrents.
Help make this file easier for others to understand.
Develop ways to make the data easy to visualize.
Come ask in IRC about current efforts.
================================
(1) Metadata json files with EXIF data on all MP4 videos scraped from Parler:
donk.sh/metadata.tar.gz
magnet:?xt=urn:btih:1723e27bc79186c4574ff056ddb458d771c26e2f&dn=metadata.tar.gz&tr=wss%3A%2F%2Ftracker.btorrent.xyz&tr=wss%3A%2F%2Ftracker.openwebtorrent.com&tr=udp%3A%2F%2Ftracker.leechers-paradise.org%3A6969&tr=udp%3A%2F%2Ftracker.coppersurfer.tk%3A6969&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337&tr=udp%3A%2F%2Fexplodie.org%3A6969&tr=udp%3A%2F%2
SHA256: 66809d9ae0a5a6577a3c80bb623562274ceccd96b35519f15f568d09cefc56f8 metadata.tar.gz
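A minimal sketch for checking a downloaded copy against the SHA256 listed above. The local file name and the helper names are assumptions for illustration; only the checksum itself comes from this file.

```python
import hashlib
from pathlib import Path

# Checksum published in item (1) for metadata.tar.gz.
EXPECTED_SHA256 = "66809d9ae0a5a6577a3c80bb623562274ceccd96b35519f15f568d09cefc56f8"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_SHA256):
    """True if the file's digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("metadata.tar.gz")  # assumed download location
    if archive.exists():
        print("OK" if matches_published_checksum(archive) else "MISMATCH")
    else:
        print(f"{archive} not found")
```

If the result is MISMATCH, the copy is incomplete or altered and should be downloaded again before it is rehosted or seeded.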
(2) Script to download WARCs from archive.org once they are processed:
https://github.com/ozywog/parler-data-tools
(3) Magnet URI for a torrent of a file that contains 1.8 million texts scraped from
Parler and is a subset of the full data. Originally hosted on https://parler-archive.deadops.de/
This is the parler_2020-01-06_posts-partial
magnet:?xt=urn:btih:FF29970B902657A32D561C0720E70FACFB8C4284&dn=parler_2020-01-06_posts-partial&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337%2fannounce&tr=udp%3a%2f%2ftracker.internetwarriors.net%3a1337%2fannounce
(4) Script to generate a list of unique names and usernames, then collect all the
posts and associate them with the person who posted them.
Requires raw HTML source:
https://github.com/billstrobl/Prooter
https://github.com/billstrobl/Prooter/blob/master/prooter.py
(5) Script to scrape videos:
https://github.com/darthnithin/parlervideoscraper
You will need the metadata.tar.gz from (1) to use this.
(6) JSON / CSV / KML Scrapes:
https://gofile.io/d/p8RxUC - CSV, with all non-zero lat/long from donk's JSON
https://gofile.io/d/WVmqhR - quick 'n dirty KML made from the CSV
View KML Data on map - See (9)
https://gofile.io/d/DsUUte - KML of posts made 1/6/2021, DC Area Only
https://gofile.io/d/EJczW8 - CSV, Cleaned ver of 1/6 in DC
https://gofile.io/d/PUxeV4 - CSV, Cleaned ver of all available geotagged data
https://gofile.io/d/zKTsWr - list of videos taken within 100m of an LE or gov't building, all-time
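A rough sketch of the kind of filtering described above: drop rows with zeroed coordinates, then keep those within 100m of a point of interest. The column names (id, lat, lon) and the sample rows are placeholders, not the real layout of these CSVs; check the actual headers before using it. The target point is the approximate location of the US Capitol, used only as an illustration.

```python
import csv
import io
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/long points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def rows_near(rows, target_lat, target_lon, radius_m=100.0):
    """Yield rows with non-zero coordinates that lie within radius_m of the target."""
    for row in rows:
        lat, lon = float(row["lat"]), float(row["lon"])
        if lat == 0.0 and lon == 0.0:
            continue  # a 0,0 pair is treated as "no location recorded"
        if haversine_m(lat, lon, target_lat, target_lon) <= radius_m:
            yield row


# Placeholder rows for demonstration only; not taken from the real data.
SAMPLE_CSV = """id,lat,lon
sample-a,38.8899,-77.0091
sample-b,38.8977,-77.0365
sample-c,0,0
"""

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
    nearby = list(rows_near(reader, 38.8899, -77.0091))
    print([row["id"] for row in nearby])
```

To run it against one of the real files, replace the io.StringIO sample with open("file.csv", newline="") and adjust the column names to match.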
(7) Script to extract images/videos from WARCs:
https://gist.github.com/redd-dedd/9a200a9ba789f312faf53b25ac63e024
(8) Needs to be sorted.
http://donk.sh/06d639b2-0252-4b1e-883b-f275eff7e792/
https://web.archive.org/web/timemap/?url=https%3A%2F%2Fimage-cdn.parler.com%2F&matchType=prefix&collapse=urlkey&output=json&fl=original%2Cuniqcount&filter=!statuscode%3A%5B45%5D
https://irc.gammaspectra.live/eaa6fa678444b5f4/videos.txt
https://gist.github.com/kylemcdonald/8fdabd6526924012c1f5afe538d7dc09
https://github.com/acanthias13/legendary-octo-guacamole - backup of Clean CSVs
(9) Maps, both interactive and static heatmaps
kylemcdonald.net/parler/map/
https://fortress.maptive.com/ver4/a3486a6ab9a9a12aa9a9cb067839079c/410491
https://darthnithin.github.io/earth/index.html
===================================
Videos From DC Area, Jan 6th. Estimated to be only about 10% of what was available at this moment.
https://www.youtube.com/channel/UCZk6IiAVk2QwOdljEAYCPLw
https://mega.nz/file/Pkk2VSRT#x-Gnl1-FddGwHumBXAGsCJ2FL1VHE-Y-u2SFW48KpeQ
Some notable video IDs; list open to public contributions
https://docs.google.com/spreadsheets/d/1ThPUH5HgTcVKCoyfr2oJ21AWKTGq-dR-cRZjPOER-Q0/edit#gid=0
===================================
HOW TO VIEW WARC/ZSTD from ArchiveTeam's Parler scrape
# How to View Parler Archive "megawarc.warc.zst" files.
These files use the standard zstd and WARC formats.
They are being uploaded to: https://archive.org/details/archiveteam_neparlepas
$ tar -I zstd -xvf archive.tar.zst
===Old.
1. Install Python 3.7
2. Execute: pip install zstandard==0.10.2
3. Download archive from here: https://archive.org/details/archiveteam_neparlepas?tab=collection
4. Copy this script into a new file called xtract.py: https://hastebin.com/bugedubaxi.py
5. Execute: python ./xtract.py /path/to/parler_blahblah.megawarc.warc.zst > dict
6. Execute: zstd -d /path/to/parler_blahblah.megawarc.warc.zst -D dict
7. Import the decompressed parler_blahblah.megawarc.warc file into this tool: https://github.com/webrecorder/webrecorder-desktop
If you cannot install Python 3.7 for some reason, a dockerfile is available at:
https://gist.github.com/shoghicp/6ce05806ffc805929667ec2d4c62aba2
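For context on steps 5 and 6, here is a stdlib-only sketch of how a dictionary could be pulled from the front of a .warc.zst file. It assumes the dictionary is stored in a zstd skippable frame at the very start of the file; that is an assumption about the file layout, and this is not a copy of the xtract.py script linked above. The function names are made up for illustration.

```python
import io
import struct

# Zstandard format constants (little-endian magic numbers).
SKIPPABLE_MAGIC_MIN = 0x184D2A50
SKIPPABLE_MAGIC_MAX = 0x184D2A5F
ZSTD_FRAME_MAGIC = 0xFD2FB528


def read_leading_skippable_frame(stream):
    """Return the payload of a skippable frame at the start of stream, or None."""
    header = stream.read(8)
    if len(header) < 8:
        return None
    magic, size = struct.unpack("<II", header)
    if not SKIPPABLE_MAGIC_MIN <= magic <= SKIPPABLE_MAGIC_MAX:
        return None  # file starts with something else, e.g. a normal zstd frame
    payload = stream.read(size)
    if len(payload) != size:
        raise ValueError("truncated skippable frame")
    return payload


def payload_is_zstd_compressed(payload):
    """True if the payload itself starts with a regular zstd frame magic."""
    return len(payload) >= 4 and struct.unpack("<I", payload[:4])[0] == ZSTD_FRAME_MAGIC


if __name__ == "__main__":
    # Synthetic bytes standing in for the start of a real file.
    fake = struct.pack("<II", 0x184D2A5D, 4) + b"dict" + b"rest of file"
    payload = read_leading_skippable_frame(io.BytesIO(fake))
    print(payload, payload_is_zstd_compressed(payload))
```

On a real file, pass open(path, "rb") instead of the synthetic bytes. If the payload turns out to be zstd-compressed, it would need to be decompressed (for example with the zstandard package from step 2) before being handed to zstd -D as in step 6.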