@bchauSW
Last active December 4, 2020 14:58
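This script uses the PRAW library to fetch the titles of the top posts on https://www.reddit.com/r/r4r30Plus, parses the age, the tag (for example [M4F]), and the description out of each title, and writes the results to three CSV files: ages.csv, sexes.csv, and descriptions.csv.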
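import csv
import re
from collections import Counter

import praw

# Authenticate as a script application; replace the placeholder credentials.
reddit = praw.Reddit(client_id='clientid',
                     client_secret='clientsecret',
                     password='password',
                     user_agent='Script by me',
                     username='username')

# Collect the titles of the top submissions.
titles = []
for submission in reddit.subreddit('R4R30PLUS').top(limit=1000):
    titles.append(submission.title)

# Expected title shape: an age, then a tag such as [M4F], then free text.
# A raw string keeps \d and \[ from being treated as string escapes.
title_pattern = re.compile(r'(\d+).*(\[[MFARTmfart]+4[MFARTmfart]+\])(.*)')

ages = []
sexes = []
descriptions = []
for title in titles:
    m = title_pattern.search(title)
    if m:
        ages.append(int(m.group(1)))
        sexes.append(m.group(2).upper())
        descriptions.append(m.group(3))

# One "age,count" row per distinct age, most common first.
with open('ages.csv', 'w', encoding='utf-8') as f:
    for age, count in Counter(ages).most_common():
        f.write(f'{age},{count}\n')

# One "tag,count" row per distinct tag (already upper-cased above).
with open('sexes.csv', 'w', encoding='utf-8') as f:
    for tag, count in Counter(sexes).most_common():
        f.write(f'{tag},{count}\n')

# One description per row; csv.writer quotes descriptions that contain commas.
with open('descriptions.csv', 'w', encoding='utf-8', newline='') as f:
    writer = csv.writer(f, lineterminator='\n')
    for desc in descriptions:
        writer.writerow([desc])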
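
The title-parsing step can be tried on its own, without Reddit credentials. The sketch below applies the same regular expression to a few sample titles. The titles are invented for illustration, and `parse_title` is a hypothetical helper that is not part of the original script. It also strips surrounding whitespace from the description, which the script does not do.

```python
import re
from collections import Counter

# Same pattern as the script, written as a raw string.
TITLE_PATTERN = re.compile(r'(\d+).*(\[[MFARTmfart]+4[MFARTmfart]+\])(.*)')


def parse_title(title):
    """Return (age, tag, description), or None if the title does not match."""
    m = TITLE_PATTERN.search(title)
    if m is None:
        return None
    # strip() is an addition for readability; the script keeps the raw text.
    return int(m.group(1)), m.group(2).upper(), m.group(3).strip()


# Hypothetical titles, made up for this example only.
sample_titles = [
    "34 [M4F] Seattle - looking for a hiking partner",
    "41 [f4m] Online, coffee and conversation",
    "Weekly discussion thread",
]

# Titles that do not match the pattern are skipped, as in the script.
parsed = [p for p in map(parse_title, sample_titles) if p is not None]
age_counts = Counter(age for age, _, _ in parsed)

for age, tag, description in parsed:
    print(age, tag, description)
```

Because `.*` is greedy, a title containing more than one bracketed tag yields the last one, and the first run of digits in the title is taken as the age even if it is not an age.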
cept0n commented Dec 4, 2020

Pog
