@pistocop
Last active February 11, 2021 17:25
subreddit-text-downloader installer
# Init
$ git clone https://github.com/pistocop/subreddit-comments-dl.git
$ cd subreddit-comments-dl
$ pip install -r requirements.txt
# Download the AskReddit comments of the last 30 submissions
$ python src/subreddit_downloader.py AskReddit --batch-size 10 --laps 3 --reddit-id <reddit_id> --reddit-secret <reddit_secret> --reddit-username <reddit_username>
# Download the News comments posted after 1 January 2021 (UTC)
$ python src/subreddit_downloader.py News --batch-size 512 --laps 3 --reddit-id <reddit_id> --reddit-secret <reddit_secret> --reddit-username <reddit_username> --utc-after 1609459200
# Build the dataset and check the results under `./dataset/` path
$ python src/dataset_builder.py
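# The `--utc-after` value used above is a Unix timestamp in seconds:
# 1609459200 is 1 January 2021 00:00:00 UTC. Below is a minimal sketch for
# computing the value for another date. It assumes GNU `date` (the `-d` option
# is not available in BSD/macOS `date`), and the helper name `utc_epoch` is
# hypothetical, not part of the repository.

```shell
# Hypothetical helper: print the Unix timestamp (seconds) for a date
# interpreted as UTC. Requires GNU date.
utc_epoch() {
  date -u -d "$1" +%s
}

# Timestamp for midnight UTC on 1 January 2021
utc_epoch "2021-01-01 00:00:00"   # prints 1609459200
```

# The printed number can be passed directly as the `--utc-after` argument.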