How To Scrape Images from 4chan Using Wget

This guide is here to save other sorry plebs from needing to RTFM to figure out how to use wget to scrape images from 4chan and other imageboards. There are lots of dedicated image downloaders out there, but they are often outdated or broken. You will save time by following this guide and learning a powerful, general-purpose tool instead.

What Is Wget?

Wget is a command-line file downloader that can handle just about any download task that casual or power users will ever need. Versions are available for Windows, Mac, and Linux. If it is not already installed on your machine, install it now.
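
A quick way to check whether wget is already installed is to ask the shell where it lives. The helper below is only a sketch: check_tool is a made-up name, not part of wget, and it relies on nothing beyond the standard `command -v` builtin.

```shell
# Hypothetical helper: report whether a program is on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

check_tool wget
```

If it reports wget as missing, install it through your system's package manager before continuing.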

Basic Syntax

wget [options] [urls]

Useful Options for Image Scraping

Wget has tons more options, but these are the most useful ones for this guide.

  • -r downloads recursively, following the links found in documents that have already been downloaded. This is essential because the common case is a single URL (the thread page) that links to all of the image files.
  • -l [n] sets the maximum recursion depth. For image scraping, n will practically always be 1.
  • -H allows downloads from hosts other than the one in the original URL. This is useful because many sites serve images from a different domain.
  • -D [domains] limits which hosts are followed. domains is a comma-separated list of domain names. You will probably have to 'View Source' in your browser to know for sure what to put here.
  • -P [prefix directory] sets the directory where downloaded files are saved. The default is the current directory.
  • -nd (no directories) saves every file directly in the target directory instead of recreating the remote directory hierarchy.
  • -A [extensions] is a comma-separated list of the file extensions to keep.
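
To make the effect of -P and -nd concrete, here is a small sketch of where a recursively downloaded file ends up on disk. It is an illustration of the naming only, not wget's own code: saved_path is a made-up helper, the image URL is hypothetical, and details such as query strings are ignored.

```shell
# Illustration only: where a recursively downloaded URL lands on disk.
# Usage: saved_path PREFIX URL [nd]
saved_path() {
    prefix=$1
    url=$2
    no_dirs=${3:-}

    rest=${url#*://}                  # strip the scheme, e.g. "https://"
    if [ "$no_dirs" = nd ]; then
        echo "$prefix/${rest##*/}"    # with -nd: only the file name is kept
    else
        echo "$prefix/$rest"          # without -nd: host/path hierarchy is recreated
    fi
}

saved_path pictures https://i.4cdn.org/g/123.jpg      # pictures/i.4cdn.org/g/123.jpg
saved_path pictures https://i.4cdn.org/g/123.jpg nd   # pictures/123.jpg
```

For a scrape you almost always want the second form, with every image sitting directly in one folder.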

Putting It Together

To download images from 4chan:

wget -P pictures -nd -r -l 1 -H -D i.4cdn.org -A png,gif,jpg,jpeg,webm [thread-url]

To download images from 8chan:

wget -P pictures -nd -r -l 1 -H -D media.8ch.net -A png,gif,jpg,jpeg,webm [thread-url]
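
The two commands above differ only in the media host, so they can be wrapped in one shell function. This is a minimal sketch, not part of wget: scrape_thread and the DRY_RUN variable are made-up names, and the thread URL in the usage line is a placeholder. With DRY_RUN=1 the function only prints the command it would run, which is handy for checking the options before downloading anything.

```shell
# Hypothetical wrapper around the commands above.
# Usage: scrape_thread MEDIA_HOST THREAD_URL [DEST_DIR]
# Set DRY_RUN=1 to print the wget command instead of running it.
scrape_thread() {
    media_host=$1
    thread_url=$2
    dest_dir=${3:-pictures}

    if [ -z "$media_host" ] || [ -z "$thread_url" ]; then
        echo "usage: scrape_thread MEDIA_HOST THREAD_URL [DEST_DIR]" >&2
        return 2
    fi

    # Same options as in the guide; only host, URL, and directory vary.
    set -- wget -P "$dest_dir" -nd -r -l 1 -H -D "$media_host" \
        -A png,gif,jpg,jpeg,webm "$thread_url"

    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, `DRY_RUN=1 scrape_thread i.4cdn.org [thread-url]` prints the 4chan command from this section with your thread URL filled in; drop DRY_RUN to actually run it.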

ghost commented Sep 16, 2018

Thank you very much! This works like a charm. ;)

CodeAsm commented Oct 20, 2018

Thanks :D

jonvonbasslake commented Dec 19, 2018

I'm a noob, and as such I can't seem to get this to work with 7chan, even though I'm putting in https://7chan.org/d/src/, where the images appear to be hosted AFAICT. So what am I doing wrong? I used the exact same command and changed the URL to the 7chan.org/d/src one... If someone can help me, that'd be great...

ryankrage77 commented Aug 1, 2019

i.4cdn.org will give you low-res thumbnails.
For the full-res image, replace it with is2.4chan.org
