@pauldub
Last active August 29, 2015 14:10
Simple Factor spider
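
The words below come from several vocabularies, so the file needs a header before it will load. The following is a sketch only, assuming a Factor release from around the time this gist was written (where `exists?` lives in `io.files` and `download` in `http.client`); the vocabulary name `eden-spider` is hypothetical.

```factor
! Assumed vocabulary layout for a 2014-era Factor release.
USING: accessors assocs calendar formatting http.client
io.directories io.files kernel regexp sequences spider ;
IN: eden-spider ! hypothetical vocabulary name
```

Newer releases have renamed or moved some of these words, so the list may need adjusting.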

Private API

Returns the URL of a season's tag page on the Audiodramax blog.

: season-url ( season -- url ) 
    "http://www.audiodramax.com/tag/eden-saison-%d" sprintf ;
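
As a quick check of the format string, the word can be tried in the listener. This is a minimal sketch; the definition is repeated only so that the snippet runs on its own, and season `2` is an arbitrary example value.

```factor
USING: formatting io kernel ;
IN: scratchpad

! Copied from the definition above so the snippet is self-contained.
: season-url ( season -- url )
    "http://www.audiodramax.com/tag/eden-saison-%d" sprintf ;

2 season-url print
! prints http://www.audiodramax.com/tag/eden-saison-2
```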

Filters episodes from a list of URLs. At the moment we are only interested in .mp3 files.

: filter-episode-urls ( urls -- urls' )
    [ path>> R/ \.mp3/ re-contains? ] filter ;
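
To see what this kind of filter keeps, the same predicate can be run over a few hand-written URLs. This is only a sketch: the `example.com` URLs are made up, and the dot in the pattern is escaped here so that it matches a literal `.mp3` rather than any character followed by `mp3`.

```factor
USING: accessors kernel prettyprint regexp sequences urls ;

{
    URL" http://example.com/audio/episode-01.mp3"
    URL" http://example.com/tag/eden-saison-1"
    URL" http://example.com/tag/best-mp3-players"
} [ path>> R/ \.mp3/ re-contains? ] filter
[ path>> ] map .
! prints { "/audio/episode-01.mp3" }
```

With an unescaped dot, the third URL would also pass, because `-mp3` satisfies "any character followed by mp3".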

Provides a Spider that will find all pages for a given season.

: <eden-season-spider> ( season -- spider )
    season-url <spider>
        t >>follow-robots?
        5 >>max-depth
        t >>quiet?
        1.5 seconds >>sleep
        4 >>#threads
        { [ path>> R/ .*page.\d+/ matches? ] } >>filters ;
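
Each filter is a quotation applied to a candidate URL, and `matches?` tests the whole string rather than a substring, so the shape of the pattern decides which pages are followed. The sketch below uses hypothetical paths (the real layout of the blog's pagination is an assumption) to show how a single-digit pattern differs from one that accepts several digits.

```factor
USING: prettyprint regexp ;

! One digit after "page" plus a separator: matches page 2.
"/tag/eden-saison-1/page/2" R/ .*page.\d/ matches? .    ! t

! The same pattern rejects page 12, since the whole string must match.
"/tag/eden-saison-1/page/12" R/ .*page.\d/ matches? .   ! f

! Allowing one or more digits accepts it.
"/tag/eden-saison-1/page/12" R/ .*page.\d+/ matches? .  ! t
```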

Public API

Downloads an Eden season to the specified directory: it runs the spider to find all the pages related to the season, collects the episode links from those pages, and downloads every file that is not already present. A cleaner interface would probably leave the make-directories and with-directory parts to the caller.

: eden-download-season ( season dest -- )
    dup make-directories
    [ <eden-season-spider> run-spider spidered>> values
      [ links>> filter-episode-urls ] map concat
      [ dup download-name exists? [ drop ] [ download ] if ] each
    ] with-directory ;
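
The cleaner interface mentioned above could look roughly like the sketch below, where directory handling is left to the caller. The names `eden-episode-urls` and `download-missing` are hypothetical, the code relies on `<eden-season-spider>` and `filter-episode-urls` from this file, and it assumes `download` has the stack effect `( url -- )` as in the original code.

```factor
USING: accessors assocs http.client io.directories io.files
kernel sequences spider ;

! Hypothetical word: collect every episode URL found for a season.
: eden-episode-urls ( season -- urls )
    <eden-season-spider> run-spider spidered>> values
    [ links>> filter-episode-urls ] map concat ;

! Hypothetical word: download only the files not already on disk.
: download-missing ( urls -- )
    [ dup download-name exists? [ drop ] [ download ] if ] each ;

! The caller now decides where the files go, for example:
! "eden/saison-1" dup make-directories
! [ 1 eden-episode-urls download-missing ] with-directory
```

Splitting the word this way also makes it possible to inspect the list of URLs before downloading anything.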