@arjkb
Last active April 8, 2022 14:28
Script to count how many XKCD posts exist.
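import requests


def check(lower, upper):
    """Bisect for the highest comic ID that responds with a 2xx status.

    Assumes `lower` exists, `upper` does not, and IDs in between are contiguous.
    """
    mid = (lower + upper) // 2
    if lower == mid or upper == mid:
        # base case: the range can no longer be split
        return mid

    url_to_test = 'https://xkcd.com/' + str(mid) + '/'
    print('checking ' + url_to_test, end='')
    # HEAD is enough: only the status code matters, not the page body
    status_code = requests.head(url_to_test).status_code
    print(' ...got ' + str(status_code))

    # 2xx: the latest comic is at mid or later; otherwise it is before mid
    return check(mid, upper) if 200 <= status_code < 300 else check(lower, mid)


def main():
    xkcd_count = check(0, 10000)
    print('Count: ' + str(xkcd_count))


if __name__ == '__main__':
    main()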
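
The recursion above is a bisection over comic IDs: `lower` is always treated as existing and `upper` as missing. Below is a minimal iterative sketch of the same idea, with the network check swapped for a caller-supplied predicate so it can be tried offline. The names `last_existing` and `exists` are illustrative and not part of the gist.

```python
from typing import Callable


def last_existing(exists: Callable[[int], bool], lower: int = 0, upper: int = 10000) -> int:
    """Return the largest ID for which exists(ID) is true.

    Assumes IDs are contiguous: everything up to the answer exists and
    everything after it does not. `lower` is treated as existing and
    `upper` as missing; neither bound is checked itself.
    """
    while upper - lower > 1:
        mid = (lower + upper) // 2
        if exists(mid):
            lower = mid  # mid exists, so the answer is mid or later
        else:
            upper = mid  # mid is missing, so the answer is before mid
    return lower
```

Passing a predicate that wraps `requests.head(...)` should reproduce the script's behaviour; a stand-in such as `lambda n: n <= 2603` makes the search easy to check by hand.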
@kevincox
kevincox commented Apr 7, 2022

I love seeing clever approaches.

...however this one is a bit overthought.

curl https://xkcd.com | sed -nE 's_.* property="og:url" content="https://xkcd.com/([0-9]+)/.*_\1_p'

The homepage exposes the "canonical" URL, including the comic ID, both in the "Permanent link to this comic" link and in the og:url meta tag. Other options are fetching the "back" link and adding one, or grabbing the newest item from the RSS feed.
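
For anyone who would rather stay in Python, here is a rough equivalent of the sed expression. It is only a sketch: it assumes the homepage still carries a `property="og:url"` meta tag in the same attribute order the sed pattern expects, and the names `OG_URL_RE` and `latest_comic_id` are made up for illustration.

```python
import re
from typing import Optional

# Same idea as the sed pattern: capture the digits inside the og:url content.
# Assumes `property` comes directly before `content`, as in the sed expression.
OG_URL_RE = re.compile(r'property="og:url"\s+content="https://xkcd\.com/(\d+)/')


def latest_comic_id(homepage_html: str) -> Optional[int]:
    """Return the comic ID found in the og:url meta tag, or None if absent."""
    match = OG_URL_RE.search(homepage_html)
    return int(match.group(1)) if match else None
```

Feeding it the homepage body, for instance `latest_comic_id(requests.get('https://xkcd.com/').text)`, takes a single request instead of a dozen or so HEAD probes.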

@arjkb
Author

arjkb commented Apr 8, 2022

That's pretty cool; guess I should learn my tools better. Thanks for the tip!
