@eliasdabbas
Created June 1, 2024 20:00
Setting up a daily status code checker
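import datetime

import advertools as adv

# UTC date stamp used to name today's output and log files (datetime.UTC needs Python 3.11+)
today = datetime.datetime.now(datetime.UTC).strftime("%Y_%m_%d")

# Get the URLs to check from the XML sitemap
sitemap = adv.sitemap_to_df("https://example/sitemap.xml")

# Crawl only the response headers of each URL and save them to a date-stamped file
adv.crawl_headers(
    sitemap["loc"],
    f"path/to/status_codes/{today}.jl",
    custom_settings={
        "AUTOTHROTTLE_ENABLED": False,
        "LOG_FILE": f"path/to/status_codes_logs/{today}.jl",
    },
)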
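
Once a daily file exists, a quick way to inspect it is to count how many URLs returned each status code. The following is a minimal sketch using only the standard library. It assumes each line of the `.jl` file is a JSON object with a `status` field; the helper name `count_status_codes` is hypothetical and not part of advertools.

```python
import json
from collections import Counter
from pathlib import Path


def count_status_codes(jl_path):
    """Count how often each status code appears in a JSON Lines crawl file."""
    counts = Counter()
    with Path(jl_path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            # "status" is an assumed field name; rows without it are counted under None
            counts[record.get("status")] += 1
    return counts
```

For example, `count_status_codes("path/to/status_codes/2024_06_01.jl")` would return a `Counter` keyed by status code, so any non-200 entries stand out immediately.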
@eliasdabbas
Author

Add this cron task. You can change @daily to @weekly or @monthly.

crontab -e

# Add this line to the end of the file, making sure to replace /path/python_virtualenv/ with your path

@daily PATH=/path/python_virtualenv/venv/bin/ ; /path/python_virtualenv/venv/bin/python /path/to/daily_status_codes.py
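
With the cron task in place, one date-stamped file accumulates per run, so two runs can be compared to spot URLs whose status changed. This is a rough sketch, assuming each line of a `.jl` file is a JSON object with `url` and `status` fields; `load_statuses` and `changed_statuses` are hypothetical helper names, not advertools functions.

```python
import json
from pathlib import Path


def load_statuses(jl_path):
    """Map each URL to its status code from one JSON Lines crawl file."""
    statuses = {}
    with Path(jl_path).open(encoding="utf-8") as f:
        for line in f:
            if line.strip():
                record = json.loads(line)
                # "url" and "status" are assumed field names
                statuses[record.get("url")] = record.get("status")
    return statuses


def changed_statuses(old_path, new_path):
    """Return {url: (old_status, new_status)} for URLs whose status differs."""
    old, new = load_statuses(old_path), load_statuses(new_path)
    return {
        url: (old.get(url), status)
        for url, status in new.items()
        if old.get(url) != status
    }
```

URLs that appear only in the newer file are reported with `None` as the old status; URLs that were dropped from the sitemap are not reported.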
