Created January 10, 2013 11:11
Small Python (2.7) script that checks .csv files full of URLs for their current HTTP status code (for example, to verify that you fixed the issues Google Webmaster Tools is reporting).
from __future__ import print_function

import argparse
import csv
import requests
import sys

_EPILOG = """
Script takes a list of .csv files, tries to guess their format (separator),
then checks for a field called 'URL', tries to fetch that url and prints
the response code back out (with the history of codes attached if there were
redirects)."""


def find_url(row):
    # Accept the common spellings of the URL column name.
    if 'URL' in row:
        return row['URL']
    if 'url' in row:
        return row['url']
    if 'uri' in row:
        return row['uri']


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Checks urls from .csv files for their HTTP response codes', epilog=_EPILOG)
    parser.add_argument('files', type=str, metavar='CSVFILE', nargs='+')
    args = parser.parse_args()
    for file in args.files:
        # 'rU' is the Python 2 universal-newline read mode.
        with open(file, 'rU') as csvfile:
            # Guess the delimiter from the first 1024 bytes, then rewind.
            dialect = csv.Sniffer().sniff(csvfile.read(1024))
            csvfile.seek(0)
            print('"URL";"STATUSCODE";"HISTORY"')
            for row in csv.DictReader(csvfile, dialect=dialect):
                r = requests.get(find_url(row))
                print('"{}";{};"{}"'.format(find_url(row), r.status_code, [h.status_code for h in r.history]))
Requirements
requests (http://docs.python-requests.org/en/latest/)
INPUT
Just an example; any file that Python's csv module can read and that contains a 'URL' column (or 'url' / 'uri') is fine.
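As a rough illustration of what "any file that python csv can read" means here, the sketch below sniffs the delimiter of a small in-memory sample and pulls out the URL column without making any HTTP requests. The sample rows, the `Detected` column, and the `read_urls` helper are hypothetical and not part of the script. The sketch targets Python 3, and it restricts the sniffer to a few candidate delimiters, which the original script does not do.

```python
import csv
import io

# Hypothetical sample input: semicolon-separated, with a 'URL' column.
SAMPLE = (
    'URL;Detected\n'
    'http://example.com/old-page;2013-01-08\n'
    'http://example.com/missing;2013-01-09\n'
)


def read_urls(text):
    """Sniff the CSV dialect of `text` and return the values of its URL column."""
    # Assumption: limiting the candidates makes sniffing more predictable
    # on tiny samples; the script itself calls sniff() without this argument.
    dialect = csv.Sniffer().sniff(text[:1024], delimiters=';,\t')
    reader = csv.DictReader(io.StringIO(text), dialect=dialect)
    urls = []
    for row in reader:
        # Same column-name fallbacks as find_url() in the script.
        for key in ('URL', 'url', 'uri'):
            if key in row:
                urls.append(row[key])
                break
    return urls


urls = read_urls(SAMPLE)
print(urls)
```

A comma- or tab-separated file with the same header should be handled the same way, since the delimiter is guessed from the data rather than hard-coded.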
OUTPUT
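The script prints one semicolon-separated line per URL under a fixed header. The sketch below reproduces only that formatting step so the shape of the output is visible without running any requests. The `format_result` helper, the URLs, and the status codes are made up for illustration and are not real results.

```python
def format_result(url, status_code, history):
    """Format one output line the way the script's print() call does."""
    return '"{}";{};"{}"'.format(url, status_code, list(history))


# Header line, as printed by the script before the rows of each file.
print('"URL";"STATUSCODE";"HISTORY"')

# Hypothetical values: one URL that redirected once, one that was not found.
redirected = format_result('http://example.com/old-page', 200, [301])
missing = format_result('http://example.com/missing', 404, [])
print(redirected)
print(missing)
```

The HISTORY column is the Python list of status codes from any redirects that preceded the final response, so it is `[]` when the URL answered directly.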