@CrookedNumber
Created December 17, 2009 22:08
Python script to bulk-validate a Drupal site
#!/usr/bin/env python
# Python 2 script: HEAD each /node/<nid> page, then ask the W3C validator about it.
import urllib
import time
import httplib

base = 'www.edc.org'  # NO trailing slash
lim = 200  # highest nid in your {node} table


def xhtml_is_valid(url):
    """Return True if the W3C validator reports the page at url as valid."""
    time.sleep(.75)  # seconds; pause between validator requests
    conn = httplib.HTTPConnection("validator.w3.org")
    req = "/check?" + urllib.urlencode([('uri', url), ('verbose', 1)])
    conn.request("HEAD", req)
    res = conn.getresponse()
    status = res.getheader('x-w3c-validator-status')
    conn.close()
    return status == 'Valid'


for nid in range(1, lim + 1):
    path = '/node/' + str(nid)
    # Only validate nodes that actually exist (200) or redirect to an alias (301).
    conn = httplib.HTTPConnection(base)
    conn.request("HEAD", path)
    res = conn.getresponse()
    conn.close()
    if res.status == 200 or res.status == 301:
        if xhtml_is_valid('http://' + base + path):
            print path + ' is OK'
        else:
            print 'issue with ' + path
    else:
        print 'No 200 or 301 status for ' + path
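
The script above targets Python 2 (`httplib`, `print` statements). As a rough sketch only, the same flow might look like this on Python 3, with the pure logic split out so it can be checked without touching the network. The names `node_paths`, `validator_request_path`, `head`, `classify` and `check_site` are hypothetical helpers introduced here, not part of the original gist, and the sketch does not verify that the validator still answers plain-HTTP `HEAD` requests with the `x-w3c-validator-status` header.

```python
# Python 3 sketch of the same check. All helper names are illustrative
# assumptions; only the host, header name and messages come from the script.
import http.client
import time
import urllib.parse

VALIDATOR_HOST = "validator.w3.org"  # same host the original script queries


def node_paths(limit):
    """Return '/node/1' through '/node/<limit>' in order."""
    return ["/node/%d" % nid for nid in range(1, limit + 1)]


def validator_request_path(url):
    """Build the validator's /check path, URL-encoding the page address."""
    return "/check?" + urllib.parse.urlencode([("uri", url), ("verbose", 1)])


def head(host, path):
    """Send a HEAD request; return (HTTP status, validator status header or None)."""
    conn = http.client.HTTPConnection(host, timeout=30)
    try:
        conn.request("HEAD", path)
        res = conn.getresponse()
        return res.status, res.getheader("x-w3c-validator-status")
    finally:
        conn.close()


def classify(path, page_status, validator_status):
    """Map the two responses to the messages the original script prints."""
    if page_status not in (200, 301):
        return "No 200 or 301 status for " + path
    if validator_status == "Valid":
        return path + " is OK"
    return "issue with " + path


def check_site(base, limit, delay=0.75):
    """Check every node path on `base` and print one line per node."""
    for path in node_paths(limit):
        page_status, _ = head(base, path)
        validator_status = None
        if page_status in (200, 301):
            time.sleep(delay)  # pause between validator requests
            page_url = "http://" + base + path
            _, validator_status = head(VALIDATOR_HOST, validator_request_path(page_url))
        print(classify(path, page_status, validator_status))
```

Calling `check_site('www.edc.org', 200)` would mirror the original settings for `base` and `lim`. Nothing runs on import, so the network calls only happen when `check_site` is invoked explicitly.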