Created April 5, 2009 15:30
Deobfuscate Hulu ajax responses in Python
"""
I'm a python port of Hulu's javascript for deobfuscating certain ajax responses.
As of 2009-April-05, I've been tested on Python 2.5.2.

A usage example that relies on BeautifulSoup (http://bit.ly/3Ks24u):

    import urllib
    from BeautifulSoup import BeautifulSoup
    url = "http://www.hulu.com/channels/Drama/?kind=videos&sort=rating&type=tv"
    text = urllib.urlopen(url).read()
    soup = BeautifulSoup(_dobfu(text))
    for inc in soup(attrs={"class":"show-title-container"}):
        print "".join([inc.a.contents[0],inc.a.span.contents[0]])
"""
import math, re, urllib

# srzf is bitwise shift right (zero fill)
# It shifts n right by b bits and is equiv to >>> in js
def srzf(n, b):
    return (n & 0xFFFFFFFFL) >> b
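
# tol returns a list of numeric longs; maybe tol means "to long"
# Each long packs four consecutive characters of s, lowest byte first
def tol(s):
    # int() is needed because math.ceil returns a float, which range() may reject
    return map(lambda i: ord(s[i*4]) + (ord(s[i*4+1]) << 8) + (ord(s[i*4+2]) << 16) + (ord(s[i*4+3]) << 24), range(0, int(math.ceil(len(s) / 4 - 1))))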

# _dobfu deobfuscates ajax responses for various pages from Hulu
# And it returns a single string of html, js, css
def _dobfu(text):
    return re.sub('\\0+$','',"".join(map(lambda i: (lambda i = 0xfeedface ^ i: chr(i & 0xFF) + chr(srzf(i,8) & 0xFF) + chr(srzf(i,16) & 0xFF) + chr(srzf(i,24) & 0xFF)) (),tol(urllib.unquote(text[7:]))))) if text[0:7] == "dobfu__" else text
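
For readers who find the one-liner in `_dobfu` hard to follow, here is a minimal sketch of the same idea: strip the `dobfu__` prefix, URL-unquote, XOR each little-endian 32-bit word with `0xfeedface`, then drop trailing NUL bytes. It is not the original port. It assumes Python 3, works on `bytes` via `struct`, and the names `xor_words` and `deobfuscate` are hypothetical. Unlike `tol` above, which stops one group early, this sketch transforms every complete 4-byte group, and it silently drops 1 to 3 leftover bytes.

```python
import struct
from urllib.parse import unquote_to_bytes

KEY = 0xFEEDFACE    # XOR key taken from _dobfu above
PREFIX = "dobfu__"  # marker that _dobfu checks for


def xor_words(data):
    """XOR every complete little-endian 32-bit word of data with KEY.

    Assumption: leftover bytes (len(data) % 4) are dropped.
    """
    n = len(data) // 4
    words = struct.unpack("<%dI" % n, data[:n * 4])
    return struct.pack("<%dI" % n, *(w ^ KEY for w in words))


def deobfuscate(text):
    """Return text unchanged unless it carries the dobfu__ prefix."""
    if not text.startswith(PREFIX):
        return text
    raw = unquote_to_bytes(text[len(PREFIX):])
    # latin-1 maps each byte to the code point of the same value,
    # which mirrors the chr()/ord() handling in the Python 2 port.
    return xor_words(raw).rstrip(b"\x00").decode("latin-1")
```

Because XOR with a fixed key is its own inverse, `xor_words` can also build test input: NUL-pad a payload to a multiple of four bytes, pass it through `xor_words`, percent-encode it, and prepend `dobfu__`.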