@karlcow
Created August 7, 2012 02:20
Encoding is tough stuff
>>> import cssutils
>>> cssutils.parseUrl("http://m.vk.com/css/s_mb.css?177")
WARNING 'charmap' codec can't decode byte 0x98 in position 810: character maps to <undefined>
>>> import requests
>>> r = requests.get("http://m.vk.com/css/s_mb.css?177")
>>> r.headers
{'content-encoding': 'gzip', 'transfer-encoding': 'chunked', 'expires': 'Tue, 14 Aug 2012 02:14:05 GMT', 'vary': 'Host,Accept-Encoding', 'server': 'nginx/1.2.1', 'connection': 'keep-alive', 'pragma': 'no-cache', 'cache-control': 'max-age=604800', 'date': 'Tue, 07 Aug 2012 02:14:05 GMT', 'x-powered-by': 'PHP/5.3.3-7+squeeze3', 'content-type': 'text/css; charset=windows-1251'}
>>> r.headers['content-type']
'text/css; charset=windows-1251'

karlcow commented Aug 7, 2012

I guess I have to change strategy: always use requests for the HTTP part, let it decode the body, and hand the resulting string to cssutils. That might even make the code simpler.

>>> import requests
>>> import cssutils
>>> r = requests.get("http://m.vk.com/css/s_mb.css?177")
>>> foo = cssutils.parseString(r.text)
WARNING Property: Unknown Property name. [1:297: -moz-box-shadow]
WARNING Property: Unknown Property name. [1:318: -webkit-box-shadow]
WARNING Property: Unknown Property name. [1:395: -webkit-appearance]
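
For reference, here is a rough sketch of the "decode the string" step done by hand, in case the body arrives as raw bytes rather than through `r.text`. The helper names `charset_from_content_type` and `decode_css` are made up for illustration, as are the UTF-8 fallback and the choice of `errors="replace"`. The lenient error handling is there because byte 0x98 appears to have no mapping in Python's windows-1251 table, which would explain the original 'charmap' warning.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset parameter of a Content-Type value (hypothetical helper)."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset, or `default` if absent.
    return msg.get_content_charset(failobj=default)


def decode_css(body, content_type):
    """Decode raw CSS bytes using the charset declared in the header (hypothetical helper)."""
    charset = charset_from_content_type(content_type)
    try:
        # errors="replace" turns undecodable bytes into U+FFFD instead of raising.
        return body.decode(charset, errors="replace")
    except LookupError:
        # Assumption: fall back to UTF-8 when the declared charset is unknown to Python.
        return body.decode("utf-8", errors="replace")
```

With requests, this would be called as `decode_css(r.content, r.headers['content-type'])`, and the result passed to `cssutils.parseString` as above.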
