Last active
December 15, 2015 17:39
Gist: selfboot/5298362
Python encoding detection example program.
>>> import requests
>>> r = requests.get('http://www.luoo.net/radio/radio2/mp3player.xml')
>>> r.status_code
200
>>> print r.content
锘??xml version="1.0" encoding="UTF-8"?>
<player showDisplay="yes" showPlaylist="yes" autoStart="yes">
<song path="http://ftp.luoo.net/radio/radio2/1.mp3" title="鏃呰€? />
<song path="http://ftp.luoo.net/radio/radio2/2.mp3" title="绱㈤潪浜? />
<song path="http://ftp.luoo.net/radio/radio2/3.mp3" title="涓夊嘲" />
<song path="http://ftp.luoo.net/radio/radio2/4.mp3" title="閬ヤ笉鍙強" />
...
>>> import chardet
>>> chardet.detect(r.content)
{'confidence': 1.0, 'encoding': 'UTF-8'}
>>> print r.content.decode('utf8').encode('gbk')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'gbk' codec can't encode character u'\ufeff' in position 0: illegal multibyte sequence
>>> print r.content.decode('utf-8-sig').encode('gbk')
<?xml version="1.0" encoding="UTF-8"?>
<player showDisplay="yes" showPlaylist="yes" autoStart="yes">
<song path="http://ftp.luoo.net/radio/radio2/1.mp3" title="旅者" />
<song path="http://ftp.luoo.net/radio/radio2/2.mp3" title="索非亚" />
<song path="http://ftp.luoo.net/radio/radio2/3.mp3" title="三峰" />
<song path="http://ftp.luoo.net/radio/radio2/4.mp3" title="遥不可及" />
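The session above is Python 2 and depends on a live URL. As a rough offline sketch of the same BOM issue in Python 3, the snippet below builds a small byte string by hand as a stand-in for r.content. The names raw and to_gbk are hypothetical and do not come from the gist. It assumes the response body is UTF-8 with a leading byte order mark, which is what the traceback above suggests.

```python
import codecs

# Hypothetical stand-in for r.content: UTF-8 bytes that start with a BOM.
raw = codecs.BOM_UTF8 + '<?xml version="1.0" encoding="UTF-8"?>\n<song title="旅者" />'.encode("utf-8")

# Plain 'utf-8' keeps the BOM as the character U+FEFF at position 0.
with_bom = raw.decode("utf-8")

# 'utf-8-sig' drops a leading BOM if one is present.
without_bom = raw.decode("utf-8-sig")


def to_gbk(text):
    """Return text encoded as GBK, or None if a character has no GBK mapping."""
    try:
        return text.encode("gbk")
    except UnicodeEncodeError:
        return None


print(ascii(with_bom[0]))                # the BOM character survives plain utf-8 decoding
print(to_gbk(with_bom) is None)          # GBK cannot represent U+FEFF
print(to_gbk(without_bom) is not None)   # encodes once the BOM is stripped
```

In Python 3 with requests, r.content is still bytes, so the same decode('utf-8-sig') call should apply there. Re-encoding to GBK is only needed when the target console or file really expects GBK.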