@NEJmark
Last active November 30, 2017 09:16
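# -*- coding: utf-8 -*-
# Fetch a financial report page from mops.twse.com.tw and print selected rows.
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib import urlopen  # Python 2
import lxml.etree as etree

url = 'http://mops.twse.com.tw/server-java/t164sb01?step=1&CO_ID=2330&SYEAR=2013&SSEASON=1&REPORT_ID=C'
response = urlopen(url)
html = response.read()
# Decode the response as cp950 (Microsoft's Big5 variant) before parsing.
page = etree.HTML(html.decode('cp950'))
# Row labels to print: total basic earnings per share, total operating revenue.
item_to_print = [u'基本每股盈餘合計', u'營業收入合計']
for tr in page.xpath('(.//tr[@class="even"]|.//tr[@class="odd"])'):
    # Skip rows that have no data cells or whose label cell is empty.
    if len(tr) > 1 and tr[0].text:
        row_name = tr[0].text.strip()
        if row_name in item_to_print:
            # td.text is None for empty cells, so fall back to an empty string.
            values = [(td.text or u'').strip() for td in tr[1:]]
            print(u'{0}: {1}'.format(row_name, values))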