@iinm
Created June 13, 2021 08:21
List countries
#!/usr/bin/env python3
# coding: utf-8
# Usage:
# python countries.py | jq -R 'split("\t") | {code: .[4], name: .[0], nameEn: .[1], location: .[5]}' | jq -s .
import sys
import urllib.request
import lxml.etree
import lxml.html
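# Fetch and parse the Japanese Wikipedia page for ISO 3166-1.
root = lxml.html.parse(urllib.request.urlopen("https://ja.wikipedia.org/wiki/ISO_3166-1"))
# Select every table row that follows the header row containing '国・地域名' (country/region name).
elements = root.xpath("//tr[contains(*, '国・地域名')]/following-sibling::tr")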
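for el in elements:
    columns = el.xpath("td")
    # Rows with fewer than 7 cells are not country entries; report them on stderr and skip.
    if len(columns) < 7:
        print('warning: Skip row',
              [lxml.etree.tostring(c, encoding="utf-8").decode()
               for c in columns],
              file=sys.stderr)
        continue
    # Emit one tab-separated line per country, using the stripped text of each cell.
    texts = [c.text_content().strip() for c in columns]
    print("\t".join(texts))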
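
The jq pipeline in the usage comment picks four fields out of each tab-separated row and collects them into a JSON array. For readers without jq, the same mapping can be sketched in Python. The function names `row_to_record` and `rows_to_json` are hypothetical, and the sample row is only an illustration of the column layout implied by the jq filter (name, English name, numeric code, alpha-3, alpha-2, location, subdivision code), not captured script output.

```python
import json


def row_to_record(line):
    """Map one tab-separated row to the fields chosen by the jq filter."""
    cols = line.rstrip("\n").split("\t")
    return {
        "code": cols[4],      # .[4]
        "name": cols[0],      # .[0]
        "nameEn": cols[1],    # .[1]
        "location": cols[5],  # .[5]
    }


def rows_to_json(lines):
    """Collect all non-empty rows into one JSON array, like `jq -s .`."""
    records = [row_to_record(line) for line in lines if line.strip()]
    # ensure_ascii=False keeps Japanese names readable in the output.
    return json.dumps(records, ensure_ascii=False, indent=2)


# Illustrative row only; the real values come from the Wikipedia table.
sample_row = "日本\tJapan\t392\tJPN\tJP\t東アジア\tISO 3166-2:JP"
print(rows_to_json([sample_row]))
```

To use it on real data, the script's output could be passed to `rows_to_json` as an iterable of lines (for example `sys.stdin`) instead of the sample row.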