Convert a list of German last names from HTML to plain text
# Convert the extensive list of German last names published by the
# Digitales Familiennamenwörterbuch Deutschlands (DFD) from HTML to plain text.
# The DFD is available at https://www.namenforschung.net
#
# Depends on the htmlparser Go package:
# https://github.com/htmlparser/htmlparser
# The single-page view listed 46035 names as of May 2021.
curl -s 'https://www.namenforschung.net/dfd/woerterbuch/gesamtliste-veroeffentlichter-namenartikel/' |
htmlparser '#maincontent > ul:nth-child(even) > li > a text{}' > dfd.txt
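The extracted text may contain stray whitespace, empty lines, or repeated entries, depending on how the page is marked up. A minimal sketch of a clean-up step follows. The function name `clean_names` and the output file name `dfd_clean.txt` are hypothetical and not part of the original script; only standard `sed`, `grep`, and `sort` are assumed.

```shell
# Hypothetical helper, not part of the original gist.
# Trims leading/trailing whitespace, drops empty lines, and removes
# duplicate entries from the file given as the first argument.
clean_names() {
  sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$1" |
    grep -v '^$' |
    LC_ALL=C sort -u   # byte-wise sort, so results do not depend on the locale
}
```

Assuming `dfd.txt` was produced by the pipeline above, `clean_names dfd.txt > dfd_clean.txt` followed by `wc -l < dfd_clean.txt` gives a line count that can be compared against the number of names stated on the page. Note that `sort -u` reorders the list byte-wise, so umlauts sort after plain ASCII letters.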