@robhoare
Last active August 29, 2015 14:12
Turn saved data from Scotlandspeople Valuation Rolls into tsv
grep -h '<td>' *.html | sed -e 's/<\/td>//g' -e 's/&nbsp;//g' -e 's/<td>/\t/g' -e 's/^[ \t]*//'

Run this in a directory of "printer friendly" HTML files saved from Scotlandspeople. The output is a single TSV combining the rows from all the files. For the valuation rolls, the columns are: Number (the output line number on the report page, so not useful), Year, Status (of ownership/occupation), Title, Surname, Forename, Place, Parish, County/Burgh, Reference.
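
If you want a header row and no number column, the pipeline can be wrapped in a small function. This is only a sketch: the function name `rolls_to_tsv` and the output filename are made up for illustration, and it assumes each table row sits on a single line that starts with `<td>` (optionally indented), which is what the one-liner already relies on. It builds a literal tab with `printf` so it does not depend on GNU sed's `\t` escape.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original gist): convert saved
# "printer friendly" valuation roll pages in directory $1 to TSV.

# Literal tab character, so sed does not need to understand '\t'.
TAB=$(printf '\t')

rolls_to_tsv() {
    # Header row for the columns listed above, minus the unhelpful number column.
    printf 'Year\tStatus\tTitle\tSurname\tForename\tPlace\tParish\tCounty/Burgh\tReference\n'

    # Same steps as the one-liner: keep lines containing <td>, drop </td> and
    # &nbsp;, turn each <td> into a tab, strip leading whitespace.
    # cut -f2- then discards the first (number) column.
    grep -h '<td>' "$1"/*.html |
        sed -e 's/<\/td>//g' \
            -e 's/&nbsp;//g' \
            -e "s/<td>/${TAB}/g" \
            -e "s/^[ ${TAB}]*//" |
        cut -f2-
}

# Example use, from the directory holding the saved pages:
#   rolls_to_tsv . > valuation_rolls.tsv
```

Empty cells (for example a missing Title) come through as empty fields, so the columns stay aligned when the file is opened in a spreadsheet.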