#!/bin/bash
# Generate a file per page
for i in {1..448}; do pdftotext -f "${i}" -l "${i}" mueller-report-searchable.pdf "mueller_page_${i}.txt"; done
# Double embedded double quotes so they are escaped inside a quoted CSV field
for i in {1..448}; do perl -pi -e 's/"/""/g' "mueller_page_${i}.txt"; done
# Set CSV header
echo "page","text" > mueller_pages.csv
# Generate CSV rows, including page number and page text enclosed by quotes.
# The command substitution is quoted so the page text is not word-split or glob-expanded;
# line breaks therefore stay inside the quoted field.
for i in {1..448}; do printf '%s,"%s"\n' "${i}" "$(cat "mueller_page_${i}.txt")" >> mueller_pages.csv; done
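The quote-doubling step can be checked on a single string before running it over every page. The following is a minimal sketch, not part of the original script: the `csv_field` helper and the sample text are hypothetical, and it uses bash parameter expansion in place of `perl` so it runs without extra tools.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): wrap one value as a CSV field.
csv_field() {
  local value=$1
  local q='"'
  # Double every embedded double quote, then enclose the result in quotes.
  printf '"%s"' "${value//$q/$q$q}"
}

# Build one sample row from a page number and some made-up text.
row="$(printf '%s,' 7; csv_field 'He said "no comment", then left.')"
echo "$row"
```

Each embedded quote should appear doubled in the printed row, and the comma inside the text should remain within the quoted field, so a CSV parser reads it as one `text` value.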
"Business ID","Business name","Address","City","State","Postal code","Latitude","Longitude","Phone number","Inspection","Inspection score","Score type","Inspection date","Inspection type","Violation description"
"114","GOOD MONG KOK","1039 STOCKTON ST ","San Francisco","CA","94108","37.795594","-122.408204","","Yes","65","Poor","2012-07-13","routine","Unclean or unsanitary food contact surfaces"
"114","GOOD MONG KOK","1039 STOCKTON ST ","San Francisco","CA","94108","37.795594","-122.408204","","Yes","65","Poor","2012-07-13","routine","Unclean or degraded floors walls or ceilings"
"114","GOOD MONG KOK","1039 STOCKTON ST ","San Francisco","CA","94108","37.795594","-122.408204","","Yes","65","Poor","2012-07-13","routine","Unapproved or unmaintained equipment or utensils"
"114","GOOD MONG KOK","1039 STOCKTON ST ","San Francisco","CA","94108","37.795594","-122.408204","","Yes","65","Poor","2012-07-13","routine","Unclean hands or improper use of gloves"
"114","GOOD MONG KOK","1039 STOCKTON ST ","San Francisco","CA"
SET LINESIZE 1000
SET PAGESIZE 9999
SET NUMWIDTH 20
SET TRIMSPOOL ON
SET TRIMOUT ON
SET VERIFY OFF
SET SERVEROUTPUT ON
SET UNDERLINE OFF
SET FEEDBACK OFF
SET HEAD OFF