Last active August 29, 2015 14:11
Prepare a tab-delimited flat file for SQL import when text fields contain newlines that cause records to spill onto the next line
# Assumes a tab-delimited flat file whose first field is a numeric key
BEGIN {
    FS = OFS = "\t";
    p = "";
}
# Strategy: read a line and check for a key in the first field. A keyed line
# starts a new record, so print the stored record and store the new line.
# An unkeyed line is a fragment, so append it to the stored record.
{
    gsub(/\r/, "");  # also scrub those pesky carriage returns (every one, not only the first)
    if ($1 ~ /^[0-9]+$/) {
        if (p != "") {
            sub(/^[ \t]+/, "", p);  # portable spelling; \s is a gawk extension
            print p;
        }
        p = $0;
    } else {
        p = p " " $0;
    }
}
END {
    # Flush the last stored record; skip the blank line an empty input would print
    if (p != "") {
        sub(/^[ \t]+/, "", p);
        print p;
    }
}
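A quick way to see the rejoining in action is to feed the logic a small broken file. The sketch below is only an illustration: the sample rows are made up, and the AWK program is inlined in a portable form (`gsub` for carriage returns, `[ \t]` instead of `\s`) rather than read from a script file, since the gist does not name one. Record 2 is split across three physical lines with Windows line endings.

```shell
#!/bin/sh
# Hypothetical sample data: three records, the second broken by embedded newlines.
out=$(printf '1\tfirst\tok\r\n2\tsecond line\r\ncontinues here\r\nand here\tdone\r\n3\tthird\tok\r\n' |
awk '
BEGIN { FS = OFS = "\t"; p = "" }
{
    gsub(/\r/, "")                      # drop carriage returns
    if ($1 ~ /^[0-9]+$/) {              # numeric key: a new record starts here
        if (p != "") {
            sub(/^[ \t]+/, "", p)
            print p                     # emit the previous, now complete, record
        }
        p = $0
    } else {
        p = p " " $0                    # fragment: glue onto the stored record
    }
}
END { if (p != "") { sub(/^[ \t]+/, "", p); print p } }
')

# Each record should now sit on a single line.
printf '%s\n' "$out"
```

Note the limitation this makes visible: a fragment whose first tab-separated field happens to be all digits would be mistaken for the start of a new record.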