@raypereda
Created November 12, 2019 16:36
escapes double-quotes in the last field of a CSV line
line = '1,2,3,4,5,"He said "thanks" to the cashier"'
p line
# "1,2,3,4,5,\"He said \"thanks\" to the cashier\""
parts = line.split(",", 6)
p parts
# ["1", "2", "3", "4", "5", "\"He said \"thanks\" to the cashier\""]
text = parts[5]
p text
# "\"He said \"thanks\" to the cashier\""
text = text[1...-1]
p text
# "He said \"thanks\" to the cashier"
text.gsub!('"', '""')
p text
# "He said \"\"thanks\"\" to the cashier"
parts[5] = '"' + text + '"' # re-add the outer double-quotes to the last field only
p parts
# ["1", "2", "3", "4", "5", "\"He said \"\"thanks\"\" to the cashier\""]
line = parts.join(",")
p line
# "1,2,3,4,5,\"He said \"\"thanks\"\" to the cashier\""
# Escapes double-quotes in the last field of a CSV line.
# Assumes only the last field is a string; all others are non-string values.
# n is the number of fields.
def fix(line, n)
  parts = line.split(",", n)
  text = parts[-1]
  text = text[1...-1]          # strip outer double-quotes
  text = text.gsub('"', '""')  # escape embedded double-quotes
  parts[-1] = '"' + text + '"' # re-add the outer double-quotes to the last field only
  parts.join(",")
end
line = '1,2,3,4,5,"He said "thanks" to the cashier"'
p fix(line, 6)
# "1,2,3,4,5,\"He said \"\"thanks\"\" to the cashier\""
@raypereda
Author

In general, fixing CSV files involves too much guesswork. This might work in the special case where the only string field is the last one.
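One way to gain some confidence in that special case is to feed the repaired line back through Ruby's standard `csv` library and check that it parses into the expected number of fields. The sketch below is only an illustration: `escape_last_field` is a hypothetical helper name, and it assumes the last of `n` fields is the only quoted string and that no earlier field contains a comma.

```ruby
require "csv"

# Hypothetical helper (not part of the gist): escapes embedded double-quotes
# in the last of n fields and re-quotes that field only.
# Assumption: every field before the last one is free of commas and quotes.
def escape_last_field(line, n)
  parts = line.chomp.split(",", n)
  inner = parts[-1][1...-1]                      # strip the outer double-quotes
  parts[-1] = '"' + inner.gsub('"', '""') + '"'  # double embedded quotes, re-quote
  parts.join(",")
end

broken = '1,2,3,4,5,"He said "thanks" to the cashier"'
fixed  = escape_last_field(broken, 6)

# Round trip: a standard CSV parser should now recover six fields,
# with the original text of the last field intact.
fields = CSV.parse_line(fixed)
p fields.length
p fields.last
```

If the parsed field count differs from `n`, the line probably falls outside the special case and needs a manual look.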

We should explore how the erroneous CSV is created. There may be a flag or switch to turn on double-quote escaping.
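For comparison, if the producer were written in Ruby (an assumption; nothing here says what tool creates the file), the standard `csv` library escapes embedded double-quotes on output without any extra switch. A minimal sketch, with `emit_row` as a hypothetical wrapper name:

```ruby
require "csv"

# Hypothetical wrapper around the standard library's CSV.generate_line.
# Fields containing a double-quote or comma are quoted automatically,
# and embedded double-quotes are doubled.
def emit_row(fields)
  CSV.generate_line(fields)
end

row = [1, 2, 3, 4, 5, 'He said "thanks" to the cashier']
line = emit_row(row)
puts line
# 1,2,3,4,5,"He said ""thanks"" to the cashier"
```

Writing rows through a CSV library at the source avoids the guesswork of repairing lines afterwards.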