Created
June 2, 2021 00:49
Parse each line of `key=value` pairs into an array of separate `key=value` pairs in Ruby
# frozen_string_literal: true

# Parse each line of key=value pairs into an array of separate key=value pairs.
# Some values will be in double quotes and contain whitespace,
# and some will not. All values with whitespace in them will be double quoted.
# No keys will be double quoted.

# annotated breakdown and explanation derived from
# https://stackoverflow.com/a/23613257, originally written by user
# https://stackoverflow.com/users/2195474/jbr in 2014

# changes:
# - use post-Ruby 2.3 "frozen string" conventions & idioms
# - passes default rubocop scans

example_str = 'timestamp="Wed Jun 19 09:35:36 PDT 2019" message="test 10" id=10 field_1="test 10" field_6="test 1"'

def key_value_parser(str)
  # assume a default state of "unquoted"
  state = :unquoted

  # initialize an accumulator: an array holding the string currently being built
  result = ['']

  # walk the characters in the string, one by one
  # (each_char is used because passing a block to String#chars is deprecated)
  str.each_char do |char|
    case state
    # while the state is unquoted, whitespace separates one pair from the next
    when :unquoted
      case char
      when /\s/
        # whitespace has been encountered, initialize a new empty string
        # to begin appending characters to
        result << ''
      # if we encounter an opening double quote, change state to quoted
      # without appending it. We don't care about preserving the double
      # quote character itself.
      when '"'
        state = :quoted
      else
        # append the character to the end of the last value in our accumulator.
        # string literals are frozen, so build a new string instead of using <<
        result[-1] = "#{result[-1]}#{char}"
      end
    # while state is quoted, all whitespace is treated as literal and appended
    # to the end of the last element in the array like any other character.
    when :quoted
      case char
      # a closing double quote has been encountered and our current value is
      # complete. initialize a new empty string to append the next iteration to
      # and reset state.
      when '"'
        result << ''
        state = :unquoted
      else
        # append the character to the end of the last value in our accumulator.
        result[-1] = "#{result[-1]}#{char}"
      end
    end
  end

  # ensure that the empty strings initialized whenever a delimiter
  # was encountered are removed before the resulting array is returned.
  result.reject(&:empty?)
end

puts key_value_parser(example_str).inspect
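
If the end goal is a lookup structure rather than a list of strings, the returned pairs can be split on their first `=`. The following is a minimal sketch, not part of the original gist: `pairs_to_hash` is a hypothetical helper name, and the `pairs` array is assumed to be what `key_value_parser` returns for the example line (quotes stripped, one `key=value` string per element).

```ruby
# frozen_string_literal: true

# Hypothetical helper: turn ["key=value", ...] into { "key" => "value", ... }.
# Splitting with a limit of 2 keeps any "=" inside the value intact.
# Assumes every element contains at least one "="; Array#to_h raises otherwise.
def pairs_to_hash(pairs)
  pairs.to_h { |pair| pair.split('=', 2) }
end

# Assumed output of key_value_parser(example_str)
pairs = [
  'timestamp=Wed Jun 19 09:35:36 PDT 2019',
  'message=test 10',
  'id=10',
  'field_1=test 10',
  'field_6=test 1'
]

hash = pairs_to_hash(pairs)
puts hash['timestamp'] # prints: Wed Jun 19 09:35:36 PDT 2019
puts hash['id']        # prints: 10
```

All values stay strings here (`'10'`, not `10`); converting types is left to the caller. `Array#to_h` with a block needs Ruby 2.6 or newer.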
When an interviewer asks you to parse a bunch of `key=value` log lines into something else and you frantically google "ruby parse key value pairs whitespace", I hope that this helps you out.
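
For comparison, the same kind of line can also be handled with one regular expression and `String#scan`. This is an alternative sketch, not the state-machine approach of the gist: `PAIR_PATTERN` and `key_value_hash` are hypothetical names, and it assumes keys contain no whitespace or `=` and quoted values never contain an escaped double quote.

```ruby
# frozen_string_literal: true

# Hypothetical pattern: a key, "=", then either a double-quoted value
# (capture 2) or a run of non-whitespace characters (capture 3).
PAIR_PATTERN = /([^\s=]+)=(?:"([^"]*)"|(\S*))/.freeze

# Hypothetical helper: parse one line of key=value pairs into a Hash.
def key_value_hash(str)
  # scan yields [key, quoted, bare] per match; the unused capture is nil
  str.scan(PAIR_PATTERN).to_h { |key, quoted, bare| [key, quoted || bare] }
end

line = 'timestamp="Wed Jun 19 09:35:36 PDT 2019" message="test 10" id=10'
parsed = key_value_hash(line)
puts parsed['message'] # prints: test 10
puts parsed['id']      # prints: 10
```

The regular expression is shorter, but the character-by-character version is easier to extend (for example, to support escaped quotes) and easier to explain step by step.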