@groteck
Last active January 19, 2021 03:31
Parse CSV file for commas or semicolons
# With headers: true, CSV.parse returns a CSV::Table object.
# csv_file and csv_headers (the first line of csv_file) are defined in the script below.
p CSV.parse(csv_file,
            headers: true,
            col_sep: ColSepSniffer.find(csv_headers)).map(&:to_hash)
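#!/usr/bin/env ruby
# encoding: utf-8
require "csv"

# Guesses whether a CSV string is separated by commas or semicolons.
class ColSepSniffer
  NoColumnSeparatorFound = Class.new(StandardError)
  EmptyString = Class.new(StandardError)

  # String#count treats its argument as a set of characters, so '","'
  # counts every double quote and comma, and '";"' every double quote
  # and semicolon.
  COMMON_DELIMITERS = [
    '","',
    '";"'
  ].freeze

  def initialize(string)
    @string = string
  end

  def self.find(string)
    new(string).find
  end

  # Returns "," or ";", whichever character set occurs most often.
  def find
    # The original check (`unless @string`) only caught nil, not "".
    fail EmptyString if @string.nil? || @string.empty?

    if valid?
      # delimiters[0] is the top [delimiter, count] pair; [0][1] picks
      # the middle character of e.g. '";"', which is the separator.
      delimiters[0][0][1]
    else
      fail NoColumnSeparatorFound
    end
  end

  private

  # True when the counts for all candidates do not sum to zero.
  def valid?
    !delimiters.collect(&:last).reduce(:+).zero?
  end

  # [[delimiter, count], ...] sorted by count, highest first.
  def delimiters
    @delimiters ||= COMMON_DELIMITERS.inject({}, &count).sort(&most_found)
  end

  def most_found
    ->(a, b) { b[1] <=> a[1] }
  end

  def count
    ->(hash, delimiter) { hash[delimiter] = @string.count(delimiter); hash }
  end
end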

csv_file = "ID;Name;Country\n1;Perico;España\n2;Björn;Germany\n"
csv_headers = csv_file.lines.first

# With headers: true, CSV.parse returns a CSV::Table object;
# map(&:to_h) turns each CSV::Row into a plain Hash.
p CSV.parse(csv_file,
            headers: true,
            col_sep: ColSepSniffer.find(csv_headers)).map(&:to_h)
# => [{"ID"=>"1", "Name"=>"Perico", "Country"=>"España"},
#     {"ID"=>"2", "Name"=>"Björn", "Country"=>"Germany"}]
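
The sniffer above depends on String#count treating its argument as a set of characters rather than as a substring, which is easy to misread. Below is a minimal sketch of the same idea using single-character candidates. The helper name `sniff_col_sep` and its `candidates` parameter are hypothetical and not part of the gist, and `Array#to_h` with a block assumes Ruby 2.6 or newer.

```ruby
# Hypothetical helper, not part of the original gist: returns the candidate
# character that occurs most often in the given line, or nil if none occurs.
def sniff_col_sep(line, candidates = [",", ";"])
  # Build { separator => number of occurrences in line }.
  counts = candidates.to_h { |sep| [sep, line.count(sep)] }
  sep, hits = counts.max_by { |_, n| n }
  hits.zero? ? nil : sep
end

# String#count counts characters from a set, not substring occurrences:
# two double quotes plus one comma.
p 'a","b'.count('","')                 # => 3

p sniff_col_sep("ID;Name;Country\n")   # => ";"
p sniff_col_sep("no separators here")  # => nil
```

Returning nil instead of raising is a design choice of this sketch; the gist's class raises NoColumnSeparatorFound in that case. On a tie, the first candidate in the list wins.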
@askareija

I got the error "unterminated string meets end of file".

@groteck
Author

groteck commented Aug 7, 2020

This script is 5 years old; I'm not sure it works with newer Ruby versions.