@mmmries
Last active December 17, 2015 10:48
Tab-separated Value Reader
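module TSV
  # Streams a file line by line, reading it in fixed-size chunks.
  class LineReader
    attr_reader :filepath

    def initialize(filepath)
      @filepath = filepath
    end

    def each
      buffer = ""
      # File.open rather than Kernel#open, which treats a leading "|" as a command.
      File.open(filepath) do |f|
        while chunk = f.read(4096)
          buffer << chunk
          # Limit -1 keeps a trailing "", so a chunk that ends on a newline
          # does not leave its last complete line behind in the buffer.
          lines = buffer.split(/[\r\n]+/, -1)
          buffer = lines.pop
          lines.each do |line|
            yield line unless line.empty?
          end
        end
        yield buffer unless buffer.empty?
      end
    end
  end
end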
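# Variant of TSV::LineReader that reuses one chunk string between reads.
module TSV
  class LineReader
    attr_reader :filepath

    def initialize(filepath)
      @filepath = filepath
    end

    def each
      buffer = ""
      # File.open rather than Kernel#open, which treats a leading "|" as a command.
      File.open(filepath) do |f|
        chunk = ""
        # Passing chunk as the outbuf argument lets IO#read refill the same string.
        while chunk = f.read(4096, chunk)
          buffer << chunk
          # Limit -1 keeps a trailing "", so a chunk that ends on a newline
          # does not leave its last complete line behind in the buffer.
          lines = buffer.split(/[\r\n]+/, -1)
          buffer = lines.pop
          lines.each do |line|
            yield line unless line.empty?
          end
        end
        yield buffer unless buffer.empty?
      end
    end
  end
end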
# Yields each data row as a Hash keyed by the header row.
class StrictTsv
  include Enumerable
  attr_reader :filepath

  def initialize(filepath)
    @filepath = filepath
  end

  def each
    headers = nil
    row_enumerable.each do |line|
      # The first line is the header row.
      unless headers
        headers = line.strip.split("\t")
        next
      end
      next if line.empty?
      # Limit -1 keeps trailing empty fields as "" instead of dropping them.
      fields = Hash[headers.zip(line.split("\t", -1))]
      yield fields
    end
  end

  private

  def row_enumerable
    TSV::LineReader.new(filepath)
  end
end
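
A minimal, self-contained sketch of how chunked line reading and header mapping fit together. It runs on a StringIO instead of a file so it needs no fixture. The helper names `each_chunked_line` and `each_tsv_row`, the tiny chunk size of 8 bytes, and the sample data are assumptions made for illustration and are not part of the gist. The small chunk size forces lines and the `\r\n` pair to straddle chunk boundaries.

```ruby
require "stringio"

# Hypothetical helper: yields non-empty lines from an IO read in fixed-size chunks.
def each_chunked_line(io, chunk_size)
  return enum_for(:each_chunked_line, io, chunk_size) unless block_given?

  buffer = +""
  while (chunk = io.read(chunk_size))
    buffer << chunk
    # Limit -1 keeps a trailing "", so only a truly unfinished tail is carried over.
    lines = buffer.split(/[\r\n]+/, -1)
    buffer = lines.pop
    lines.each { |line| yield line unless line.empty? }
  end
  yield buffer unless buffer.empty?
end

# Hypothetical helper: treats the first line as headers, yields one Hash per later line.
def each_tsv_row(io, chunk_size = 8)
  return enum_for(:each_tsv_row, io, chunk_size) unless block_given?

  headers = nil
  each_chunked_line(io, chunk_size) do |line|
    if headers
      # Limit -1 keeps a trailing empty field as "".
      yield headers.zip(line.split("\t", -1)).to_h
    else
      headers = line.strip.split("\t")
    end
  end
end

data = StringIO.new("name\tcity\r\nAda\tLondon\nLin\t\n")
rows = each_tsv_row(data).to_a
rows.each { |row| p row }
```

Returning an enumerator when no block is given is a design choice of this sketch. It lets callers chain `to_a`, `first`, or `lazy` without loading the whole input first.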