Last active December 17, 2015 10:48
Tab-separated Value Reader
# Reads a file in fixed-size chunks and yields one line at a time.
module TSV
  class LineReader
    attr_reader :filepath

    def initialize(filepath)
      @filepath = filepath
    end

    def each
      buffer = +"" # unary plus gives a mutable String
      File.open(filepath) do |f|
        while (chunk = f.read(4096))
          buffer << chunk
          # Limit -1 keeps a trailing "" when the buffer ends on a newline,
          # so pop always returns the incomplete remainder, never a full line.
          lines = buffer.split(/[\r\n]+/, -1)
          buffer = lines.pop
          lines.each do |line|
            yield line unless line.empty?
          end
        end
        yield buffer unless buffer.empty?
      end
    end
  end
end
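# Variant of TSV::LineReader that passes a reusable buffer to IO#read,
# so each chunk is read into the same String instead of a new one.
module TSV
  class LineReader
    attr_reader :filepath

    def initialize(filepath)
      @filepath = filepath
    end

    def each
      buffer = +"" # unary plus gives a mutable String
      File.open(filepath) do |f|
        chunk = +""
        while (chunk = f.read(4096, chunk))
          buffer << chunk
          # Limit -1 keeps a trailing "" when the buffer ends on a newline,
          # so pop always returns the incomplete remainder, never a full line.
          lines = buffer.split(/[\r\n]+/, -1)
          buffer = lines.pop
          lines.each do |line|
            yield line unless line.empty?
          end
        end
        yield buffer unless buffer.empty?
      end
    end
  end
end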
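The carry-over buffer used by the chunked reader is easiest to follow with a very small chunk size. The following is a minimal standalone sketch, not the gist's own code: `each_chunked_line` and its `chunk_size` parameter are hypothetical names, and it reads from a `StringIO` rather than a file. It assumes a `split` limit of -1, so that a chunk ending exactly on a newline leaves an empty remainder instead of carrying a complete line into the next chunk.

```ruby
require "stringio"

# Hypothetical helper: yields each line of +io+, reading +chunk_size+ bytes at a time.
def each_chunked_line(io, chunk_size = 4096)
  return enum_for(:each_chunked_line, io, chunk_size) unless block_given?

  buffer = +""
  while (chunk = io.read(chunk_size))
    buffer << chunk
    # The last element is whatever follows the final newline seen so far
    # (possibly ""); it is carried over and completed by the next chunk.
    lines = buffer.split(/[\r\n]+/, -1)
    buffer = lines.pop
    lines.each { |line| yield line unless line.empty? }
  end
  # Flush a final line that had no trailing newline.
  yield buffer unless buffer.empty?
end

# A 5-byte chunk size forces every line to straddle a chunk boundary.
io = StringIO.new("id\tname\n1\tada\n2\tgrace\n")
lines = each_chunked_line(io, 5).to_a
```

With the 5-byte chunks above, `lines` should contain the three input lines intact, which is the property the carry-over buffer exists to guarantee.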
# Reads a tab-separated file whose first line holds the column headers
# and yields each following row as a Hash keyed by header.
class StrictTsv
  include Enumerable

  attr_reader :filepath

  def initialize(filepath)
    @filepath = filepath
  end

  def each
    return enum_for(:each) unless block_given?

    headers = nil
    row_enumerable.each do |line|
      unless headers
        headers = line.strip.split("\t")
        next
      end
      next if line.empty?

      # Limit -1 keeps trailing empty fields, so a row ending in a tab
      # still gets a value ("" rather than nil) for every header.
      yield headers.zip(line.split("\t", -1)).to_h
    end
  end

  private

  def row_enumerable
    TSV::LineReader.new(filepath)
  end
end
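The header-to-row mapping can be tried without touching the filesystem. This is a small sketch under stated assumptions: `tsv_rows` is a hypothetical standalone helper (not part of the gist) that accepts any enumerable of lines, and the sample data is invented for illustration. It uses a `split` limit of -1, which is an assumption about the desired handling of trailing empty columns.

```ruby
# Hypothetical helper: turns an enumerable of TSV lines into header-keyed hashes.
def tsv_rows(lines)
  return enum_for(:tsv_rows, lines) unless block_given?

  headers = nil
  lines.each do |line|
    if headers.nil?
      # The first line names the columns.
      headers = line.strip.split("\t")
      next
    end
    next if line.empty?

    # Limit -1 keeps trailing empty fields instead of dropping them.
    yield headers.zip(line.split("\t", -1)).to_h
  end
end

sample = ["id\tname\temail", "1\tada\t", "", "2\tgrace\tg@example.com"]
rows = tsv_rows(sample).to_a
```

Here the first data row ends in a tab, so its `"email"` value is an empty string; without the -1 limit, `split` would drop that field and `zip` would fill it with `nil`.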