@cuixin
Created March 14, 2013 07:39
Using lua to parse CSV file to a table.
-- Using Lua to parse a CSV file into a table.
-- Targets Lua 5.1 (relies on module() and table.foreach).
-- Note: the first line must be the header row (field names).
-- The default separator is '|'; pass a different one to load() if needed.
-- Usage: csv = require('csv')
-- tab = csv.load('test.csv', ',')
-- table.foreach(tab[1], print)
-- print(tab[1].your_field)
--encoding=utf-8
local error = error
local setmetatable = setmetatable
local lines = io.lines
local insert = table.insert
local ipairs = ipairs
local string = string
module(...)
-- Collect every match of pattern in str (default pattern: runs of non-whitespace).
string.split = function (str, pattern)
    if pattern == nil or pattern:len() == 0 then pattern = "[^%s]+" end
    local parts = {}
    for part in str:gmatch(pattern) do
        insert(parts, part)
    end
    return parts
end
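-- Map each header name to its column position, e.g. "id|name" -> {id = 1, name = 2}.
local function parse_title(title, sep)
    local desc = title:split("[^" .. sep .. "]+")
    local class_mt = {}
    for k, v in ipairs(desc) do
        class_mt[v] = k
    end
    return class_mt
end

-- Split one data line into fields and attach the shared row metatable.
local function parse_line(mt, line, sep)
    local data = line:split("[^" .. sep .. "]+")
    setmetatable(data, mt)
    return data
end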
-- Read the file at path. The first line is the header; every later line
-- becomes a row that can be indexed by column number or by field name.
function load(path, sep)
    sep = sep or '|'
    local mt, data = nil, {}
    for line in lines(path) do
        if not mt then
            -- Header line: build the name -> column map shared by all rows.
            mt = parse_title(line, sep)
            mt.__index = function(t, k) if mt[k] then return t[mt[k]] else return nil end end
            mt.__newindex = function(t, k, v) error('attempt to write to undeclared variable "' .. k .. '"') end
        else
            insert(data, parse_line(mt, line, sep))
        end
    end
    return data
end
-- Guard the module table against accidental writes to undeclared names.
local class_mt = {
    __newindex = function(t, k, v)
        error('attempt to write to undeclared variable "' .. k .. '"')
    end
}
setmetatable(_M, class_mt)
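
The row lookup that load() sets up can be tried in isolation. The following is a minimal sketch, not part of the gist: the names `header` and `row_mt` and the sample values are assumptions made for illustration, and it uses `rawget` and `tostring` rather than the gist's exact functions. It shows one shared metatable resolving field names to column positions and rejecting writes to unknown fields.

```lua
-- Header names mapped to column positions, as parse_title() would produce
-- for the header line "id|name" (sample data, assumed for illustration).
local header = { id = 1, name = 2 }

-- One metatable shared by every row.
local row_mt = {
    -- Reads of a missing key: translate a field name to its column.
    __index = function(row, key)
        local col = header[key]
        if col then return rawget(row, col) end
        return nil
    end,
    -- Writes to a missing key are rejected.
    __newindex = function(_, key)
        error('attempt to write to undeclared variable "' .. tostring(key) .. '"')
    end,
}

local row = setmetatable({ "7", "Ann" }, row_mt)

print(row[2])      --> Ann  (plain positional access)
print(row.name)    --> Ann  (resolved through __index)
print(row.missing) --> nil  (unknown field)

local ok = pcall(function() row.age = 30 end)
print(ok)          --> false (write rejected by __newindex)
```

Because the rows store only positional values, the name map is kept once in the metatable instead of being copied into every row.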
@cool8jay

The split function has some bugs:

  1. The last element in each line keeps its line-ending character (typically the `\r` of a CRLF file). It is invisible, so the input string should be trimmed first.

  2. It cannot handle an empty string, i.e. an empty cell in the CSV file. The following code fixes this:

    function string.split(str, delimiter)
        if (delimiter=='') then return false end
        local pos,array = 0, {}
        -- for each divider found
        for st,sp in function() return string.find(str, delimiter, pos, true) end do
            table.insert(array, string.sub(str, pos, st - 1))
            pos = sp + 1
        end
        table.insert(array, string.sub(str, pos))
        return array
    end
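
To see both points on a concrete line, here is a small self-contained sketch. The helper names `chomp` and `split_keep_empty` are hypothetical, and the splitter is a local rewrite of the same plain-text `string.find` idea rather than the exact function above. It contrasts the result with the `[^,]+` pattern approach, which drops the empty cell.

```lua
-- Hypothetical helper: remove a trailing carriage return left by CRLF files.
local function chomp(line)
    return (line:gsub("\r$", ""))
end

-- Plain-text split that keeps empty fields.
local function split_keep_empty(str, delimiter)
    assert(delimiter ~= "", "delimiter must not be empty")
    local fields, pos = {}, 1
    while true do
        -- The fourth argument (true) turns off pattern matching.
        local first, last = str:find(delimiter, pos, true)
        if not first then break end
        fields[#fields + 1] = str:sub(pos, first - 1)
        pos = last + 1
    end
    fields[#fields + 1] = str:sub(pos) -- text after the last delimiter
    return fields
end

local fields = split_keep_empty(chomp("a,,c\r"), ",")
print(#fields) --> 3 (the empty middle cell is kept)

-- Pattern-based splitting skips the empty cell, so columns shift.
local dropped = {}
for part in ("a,,c"):gmatch("[^,]+") do
    dropped[#dropped + 1] = part
end
print(#dropped) --> 2
```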
    

Also, parse_line() does not handle commas and quotes inside fields. I use another solution:
http://lua-users.org/wiki/LuaCsv
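
For illustration, quoted fields can be handled by scanning the line one character at a time. This is a minimal sketch and not the code from the linked page: the function name `parse_csv_line` is hypothetical, and it assumes a single-character separator, double quotes as the quoting character, `""` as an escaped quote, and no line breaks inside a field.

```lua
-- Split one CSV line, honoring double-quoted fields.
local function parse_csv_line(line, sep)
    sep = sep or ","
    local fields, buf, in_quotes = {}, {}, false
    local i, n = 1, #line
    while i <= n do
        local c = line:sub(i, i)
        if in_quotes then
            if c == '"' then
                if line:sub(i + 1, i + 1) == '"' then
                    buf[#buf + 1] = '"' -- "" inside quotes is one literal quote
                    i = i + 1
                else
                    in_quotes = false   -- closing quote
                end
            else
                buf[#buf + 1] = c       -- separators are literal inside quotes
            end
        elseif c == '"' then
            in_quotes = true            -- opening quote
        elseif c == sep then
            fields[#fields + 1] = table.concat(buf)
            buf = {}
        else
            buf[#buf + 1] = c
        end
        i = i + 1
    end
    fields[#fields + 1] = table.concat(buf) -- last field, possibly empty
    return fields
end

local row = parse_csv_line('1,"Smith, John","say ""hi""",')
print(#row)   --> 4
print(row[2]) --> Smith, John
print(row[3]) --> say "hi"
```

Empty cells come out as empty strings here as well, so column positions stay aligned with the header.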
