@g3d
Last active August 29, 2015 14:17
xlsx parsing in ruby
XLSX file, 1.5 MB, 10,000 rows
==================================
Benchmark.measure { ::Creek::Book.new(import.import_file.read.path).sheets.first.rows.first }
<Benchmark::Tms:0x0000010eb1b200
@cstime=0.0,
@cutime=0.0,
@label="",
@real=83.360489,
@stime=0.9699999999999998,
@total=78.46,
@utime=77.49>
Benchmark.measure { ::Saxlsx::Workbook.open(import.import_file.read.path) {|w| w.sheets.first.rows.first } }
<Benchmark::Tms:0x00000101176608
@cstime=0.0,
@cutime=0.0,
@label="",
@real=0.551811,
@stime=0.07000000000000028,
@total=0.529999999999994,
@utime=0.45999999999999375>
Benchmark.measure { ::SimpleXlsxReader.open(import.import_file.read.path).sheets.first.rows.first }
<Benchmark::Tms:0x00000130a20360
@cstime=0.0,
@cutime=0.0,
@label="",
@real=51.066787,
@stime=1.4100000000000001,
@total=49.68000000000001,
@utime=48.27000000000001>
Benchmark.measure { ::Oxcelix::Workbook.new(import.import_file.read.path, :copymerge => false) }
NoMethodError: undefined method `name=' for #<Matrix:0x0000014bea92b8>
from /Users/g3d/.rvm/gems/ruby-2.1.1@gemsetname/gems/oxcelix-0.4.0/lib/oxcelix/workbook.rb:277:in `block in matrixto'
Benchmark.measure { ::Roo::Excelx.new(import.import_file.read.path).sheets.first }
<Benchmark::Tms:0x000001052c94c0
@cstime=0.0,
@cutime=0.0,
@label="",
@real=68.535196,
@stime=0.9299999999999997,
@total=66.37,
@utime=65.44>
Benchmark.measure { ::Spreadsheet.open(import.import_file.read.path).worksheet(0).row(1) } # xlsx is not supported
Ole::Storage::FormatError: OLE2 signature is invalid
from /Users/g3d/.rvm/gems/ruby-2.1.1@isis/gems/ruby-ole-1.2.11.7/lib/ole/storage/base.rb:378:in `validate!'
Benchmark.measure { ::RubyXL::Parser.parse(import.import_file.read.path).worksheets[0].sheet_data[0] }
<Benchmark::Tms:0x0000012a0c9f10
@cstime=0.0,
@cutime=0.0,
@label="",
@real=33.429753,
@stime=0.96,
@total=32.519999999999996,
@utime=31.559999999999995>
Benchmark.measure { ::Dullard::Workbook.new(import.import_file.read.path).sheets[0].rows[0] }
<Benchmark::Tms:0x000001136fd600
@cstime=0.0,
@cutime=0.0,
@label="",
@real=0.744242,
@stime=0.03999999999999915,
@total=0.6699999999999999,
@utime=0.6300000000000008>
=================================
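The one-off measurements above could also be collected in a single run. This is only a rough sketch: `compare_parsers` is a hypothetical helper, not something from the gems tested, and the lambdas are stand-in workloads so it runs without any gems installed. In practice each lambda would wrap one of the parser calls above.

```ruby
require 'benchmark'

# Hypothetical helper (not part of any gem above): times each candidate
# once and returns [label, real_seconds] pairs, fastest first.
# A candidate that raises is recorded with nil instead of aborting the run.
def compare_parsers(candidates)
  results = candidates.map do |label, block|
    begin
      [label, Benchmark.realtime { block.call }]
    rescue StandardError => e
      warn "#{label}: #{e.class}: #{e.message}"
      [label, nil]
    end
  end
  # Failed candidates (nil) sort last.
  results.sort_by { |_, secs| secs || Float::INFINITY }
end

# Stand-in workloads (assumptions for illustration only).
candidates = {
  'fast'   => -> { 1_000.times { |i| i * 2 } },
  'slow'   => -> { sleep 0.05 },
  'broken' => -> { raise NoMethodError, 'simulated parser failure' }
}

compare_parsers(candidates).each do |label, secs|
  puts format('%-8s %s', label, secs ? format('%.3f s', secs) : 'failed')
end
```

Rescuing per candidate keeps one failing parser (as with Oxcelix and Spreadsheet above) from stopping the rest of the comparison.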
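Conclusion: Dullard and Saxlsx are by far the fastest (0.74 s and 0.55 s real time, against 33 to 83 s for the other parsers that completed), but:
- Saxlsx returns a Saxlsx::RowsCollection for rows
- Dullard returns an Enumerator for rows (and the string data in those rows is not correct)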
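Since rows may come back as an Enumerator rather than an Array, a short sketch of how to consume one may help. The `rows` object here is a stand-in built with plain Ruby (hypothetical data, not read from a file); it assumes only that the real row object is Enumerable, as the `.first` calls in the benchmarks suggest.

```ruby
# Stand-in for a parser's row stream (hypothetical data).
rows = Enumerator.new do |yielder|
  yielder << ['id', 'name']
  yielder << [1, 'alice']
  yielder << [2, 'bob']
end

header = rows.first     # pulls only the first row
head   = rows.first(2)  # first two rows as an Array
all    = rows.to_a      # materialises every row; costly on large sheets

# Enumerator does not define #[], so index access needs to_a (or first) instead.
supports_index = rows.respond_to?(:[])

puts header.inspect
puts head.length
puts all.length
puts supports_index
```

Using `first` or `first(n)` keeps the streaming benefit; calling `to_a` on a 10,000-row sheet gives up most of it.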