Skip to content

Instantly share code, notes, and snippets.

@janklimo
Last active February 22, 2024 08:59
Show Gist options
  • Star 4 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save janklimo/240cf6b4517941f52809343631daff77 to your computer and use it in GitHub Desktop.
Save janklimo/240cf6b4517941f52809343631daff77 to your computer and use it in GitHub Desktop.
Comparison of memory usage: AXLSX vs. rubyXL
# frozen_string_literal: true
require 'axlsx'
require 'rubyXL'
require 'rubyXL/convenience_methods/worksheet'
require 'memory_profiler'
rows = 1_000
columns = 20
report = MemoryProfiler.report do
# axlsx
Axlsx::Package.new do |p|
p.workbook.add_worksheet(name: 'Test') do |sheet|
rows.times do
sheet.add_row ['test data'] * columns
end
end
p.serialize('tmp/test_axlsx.xlsx')
end
# rubyXL
workbook = RubyXL::Workbook.new
worksheet = workbook.worksheets[0]
worksheet.sheet_name = 'Test'
rows.times do |r|
worksheet.insert_row(r)
columns.times do |c|
worksheet.insert_cell(r, c, 'test data')
end
end
workbook.write('tmp/test_rubyxl.xlsx')
end
report.pretty_print
# rows = 100
# columns = 20
#
# retained memory by gem
# -----------------------------------
# 119888 axlsx-3.0.0.pre
# 5384 rubyXL-3.3.30
# 1527 pathname
# 195 rubyzip-1.2.1
# 80 htmlentities-4.3.4
# 40 nokogiri-1.8.4
# 40 other
# 40 singleton
#
# ===================================
#
# rows = 1_000
# columns = 20
#
# retained memory by gem
# -----------------------------------
# 1102664 axlsx-3.0.0.pre
# 5384 rubyXL-3.3.30
# 1527 pathname
# 195 rubyzip-1.2.1
# 80 htmlentities-4.3.4
# 40 nokogiri-1.8.4
# 40 other
# 40 singleton
@janklimo
Copy link
Author

janklimo commented Feb 17, 2024

Rerun with newer versions and using caxlsx.

rows = 100
columns = 20
retained memory by gem
-----------------------------------
     19917  caxlsx-4.0.0
      4880  rubyXL-3.4.25
      1540  lib
        80  htmlentities-4.3.4
        40  nokogiri-1.16.2-arm64-darwin
rows = 1_000
columns = 20
retained memory by gem
-----------------------------------
     81261  caxlsx-4.0.0
      4880  rubyXL-3.4.25
      1540  lib
        80  htmlentities-4.3.4
        40  nokogiri-1.16.2-arm64-darwin

I tried with 100,000 rows but it reached 32GB of RAM used before I stopped the process.

So the memory retained by rubyXL is constant but there is a tradeoff:

100 rows

allocated memory by gem
-----------------------------------
   7535744  rubyXL-3.4.25
   4526648  nokogiri-1.16.2-arm64-darwin
    387563  caxlsx-4.0.0
    194548  lib
    162946  rubyzip-2.3.2
     65264  other
     50346  htmlentities-4.3.4

1,000 rows

allocated memory by gem
-----------------------------------
  70415800  rubyXL-3.4.25
  40968455  nokogiri-1.16.2-arm64-darwin
   2504483  caxlsx-4.0.0
    591026  rubyzip-2.3.2
    389264  other
    194548  lib
     50346  htmlentities-4.3.4

rubyXL allocates significantly more memory than caxlsx. While it doesn't retain it, it will require more RAM to run :/ Time for a new approach.

fast_excel gem

# frozen_string_literal: true

require 'fast_excel'

require 'memory_profiler'

rows = 1_000_000
columns = 20

report = MemoryProfiler.report do
  workbook = FastExcel.open('tmp/test_fast_excel.xlsx', constant_memory: true)
  worksheet = workbook.add_worksheet('Test')

  rows.times do
    worksheet << (['test data'] * columns)
  end
  workbook.close
end

report.pretty_print

1,000 rows

allocated memory by gem
-----------------------------------
    360000  other
      1384  fast_excel-0.5.0
       224  ffi-1.16.3
       208  lib

1,000,000 rows

allocated memory by gem
-----------------------------------
 360000000  other
      1384  fast_excel-0.5.0
       224  ffi-1.16.3
       208  lib

Finished in 26s without missing a beat ⚡ There was no retained memory and ruby's process peaked at 800MB on my machine. Compare that to 32GB of memory used by the other two gems before I terminated the benchmark. Promising 🤞

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment