@janx
Last active January 28, 2023 15:36
Memory leak research on OpenStruct and #define_method.

OpenStruct and #define_method

#define_method creates a closure; used carelessly, it can cause a memory leak, as pointed out here.
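To make the closure point concrete, here is a minimal sketch. The names make_block and unused are made up for illustration. It shows that a block keeps the whole local scope it was created in reachable, even variables it never uses, which is the same kind of block #define_method holds on to.

```ruby
# Hypothetical helper, for illustration only.
def make_block
  unused = "x" * 1_000   # never referenced inside the block below
  proc {}                # the block still closes over the whole local scope
end

blk = make_block

# The block's binding can still see (and therefore keeps alive) `unused`.
captured = blk.binding.local_variable_defined?(:unused)
size     = blk.binding.local_variable_get(:unused).size

puts captured
puts size
```

As long as something references the block (for example a method created by #define_method), the captured variables cannot be collected.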

OpenStruct uses #define_method to create accessors dynamically, so we suspected that it would leak memory.

So I wrote two scripts to prove it. However, the results show the opposite: OpenStruct doesn't leak memory.

open_struct_leak.rb

Here I repeatedly create an OpenStruct from big_hash, a hash with 5000 key-value pairs in which every key and value is a very long string. Running ruby open_struct_leak.rb prints the change in memory stats between consecutive loop iterations:

ruby open_struct_leak.rb 
     TOTAL      46613     T_HASH         -3   T_STRING      16838         GC         32
     TOTAL      31073     T_HASH          2   T_STRING          1         GC         22
     TOTAL       2454     T_HASH          0   T_STRING          0         GC         22
     TOTAL       2045     T_HASH          0   T_STRING          0         GC         22
     TOTAL       1636     T_HASH          0   T_STRING          0         GC         22
     TOTAL       1225     T_HASH          0   T_STRING          0         GC         22
     TOTAL       1227     T_HASH          0   T_STRING          0         GC         22
     TOTAL        818     T_HASH          0   T_STRING          0         GC         22
     TOTAL        818     T_HASH          0   T_STRING          0         GC         22
     TOTAL        818     T_HASH          0   T_STRING          0         GC         22
     TOTAL        409     T_HASH          0   T_STRING          0         GC         22
     TOTAL        409     T_HASH          0   T_STRING          0         GC         22
     TOTAL        409     T_HASH          0   T_STRING          0         GC         22
     TOTAL        409     T_HASH          0   T_STRING          0         GC         22
     TOTAL        409     T_HASH          0   T_STRING          0         GC         22
     TOTAL        409     T_HASH          0   T_STRING          0         GC         22
     TOTAL        409     T_HASH          0   T_STRING          0         GC         22
     TOTAL        409     T_HASH          0   T_STRING          0         GC         22
     TOTAL          0     T_HASH          0   T_STRING          0         GC         21
     TOTAL          0     T_HASH          0   T_STRING          0         GC         21
     TOTAL          0     T_HASH          0   T_STRING          0         GC         21
     TOTAL          0     T_HASH          0   T_STRING          0         GC         21

After several loops the TOTAL, T_HASH, and T_STRING diffs stabilize at 0, so no objects leak. The hash used to initialize each OpenStruct is garbage collected correctly.
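The measurement itself is just a subtraction of two ObjectSpace.count_objects snapshots. A minimal sketch of that idea follows; count_diff is a hypothetical helper, not part of the scripts. Note that :TOTAL counts heap slots rather than live objects, so treat the exact numbers as indicative only.

```ruby
# Hypothetical helper: per-key change between two count_objects snapshots.
def count_diff(before, after, keys = [:TOTAL, :T_HASH, :T_STRING])
  keys.map { |k| [k, after.fetch(k, 0) - before.fetch(k, 0)] }.to_h
end

before = ObjectSpace.count_objects
keep = Array.new(1_000) { |i| "item #{i}" }  # strings we deliberately retain
GC.start                                     # collect whatever is unreachable
after = ObjectSpace.count_objects

diff = count_diff(before, after)
puts diff.inspect
```

The exact values depend on the Ruby version and on what else is on the heap, so only the trend over many loops is meaningful.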

Comment out the lines below marker A and uncomment the line below marker B to see a contrasting result:

ruby open_struct_leak.rb 
     TOTAL       6538     T_HASH         -2   T_STRING      11843         GC         10
     TOTAL      10220     T_HASH          3   T_STRING      10001         GC          9
     TOTAL       9811     T_HASH          1   T_STRING      10000         GC          9
     TOTAL      10219     T_HASH          1   T_STRING      10000         GC          9
     TOTAL       9809     T_HASH          1   T_STRING      10000         GC          9
     TOTAL      10224     T_HASH          1   T_STRING      10000         GC          9
     TOTAL       9809     T_HASH          1   T_STRING      10000         GC          9
     TOTAL      10221     T_HASH          1   T_STRING      10000         GC          9
     TOTAL       9814     T_HASH          1   T_STRING      10000         GC          9
     TOTAL      10218     T_HASH          1   T_STRING      10000         GC          9
     TOTAL       9814     T_HASH          1   T_STRING      10000         GC          9
     TOTAL      10224     T_HASH          1   T_STRING      10000         GC          9
     TOTAL       9816     T_HASH          1   T_STRING      10000         GC          9
     TOTAL       9810     T_HASH          1   T_STRING      10000         GC          9
     TOTAL      10216     T_HASH          1   T_STRING      10000         GC          9

Each loop creates 1 hash and 10000 strings. Because a reference to each newly created hash is always kept, none of them can be GCed.
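The retention pattern in case B can be sketched with tiny hashes instead of big ones. This is only an illustration of the chaining, with made-up keys: every new hash points at the previous one, so the whole chain stays reachable from the newest hash.

```ruby
last = nil
3.times do |i|
  # Each new hash stores a reference to the previous one,
  # so nothing in the chain ever becomes unreachable.
  last = { id: i, last_hash: last }
end

# Walk the chain from the newest hash back to the first one.
depth = 0
node = last
while node
  depth += 1
  node = node[:last_hash]
end

puts depth
```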

Why doesn't #define_method in OpenStruct leak memory? Check its source code:

  def new_ostruct_member(name)
    name = name.to_sym
    unless self.respond_to?(name)
      class << self; self; end.class_eval do
        define_method(name) { @table[name] }
        define_method("#{name}=") { |x| modifiable[name] = x }
      end
    end
    name
  end

#define_method is called on the OpenStruct object's singleton class. Once the object is GCed, its singleton class and singleton methods are GCed too. We can validate this with another script.
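A small sketch of why the method's lifetime is tied to the object: a method defined on one object's singleton class exists only for that object and never lands on the shared class. Widget is a hypothetical class used only for this illustration.

```ruby
class Widget; end  # hypothetical class, for illustration only

a = Widget.new
b = Widget.new

# Define :foo on a's singleton class only.
a.singleton_class.class_eval do
  define_method(:foo) { :from_singleton }
end

a_has    = a.respond_to?(:foo)                       # defined for a
b_has    = b.respond_to?(:foo)                       # not defined for b
on_class = Widget.instance_methods(false).include?(:foo)

puts [a_has, b_has, on_class].inspect
```

Since nothing outside a references its singleton class, the method and whatever its block captured become unreachable together with a.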

define_method_closure.rb

The script includes two test subjects:

DefineMethodOnSingleton defines a new method on each object's singleton class in initialize.

DefineMethodOnBase defines a new method on the base class in initialize.
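The difference between the two subjects can be seen without measuring memory at all. In this reduced sketch, OnBase and OnSingleton are hypothetical stand-ins for the real test classes and the big hash is left out. Methods defined on the base class pile up on the class, while singleton methods leave the class untouched.

```ruby
# Hypothetical stand-in for DefineMethodOnBase (no big hash).
class OnBase
  def initialize
    # Derive a unique name from the number of methods already defined.
    name = "foo#{self.class.instance_methods(false).size}"
    self.class.class_eval { define_method(name) {} }
  end
end

# Hypothetical stand-in for DefineMethodOnSingleton (no big hash).
class OnSingleton
  def initialize
    singleton_class.class_eval { define_method(:foo) {} }
  end
end

3.times do
  OnBase.new
  OnSingleton.new
end

base_count      = OnBase.instance_methods(false).size
singleton_count = OnSingleton.instance_methods(false).size

puts base_count
puts singleton_count
```

Every method that accumulates on the base class keeps its block, and therefore the block's captured scope, alive for as long as the class exists.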

Results:

DefineMethodOnSingleton (no leak; the one T_STRING created in every loop is the method name)
     TOTAL      17168     T_HASH          3   T_STRING       9787         GC          9
     TOTAL       9402     T_HASH          1   T_STRING          2         GC          9
     TOTAL        818     T_HASH          1   T_STRING          1         GC          9
     TOTAL        409     T_HASH          0   T_STRING          1         GC          9
     TOTAL        409     T_HASH          0   T_STRING          1         GC          9
     TOTAL        409     T_HASH          0   T_STRING          1         GC          9
     TOTAL        409     T_HASH          0   T_STRING          1         GC          9
     TOTAL          0     T_HASH          0   T_STRING          1         GC          9
     TOTAL          0     T_HASH          0   T_STRING          1         GC          9
     TOTAL          0     T_HASH          0   T_STRING          1         GC          9
     TOTAL          0     T_HASH          0   T_STRING          1         GC          9
     TOTAL          0     T_HASH          0   T_STRING          1         GC          9
     TOTAL          0     T_HASH          0   T_STRING          1         GC          9
DefineMethodOnBase (leaks one big hash in every loop)
     TOTAL      17169     T_HASH          3   T_STRING       9787         GC          9
     TOTAL       9813     T_HASH          2   T_STRING      10002         GC          9
     TOTAL       9812     T_HASH          2   T_STRING      10001         GC          9
     TOTAL      10219     T_HASH          1   T_STRING      10001         GC          9
     TOTAL       9810     T_HASH          1   T_STRING      10001         GC          9
     TOTAL      10222     T_HASH          1   T_STRING      10001         GC          9
     TOTAL       9810     T_HASH          1   T_STRING      10001         GC          9
     TOTAL       9809     T_HASH          1   T_STRING      10001         GC          9
     TOTAL      10224     T_HASH          1   T_STRING      10001         GC          9
     TOTAL       9813     T_HASH          1   T_STRING      10001         GC          9
     TOTAL      10216     T_HASH          1   T_STRING      10001         GC          9
     TOTAL       9811     T_HASH          1   T_STRING      10001         GC          9
     TOTAL      10216     T_HASH          1   T_STRING      10001         GC          9
     TOTAL       9814     T_HASH          1   T_STRING      10001         GC          9
     TOTAL      10218     T_HASH          1   T_STRING      10001         GC          9
     TOTAL       9805     T_HASH          1   T_STRING      10001         GC          9

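# define_method_closure.rb

# Builds a hash of 5000 key-value pairs from 10_000 very long strings.
def big_hash(merge = {})
  Hash[*((1..10_000).map { |i| "str#{i}" * 1000 })].merge(merge)
end

class DefineMethodOnSingleton
  def initialize
    # The method lives on this object's singleton class, so it (and the
    # closure holding x) can be collected together with the object.
    class << self; self; end.class_eval do
      x = big_hash
      define_method("foo#{Time.now.to_f}") {}
    end
  end
end

class DefineMethodOnBase
  BIG_HASH = big_hash

  def initialize
    # The method lives on the base class, so the closure holding x
    # stays reachable for as long as the class does.
    DefineMethodOnBase.class_eval do
      x = big_hash
      define_method("foo#{Time.now.to_f}") {}
    end
  end
end

old = ObjectSpace.count_objects
gc_old = GC.stat

loop do
  #DefineMethodOnSingleton.new
  DefineMethodOnBase.new

  GC.start
  counts = ObjectSpace.count_objects
  gc = GC.stat

  # Print the change since the previous iteration.
  printf "%10s %10d %10s %10d %10s %10d %10s %10d\n",
         "TOTAL", counts[:TOTAL] - old[:TOTAL],
         "T_HASH", counts[:T_HASH] - old[:T_HASH],
         "T_STRING", counts[:T_STRING] - old[:T_STRING],
         "GC", gc[:count] - gc_old[:count]

  old = counts
  gc_old = gc
end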

# open_struct_leak.rb
require 'ostruct'

old = ObjectSpace.count_objects
gc_old = GC.stat
last_hash = nil

# Builds a hash of 5000 key-value pairs from 10_000 very long strings.
def big_hash(merge = {})
  Hash[*((1..10_000).map { |i| "str#{i}" * 1000 })].merge(merge)
end

loop do
  # A. OpenStruct, no leak
  h = big_hash
  o = OpenStruct.new(h)
  h.each do |k, v|
    o.send k
  end

  # B. Create a new hash, link with last_hash so last_hash cannot be GCed, leak quickly
  #last_hash = big_hash.merge(:last_hash => last_hash)

  GC.start
  counts = ObjectSpace.count_objects
  gc = GC.stat

  # Print the change since the previous iteration.
  printf "%10s %10d %10s %10d %10s %10d %10s %10d\n",
         "TOTAL", counts[:TOTAL] - old[:TOTAL],
         "T_HASH", counts[:T_HASH] - old[:T_HASH],
         "T_STRING", counts[:T_STRING] - old[:T_STRING],
         "GC", gc[:count] - gc_old[:count]

  old = counts
  gc_old = gc
end