@ryanlecompte
Created August 7, 2012 04:16
ActiveRecord memory leak with multiple threads?
# It appears that when I perform a query with AR via multiple threads,
# the instantiated objects do not get released when a GC is performed.
threads = Array.new(5) { Thread.new { Foo.where(:status => 2).all.first(100).each { |f| f.owner.first_name } } }
threads.each(&:join)
threads = nil
GC.start
ObjectSpace.each_object(Foo).count # => instances still exist
# ----------------
Foo.where(:status => 2).all.first(100).each { |f| f.owner.first_name }
GC.start
ObjectSpace.each_object(Foo).count # => 0
@peterc

peterc commented Aug 7, 2012

See if there's a difference if you do threads.replace [] instead of threads = nil.
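
A minimal sketch of why the two could differ, assuming some other reference to the same array exists (in IRB, the last-result variable `_` would be one candidate, since `threads.each(&:join)` returns the array itself). The name `alias_ref` is hypothetical and stands in for that other reference; plain threads replace the ActiveRecord query.

```ruby
# Three trivial threads; each block just returns its index.
threads = Array.new(3) { |i| Thread.new { i } }

# Array#each returns its receiver, so alias_ref is the same array object.
# alias_ref is a hypothetical stand-in for any other holder of the array.
alias_ref = threads.each(&:join)

# Rebinding the local variable leaves the array and its threads untouched.
threads = nil
size_after_nil = alias_ref.size

# Mutating the array in place empties it for every holder of a reference.
threads = alias_ref
threads.replace([])
size_after_replace = alias_ref.size

puts "after nil: #{size_after_nil}, after replace: #{size_after_replace}"
```

Rebinding only changes what the variable points at; `replace([])` and `clear` change the array itself, so the threads become unreachable through any remaining reference to that array.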

@mboeh

mboeh commented Aug 7, 2012

My guess would be that 'threads' gets captured by the closure created for each thread (and for Array.new) and you end up having a circular, orphaned dependency -- the Foos don't get GC'd until the threads are, and the thread closures hold references to themselves.

Absolute conjecture, though. Try wrapping the thread creation in a method so there are no locals for the blocks to close over.

@tarcieri

tarcieri commented Aug 7, 2012

I'm guessing you might need to run clear_stale_cached_connections!

@ryanlecompte
Author

Yikes! I didn't get e-mailed by GitHub at all for these comments. Thanks guys! Let me try this out.

@ryanlecompte
Author

Okay, unfortunately that didn't help. I tried both #clear_stale_cached_connections! and wrapping the thread creation in a separate method, like so:

irb(main):001:0> def go
irb(main):002:1>   10.times.map do
irb(main):003:2*      Thread.new do
irb(main):004:3*        Foo.where(:status => 2).all.first(50).each { |e| e.owner.first_name }
irb(main):005:3>     end
irb(main):006:2>   end
irb(main):007:1> end
=> nil
irb(main):008:0> threads = go
=> [#<Thread:0x000000042ce710 run>, #<Thread:0x000000042ce580 run>, #<Thread:0x000000042ce1c0 run>, #<Thread:0x000000042ce008 run>, #<Thread:0x000000042cde00 run>, #<Thread:0x000000042d5bc8 run>, #<Thread:0x000000042d5b28 run>, #<Thread:0x00000003ac02a8 run>, #<Thread:0x00000003ac0438 run>, #<Thread:0x000000042c7f78 run>]
irb(main):009:0> threads.each(&:join)
=> [#<Thread:0x000000042ce710 dead>, #<Thread:0x000000042ce580 dead>, #<Thread:0x000000042ce1c0 dead>, #<Thread:0x000000042ce008 dead>, #<Thread:0x000000042cde00 dead>, #<Thread:0x000000042d5bc8 dead>, #<Thread:0x000000042d5b28 dead>, #<Thread:0x00000003ac02a8 dead>, #<Thread:0x00000003ac0438 dead>, #<Thread:0x000000042c7f78 dead>]
irb(main):010:0> GC.start
=> nil
irb(main):011:0> ObjectSpace.each_object(Foo).count
=> 132890
irb(main):012:0> 

irb(main):012:0> ActiveRecord::Base.connection_pool.clear_stale_cached_connections!
=> [35025800, 35025120, 35024900, 35024640, 35040740, 35040660, 30802260, 35012540, 30802460, 35025600]
irb(main):013:0> ObjectSpace.each_object(Foo).count
=> 132890
irb(main):014:0> 

This is using Rails 3.0.10 on MRI 1.9.3p194.

Can anyone else try this locally and see if they see the same results?

@mboeh

mboeh commented Aug 7, 2012

Threads do retain a return value, which is whatever value the thread's proc returns. Try returning nil from the threads.

It wouldn't hurt to try to disable ActiveRecord's query cache with Foo.uncached, too.
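
A minimal sketch of the return-value point, using a hypothetical `Record` class in place of the `Foo` model so that it runs without ActiveRecord. In the original snippet the thread block ends with `.each { ... }`, and `Array#each` returns the array it iterated, so that array becomes the thread's value.

```ruby
# Hypothetical stand-in for an ActiveRecord model; not part of the gist.
class Record
  def name
    "record"
  end
end

# The block ends with Array#each, which returns the array itself,
# so the thread's value is the full array of records.
leaky = Thread.new do
  Array.new(100) { Record.new }.each { |r| r.name }
end

# Ending the block with nil leaves the thread holding no reference
# to the records once the block has finished.
clean = Thread.new do
  Array.new(100) { Record.new }.each { |r| r.name }
  nil
end

[leaky, clean].each(&:join)

leaky_value = leaky.value # the array built inside the block
clean_value = clean.value # nil

puts "leaky holds #{leaky_value.size} records, clean holds #{clean_value.inspect}"
```

As long as a thread object is reachable, so is whatever `Thread#value` would return, which is why the records in the first thread cannot be collected while the thread itself is still referenced.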

@ryanlecompte
Author

@mboeh, you win! I just tried returning nil as the last value of the Thread block, and that worked! Interesting! I thought that not having any references to the threads would cause them to get garbage collected (and their associated values).

@mboeh

mboeh commented Aug 7, 2012

Well, your second example seems to keep references to the threads. And IRB does keep a lot of stuff. On the other hand, if Ruby is keeping references to dead threads indefinitely, that's a problem for sure.

@mboeh

mboeh commented Aug 7, 2012

I did my own testing and determined that setting threads to nil is insufficient to get the threads GC'd. You need to do threads.clear.

This is true even if you wrap the code creating the threads in a method, which is surprising -- I'd expect the local variable reference to be lost outside that method's scope, and the threads to be available for GC.

It seems that threads stored in an array in a local variable might not be available to GC when expected. I have a demonstration at https://gist.github.com/3287930 .
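
This is not the linked demonstration, only a rough sketch of how such a check might look, assuming a hypothetical `Record` class and hypothetical helpers `live_records` and `spawn_threads`. The count after `threads.clear` is printed but not predicted, because MRI may keep unreachable objects around for a while.

```ruby
# Hypothetical stand-in for an ActiveRecord model; not part of the gist.
class Record; end

# Hypothetical helper: force a GC run, then count Record instances
# that ObjectSpace can still enumerate.
def live_records
  GC.start
  ObjectSpace.each_object(Record).count
end

# Hypothetical helper: each thread's value is an array of 100 records.
def spawn_threads
  Array.new(5) do
    Thread.new { Array.new(100) { Record.new } }
  end
end

threads = spawn_threads
threads.each(&:join)

# While the array still holds the threads, their values stay reachable.
before = live_records

# Empty the array in place so it no longer references the threads.
threads.clear
after_clear = live_records

puts "before clear: #{before}, after clear: #{after_clear}"
```

Running the same script with `threads = nil` in place of `threads.clear` gives a direct comparison on a particular Ruby version.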

@ryanlecompte
Author

Thank you @mboeh. That's very interesting and very good to know!

@raggi

raggi commented Aug 7, 2012

Things in ObjectSpace are not necessarily live instances. There are shortcut tricks you can use to force memory to be freed or overwritten (in MRI only, where it's full of hacks at the C level "for speed"). Assuming that any Ruby GC behaves fully deterministically around GC.start is unlikely to be productive.
