@spastorino
Created November 1, 2011 02:00
require 'thread'

mutex = Mutex.new
i = 0

# Two threads each increment the shared counter a million times,
# taking the mutex around every single increment.
t1 = Thread.new do
  1_000_000.times do
    mutex.synchronize do
      i += 1
    end
  end
end

t2 = Thread.new do
  1_000_000.times do
    mutex.synchronize do
      i += 1
    end
  end
end

t1.join
t2.join
puts i
#####################
➜ /tmp ruby -v
ruby 1.9.3p0 (2011-10-30 revision 33570) [x86_64-darwin11.2.0]
➜ /tmp time ruby count.rb
2000000
ruby count.rb 0.72s user 0.03s system 97% cpu 0.771 total
➜ /tmp ruby -v
jruby 1.6.5 (ruby-1.8.7-p330) (2011-10-25 9dcd388) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_26) [darwin-x86_64-java]
➜ /tmp time ruby count.rb
2000000
ruby count.rb 2.72s user 1.24s system 146% cpu 2.694 total
➜ /tmp ruby -v
rubinius 2.0.0dev (1.8.7 acf6926c yyyy-mm-dd JI) [x86_64-apple-darwin11.2.0]
➜ /tmp time ruby count.rb
2000000
ruby count.rb 6.62s user 8.95s system 109% cpu 14.179 total
@spastorino (Author)

Placing the synchronize block where it should be (swapping the 1_000_000.times do and mutex.synchronize do lines in each thread, so the lock is taken outside the loop) makes it much faster on Rubinius:
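The rearranged file isn't shown in the gist; roughly, it looks like this (a sketch, with the lock acquired once per thread instead of once per increment):

require 'thread'

mutex = Mutex.new
i = 0

# Each thread now acquires the mutex once and performs all of its
# increments inside that single critical section.
t1 = Thread.new do
  mutex.synchronize do
    1_000_000.times do
      i += 1
    end
  end
end

t2 = Thread.new do
  mutex.synchronize do
    1_000_000.times do
      i += 1
    end
  end
end

t1.join
t2.join
puts i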

➜ /tmp ruby -v
ruby 1.9.3p0 (2011-10-30 revision 33570) [x86_64-darwin11.2.0]
➜ /tmp time ruby count.rb
2000000
ruby count.rb 0.17s user 0.02s system 97% cpu 0.204 total

➜ /tmp ruby -v
rubinius 2.0.0dev (1.8.7 acf6926c yyyy-mm-dd JI) [x86_64-apple-darwin11.2.0]
➜ /tmp time ruby count.rb
2000000
ruby count.rb 0.62s user 0.05s system 125% cpu 0.541 total

➜ /tmp ruby -v
jruby 1.6.5 (ruby-1.8.7-p330) (2011-10-25 9dcd388) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_26) [darwin-x86_64-java]
➜ /tmp time ruby count.rb
2000000
ruby count.rb 1.16s user 0.10s system 148% cpu 0.853 total

@rgoytacaz

What about JRuby in 1.9-compatible mode?

@spastorino (Author)

➜ /tmp time ruby --1.9 count.rb
2000000
ruby --1.9 count.rb 5.61s user 1.35s system 174% cpu 3.987 total

So it's even slower than in 1.8.7 mode.

@headius commented Nov 1, 2011

I don't understand "where it should be". Moving the synchronization outside the 1_000_000 loop essentially forces the two threads to run serially rather than in parallel.

This is also only a short run, which may not be fully optimized in JRuby or Rubinius. If I put a 5.times loop around it and time the whole process, I get the following times for JRuby (master), 1.9.3p0, and Rubinius master:
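The sync.rb harness isn't included here; a minimal sketch of what it could look like, assuming each of the five iterations is timed individually with a fresh mutex and counter per run:

require 'thread'

# Run the contended-counter workload five times and print each run's
# wall-clock time, so later runs reflect a warmed-up JIT.
5.times do
  mutex = Mutex.new
  i = 0
  start = Time.now

  threads = (1..2).map do
    Thread.new do
      1_000_000.times { mutex.synchronize { i += 1 } }
    end
  end
  threads.each(&:join)

  puts Time.now - start
end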

system ~/projects/jruby $ ../rubinius/bin/rbx sync.rb 
12.351075
12.87269
12.889112
13.054452
12.968709

system ~/projects/jruby $ jruby sync.rb 
1.958
1.406
1.279
1.316
1.613

system ~/projects/jruby $ rvm use ruby-1.9.3
Using /Users/headius/.rvm/gems/ruby-1.9.3-p0

ruby-1.9.3-p0 ~/projects/jruby $ ruby sync.rb 
0.669498
0.851305
0.617735
0.706195
0.885284

This isn't too surprising for JRuby...I believe a mutex in 1.9 is much cheaper than doing a real lock, since all it has to do is keep running and tell the scheduler nobody else can acquire that lock. We can probably improve, though.

If I make your modification and move the synchronization outside the loop:

ruby-1.9.3-p0 ~/projects/jruby $ ../rubinius/bin/rbx sync2.rb 
0.231478
0.143073
0.13726
0.14088200000000006
0.145779

ruby-1.9.3-p0 ~/projects/jruby $ jruby sync2.rb 
0.311
0.133
0.13
0.133
0.131

ruby-1.9.3-p0 ~/projects/jruby $ ruby sync2.rb 
0.15189
0.150559
0.15064
0.158836
0.15974

All three implementations are in the same ballpark, though JRuby is clearly the fastest. However, you're not really measuring anything about Mutex performance anymore.

@headius commented Nov 1, 2011

With straight Java synchronization, instead of the ReentrantLock we're currently using, JRuby's numbers improve:

system ~/projects/jruby $ jruby sync.rb 
0.951
0.775
0.81
0.802
0.812

I'll explore this and other options, but I suspect much of the overhead of ReentrantLock is in maintaining a list of waiters. That helps heavily-contended performance and allows for fairness when handing off the lock, but it means keeping an internal list of the threads waiting on the lock.

@headius commented Nov 1, 2011

Oh, last note on this.

Keep in mind that on all the implementations, you're talking about overhead in the µsec range for locking the mutex. I doubt you'll see that matter in any real-world app, especially when the block contains something more substantial than incrementing a Fixnum :)
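For a rough sense of scale, dividing the wall-clock times of the first set of runs above by the two million locked increments:

# Back-of-the-envelope per-increment cost from the first set of runs
# (total wall-clock time divided by the 2,000,000 locked increments).
{
  'ruby 1.9.3'        => 0.771,
  'jruby 1.6.5'       => 2.694,
  'rubinius 2.0.0dev' => 14.179,
}.each do |impl, secs|
  printf("%-18s ~%.2f µs per synchronized increment\n",
         impl, secs / 2_000_000 * 1_000_000)
end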
