Skip to content

Instantly share code, notes, and snippets.

Created March 1, 2015 20:07
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save anonymous/1309ff2fa6b9778780b6 to your computer and use it in GitHub Desktop.
Save anonymous/1309ff2fa6b9778780b6 to your computer and use it in GitHub Desktop.
Experiment: number of recursive method calls before running out of stack.
# ===== Environment (MRI 2.1.1) =====
RUBY_VERSION # => "2.1.1"
RUBY_PLATFORM # => "x86_64-darwin13.0"
RbConfig.ruby # => "/Users/josh/.rubies/ruby-2.1.1/bin/ruby"
`#{RbConfig.ruby} -v` # => "ruby 2.1.1p76 (2014-02-24 revision 45161) [x86_64-darwin13.0]\n"
# ===== The Hypothesis =====
# A recursive method can call itself about 5000 times before it overflows.
# ===== Test #1 =====
# Procedure:
# Count how many times we can recursively invoke a function before it explodes.
# Repeat this test with different numbers of local variables to see if locals make a difference.
#
# Resuls:
# stack_size_0_locals # => 10078
# stack_size_1_local # => 9358
# stack_size_10_locals # => 5696
# stack_size_20_locals # => 3970
#
# Conclusion:
# The hypothesis is false as the number of calls is shown to vary.
#
# Implication:
# A method with 1 long variable name can be called fewer times than a method with 1 short variable name
# I test this in Test #2, it is incorrect, see that test for analysis
#
# Interpretation:
# The number of recursive calls we can make is probably the stack size divided by the stack frame size.
# Since local variables use the same memory as the stack, they change the stack frame size,
# thus changing the number of recursive calls.
#
# Possible flaw:
# Seeing Is Believing could be affecting it:
# This file is loaded before our program runs
# https://github.com/JoshCheek/seeing_is_believing/blob/267b9e8f5a8cdff268658376bb10670de102c58f/lib/seeing_is_believing/the_matrix.rb
# It creates a SeeingIsBelieving::EventStream::Producer, which has a thread reporting events back to the main process
# https://github.com/JoshCheek/seeing_is_believing/blob/267b9e8f5a8cdff268658376bb10670de102c58f/lib/seeing_is_believing/event_stream/producer.rb
# We can check this by running without SiB.
# I did this and got the exact same numbers.
#
# Possible flaw:
# This may depend on implementation, we can check this by running on other versions.
#
# MRI: ruby 2.2.0p0 (2014-12-25 revision 49005) [x86_64-darwin13]
# Congruent with our results
# stack_size_0_locals # => 10078
# stack_size_1_local # => 9358
# stack_size_10_locals # => 5696
# stack_size_20_locals # => 3970
#
# MRI: ruby 1.9.3p545 (2014-02-24) [x86_64-darwin13.2.0]
# Congruent with our results. At 20 variables, it locked up. I repeated the experiment, same thing. I'm not sure why
# stack_size_0_locals # => 9357
# stack_size_1_local # => 8733
# stack_size_10_locals # => 5458
# stack_size_20_locals # =>
#
# RBX: rubinius 2.5.0 (2.1.0 50777f41 2015-01-17 3.5.0 JI) [x86_64-darwin14.1.0]
# Congruent with our results, though less pronounced
# stack_size_0_locals # => 3613
# stack_size_1_local # => 3585
# stack_size_10_locals # => 3347
# stack_size_20_locals # => 3118
#
# JRUBY: jruby 1.7.16 (1.9.3p392) 2014-09-25 575b395 on Java HotSpot(TM) 64-Bit Server VM 1.7.0_51-b13 +jit [darwin-x86_64]
# Congruent with our results. Note that I had to rescue Exception instead of SystemStackError.
# Its original error message: "Error: Your application used more stack memory than the safety cap of 2048K."
# stack_size_0_locals # => 3964
# stack_size_1_local # => 3673
# stack_size_10_locals # => 3244
# stack_size_20_locals # => 2974
#
# Possible flaw:
# It could store the stack in the same memory as the heap.
# This would imply that the size of the program memory determines the callstack size.
# We could check this by allocating a lot of objects, counting callstack size, freeing them, and counting again.
# If the numbers are the same, then the stack's memory is independent from the state of the heap.
# I do this in Test #3
# 0 locals
def recursive_0_locals
@count += 1
recursive_0_locals
end
def stack_size_0_locals
@count = 0
recursive_0_locals
rescue SystemStackError
return @count
end
# 1 local
def recursive_1_local
@count += 1
a = nil
recursive_1_local
end
def stack_size_1_local
@count = 0
recursive_1_local
rescue SystemStackError
return @count
end
# 10 locals
def recursive_10_locals
a = b = c = d = e =
f = g = h = i = j = nil
@count += 1
recursive_10_locals
end
def stack_size_10_locals
@count = 0
recursive_10_locals
rescue SystemStackError
return @count
end
# 20 locals
def recursive_20_locals
a = b = c = d = e =
f = g = h = i = j =
k = l = m = n = o =
p = q = r = s = t = nil
@count += 1
recursive_20_locals
end
def stack_size_20_locals
@count = 0
recursive_20_locals
rescue SystemStackError
return @count
end
# repeat 2x, in case order matters
stack_size_0_locals # => 10078
stack_size_1_local # => 9358
stack_size_10_locals # => 5696
stack_size_20_locals # => 3970
stack_size_0_locals # => 10078
stack_size_1_local # => 9358
stack_size_10_locals # => 5696
stack_size_20_locals # => 3970
# ===== Test #2 =====
# Hypothesis:
# "A method with 1 long variable name can be called fewer times than a method with 1 short variable name"
# This is implied by Test #1, above
#
# Procedure:
# Count how many times we can recursively invoke a function with a 1-letter variable name.
# Repeat this for a 100-letter variable name.
# The one with the 100-letter variable name should take more stack memory, and thus overflow after fewer calls.
#
# Resuls:
# stack_size_1_letter # => 9358
# stack_size_100_letters # => 9358
#
# Conclusion:
# The hypothesis is false as the number of calls did not vary.
#
# Implication:
# The local variable is stored internally (most likely as a symbol -- verified in Test #4),
# The local variables are then stored in a data structure (probably a hash)
# that references the symbol by address. Since the address of a 1-letter variable
# and the address of a 100-letter variable will be the same size, the stackframes
# are also the same size.
#
# Possible flaw:
# Ruby could track locals as offsets from the stackframe, as C does,
# then the variable names are irrelevant, and we would still see these results.
# We could verify this by searching object space to see if the local variables were turned into symbols.
# See Test #4 for this
# 1 letter
def recursive_1_letter
a = nil
@count += 1
recursive_1_letter
end
def stack_size_1_letter
@count = 0
recursive_1_letter
rescue SystemStackError
return @count
end
# 100 letters
def recursive_100_letters
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa \
= nil
@count += 1
recursive_100_letters
end
def stack_size_100_letters
@count = 0
recursive_100_letters
rescue SystemStackError
return @count
end
stack_size_1_letter # => 9358
stack_size_100_letters # => 9358
stack_size_1_letter # => 9358
stack_size_100_letters # => 9358
# ===== Test #3 =====
# Hypothesis:
# The stack is stored in the same meory as the heap,
# so its size depends on how much heap memory is available.
#
# Procedure:
# Count how many recursive calls we have before it overflows.
# Allocate 100,000 objects, free the objects.
# Our program should have more memory available, but not more objects consuming it.
# If we count again, the count should increase
# as we will have more memory and thus recurse further before overflow.
#
# Resuls:
# The program memory grew by 6,144kb
# The stack size was 10,078 calls regardless of the amount of memory available to the program.
#
# Conclusion:
# The hypothesis is false.
#
# Interpretation:
# The stack size is independent of the heap size, thus the above experiments are valid.
def program_size_in_kb
# `man ps` says that `rss` is "the real memory (resident set) size of the process (in 1024 byte units)"
# experimentally, `rss` is the 6th column when I run `ps aux`, so calling that to find memory usage.
%x[ps aux].lines.find { |line| line.include? Process.pid.to_s }.split[5].to_i
end
def recursive
@count += 1
recursive
end
def stack_size
@count = 0
recursive
rescue SystemStackError
return @count
end
# Initial count and program size and stack count
initial_size = program_size_in_kb # => 9092
initial_count = stack_size # => 10078
# Allocate 100k objects, verify they are freed
objects = Array.new(100_000) { Object.new }
num_objs1 = GC.stat[:total_allocated_object] - GC.stat[:total_freed_object]
objects = nil
GC.start
num_objs2 = GC.stat[:total_allocated_object] - GC.stat[:total_freed_object]
num_objs1 # => 108307
num_objs2 # => 8307
num_objs1 - num_objs2 # => 100000
# Our program should now have more memory available, but the count should not change
# note that these numbers may vary slightly with each run
program_size_in_kb - initial_size # => 6500
initial_count # => 10078
stack_size # => 10078
# ===== Test #4 =====
# Hypothesis:
# Local variables are stored as references to a symbol.
#
# Procedure:
# Look through object space to see if there is a symbol name that matches the local variable name.
#
# Resuls:
# We found the symbol.
#
# Conclusion:
# There is such a symbol, the hypothesis is not disproven.
# So it could be that locals are stored as a hash,
# this means the hashes themselves would be the same size.
# For example, say that the symbol :a is stored at memory address 0x100,
# and :aa is stored at 0x200, and :value is stored at 0x300.
# Then the hashes {a: :value} and {aa: :value}
# would internally be {0x100 => 0x300}, and {0x200 => 0x300}
# so they are the same size even though their keys are different sizes.
def has_a_local
i_am_a_local_variable = nil
end
Symbol.all_symbols.grep(/^i_am/) # => [:i_am_a_local_variable]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment