Skip to content

Instantly share code, notes, and snippets.

@mwylde
Created July 12, 2011 07:39
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save mwylde/1077565 to your computer and use it in GitHub Desktop.
Save mwylde/1077565 to your computer and use it in GitHub Desktop.
Hash.new
# Oftentimes, we want to create lists of things that have something in
# common. For example we may have a list of objects, each with a
# name. We want to count how many times each name occurs. We could do
# so with a hash, like this:
h = {}
a.each{|o|
# make sure there's a value here
h[o.name] ||= 0
h[o.name] += 1
}
# But it turns out there's an easier way, using the less common
# Hash.new method:
h = Hash.new(0)
a.each{|o| h[o.name] += 1}
# (or using a reduce to get it down to one line)
h = a.reduce(Hash.new(0)){|h, o| h[o.name] += 1; h}
# What's going on here? When we supply an argument to Hash.new, we
# provide a default value that's used for any key that's not already
# in the hash. When we omit the argument (or use the short-hand form
# {}) the default value is set to nil.
g = Hash.new(5)
h = {}
g["nancy"] = 10
g["nancy"] # => 10
g["sean"] # => 5
h["sean"] # => nil
# However, care should be taken with using normal objects as default
# values, as the following example shows:
h = Hash.new([])
h["mary"] << 5
h["abigail"] << 10
h["mary"] # => [5, 10]
h["abigail"] # => [5, 10]
# What's going on here? We're setting a particular object--an instance
# of the Array class--to the default value. However, this means that
# the exact same array is used for every index of the Hash. Needless
# to say, this probably isn't what you want. But there is another,
# more powerful form of Hash.new that can solve this exact problem:
# Hash.new {|hash, key| ... }. Using the block form, we can fix our
# above example:
h = Hash.new{|hash, key| hash[key] = []}
h["mary"] << 5
h["abigail"] << 10
h["mary"] # => [5]
h["abigail"] # => [10]
# Now we're producing a new array element for every key. This sort of
# thing can be useful if we have a list of objects each with names and
# values, and we want a list of the values corresponding to each
# name. We can now write:
h = Hash.new{|h, k| h[k] = []}
a.each{|o| h[o.name] << h.value}
# The possibilities are endless. Effectively the block form makes
# hashes into lightweight methods much like proc and lambda, with the
# added benefit that we get memoization for free. For example, lets
# say we want a function to calculate the fibonacci numbers. We all
# know the formula: f(n) = f(n-1) + f(n-2). We can compute it
# naievely like this:
def f(n)
case n
when 0 then 1
when 1 then 1
else
f(n-1) + f(n-2)
end
end
# This works fine for small values of n. f(20) takes less than three
# hundreths of a second on my MacBook Pro. But it quickly becomes
# useless: f(30) takes 0.34 seconds, but I got bored waiting for f(40)
# to finish. Is calculating the Fibbonaci sequence fundamently a hard
# problem, where we can't expect to find a good solution? Or do we
# just need to be smarter about it? Consider the following expansion,
# of the n = 5 case:
f(5) = f(4) + f(3)
= f(3) + f(2) + f(2) + f(1)
= f(2) + f(1) + f(1) + f(0) + f(1) + f(0) + f(1)
= f(1) + f(0) + f(1) + f(1) + f(0) + f(1) + f(0) + f(1)
= 8
# While we got the right answer, notice how we kept repeating the same
# calculations. Even in the trivial case of n = 5, we calculated f(3)
# twice, and f(2) three times. Since we know that larger values of the
# problem depend on the values of smaller values, we can save
# ourselves an enormous amount of time my memoizing the function,
# which simply means saving the results of our earlier runs for
# later. The Ruby hash is a natural at this. Here's a much more
# effecient and yet concise implementation of the above:
f = Hash.new do |h, k|
case k
when 0 then h[0] = 1
when 1 then h[1] = 1
else
h[k] = h[k-2] + h[k-1]
end
end
# With our new and improved Fib, we can easily compute f(5000):
def time; t = Time.now; yield ; Time.now - t; end
time{f[5000]} # => 0.033939
# I should take this time to note that there is a very simple
# iterative algorithm for fibbonaci, and you should never actually use
# a recursive algorithm to compute it. However, this technique can be
# very useful for the many recursive algorithms which can not be
# trivially made iterative.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment