Strings vs Symbols ruby hash
irb(main):029:0> Benchmark.bm do |x|
irb(main):030:1* x.report("Strings: ") { 10000000.times {"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa".hash} }
irb(main):031:1> x.report("Symbols: ") { 10000000.times {:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.hash} }
irb(main):032:1> end
       user     system      total        real
Strings:   1.290000   0.000000   1.290000 (  1.282911)
Symbols:   0.450000   0.000000   0.450000 (  0.458915)

paneq commented Oct 2, 2017

require 'benchmark'

string_var = "foo"
symb_var = :foo
Benchmark.bm do |x|
  x.report("Strings: ") { 10_000_000.times {"foo".hash} }
  x.report("Symbols: ") { 10_000_000.times {:foo.hash} }
  x.report("Strings var: ") { 10_000_000.times {string_var.hash} }
  x.report("Symbols var: ") { 10_000_000.times {symb_var.hash} }
end

#       user     system      total        real
#Strings:   1.300000   0.000000   1.300000 (  1.297051)
#Symbols:   0.510000   0.000000   0.510000 (  0.510538)
#Strings var:   0.860000   0.000000   0.860000 (  0.861275)
#Symbols var:   0.540000   0.000000   0.540000 (  0.541402)
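
A likely reason the literal and variable String rows differ: each evaluation of a String literal creates a new String object, while a Symbol literal always refers to the same interned object. Below is a minimal sketch of that difference, assuming the file has no `# frozen_string_literal: true` magic comment. The method names `string_literal` and `symbol_literal` are hypothetical and exist only for illustration.

```ruby
# Sketch: compare object identity of String and Symbol literals.
# Assumes frozen string literals are NOT enabled for this file.

# Hypothetical helper: evaluates a String literal on every call.
def string_literal
  "foo"
end

# Hypothetical helper: evaluates a Symbol literal on every call.
def symbol_literal
  :foo
end

a = string_literal
b = string_literal

# Identity check: are both results the very same object?
puts "same String object?  #{a.equal?(b)}"
# Equal contents give equal hash values within one process.
puts "equal String hashes? #{a.hash == b.hash}"
# A Symbol with a given name exists only once.
puts "same Symbol object?  #{symbol_literal.equal?(symbol_literal)}"
```

Hoisting the String into a variable, as in the "var" rows, removes the per-iteration allocation from the measurement but still leaves String#hash working over the String contents.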

eregon commented Oct 5, 2017

The benchmarks above also include the time spent allocating the Strings (except for the var variants in the comment above).

But if we vary the String length, we see that String#hash is O(n) in the length of the String while Symbol#hash is O(1), so we can get arbitrarily large speedups.
Of course, this is no magic: the time we gain on Symbol#hash is the time we pay when creating those Symbols and interning them (i.e. the roughly 14 - 4 = 10 seconds spent before the benchmark starts in the length=10 000 run).

require 'benchmark'

len = Integer(ARGV[0])
n = 1_000_000
base = "a" * len
STRINGS = Array.new(n) { |i| base + i.to_s } # append the index so every String is distinct
SYMBOLS = STRINGS.map(&:to_sym)

Benchmark.bm do |x|
  x.report("Symbol#hash") do
    SYMBOLS.each { |s| s.hash }
  end
  x.report("String#hash") do
    STRINGS.each { |s| s.hash }
  end
end
$ ruby -v string_symbol_hash.rb 10
ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-linux]
       user     system      total        real
Symbol#hash  0.040000   0.000000   0.040000 (  0.040390)
String#hash  0.060000   0.000000   0.060000 (  0.061827)
0.49s user 0.02s system 99% cpu 0.511 total

$ ruby -v string_symbol_hash.rb 100
       user     system      total        real
Symbol#hash  0.050000   0.000000   0.050000 (  0.042057)
String#hash  0.100000   0.000000   0.100000 (  0.101787)
0.65s user 0.05s system 99% cpu 0.712 total

$ ruby string_symbol_hash.rb 1000
       user     system      total        real
Symbol#hash  0.050000   0.000000   0.050000 (  0.044034)
String#hash  0.460000   0.000000   0.460000 (  0.468324)
1.61s user 0.25s system 99% cpu 1.868 total

$ ruby string_symbol_hash.rb 10000
       user     system      total        real
Symbol#hash  0.040000   0.000000   0.040000 (  0.040466)
String#hash  4.240000   0.000000   4.240000 (  4.247260)
11.74s user 2.21s system 99% cpu 13.991 total
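
The interning cost mentioned above can also be timed directly instead of being inferred from the total run time. The following is a rough sketch under stated assumptions: `build_strings` is a hypothetical helper, and the sizes (length 1 000, 10 000 Strings) are arbitrary choices to keep the run short, not the values used in the runs above.

```ruby
require 'benchmark'

# Hypothetical helper: builds n distinct Strings, each `len` "a"
# characters followed by its index.
def build_strings(len, n)
  base = "a" * len
  Array.new(n) { |i| base + i.to_s }
end

# Sizes are assumptions chosen to keep the run short.
strings = build_strings(1_000, 10_000)

Benchmark.bm(14) do |x|
  symbols = nil
  # Interning: each String is converted to a Symbol here.
  x.report("String#to_sym") { symbols = strings.map(&:to_sym) }
  # Hashing the already-interned Symbols.
  x.report("Symbol#hash")   { symbols.each { |s| s.hash } }
  # Hashing the Strings themselves.
  x.report("String#hash")   { strings.each { |s| s.hash } }
end
```

Reporting String#to_sym as its own row puts the up-front cost of creating the Symbols next to the per-call cost of the two hash methods.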