@hubertlepicki
Created September 29, 2017 11:43
Strings vs Symbols ruby hash
irb(main):029:0> Benchmark.bm do |x|
irb(main):030:1* x.report("Strings: ") { 10000000.times {"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa".hash} }
irb(main):031:1> x.report("Symbols: ") { 10000000.times {:aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa.hash} }
irb(main):032:1> end
                user     system      total        real
Strings:    1.290000   0.000000   1.290000 (  1.282911)
Symbols:    0.450000   0.000000   0.450000 (  0.458915)

paneq commented Oct 2, 2017

require 'benchmark'

string_var = "foo"
symb_var = :foo
Benchmark.bm do |x|
  x.report("Strings: ") { 10_000_000.times {"foo".hash} }
  x.report("Symbols: ") { 10_000_000.times {:foo.hash} }
  x.report("Strings var: ") { 10_000_000.times {string_var.hash} }
  x.report("Symbols var: ") { 10_000_000.times {symb_var.hash} }
end

#       user     system      total        real
#Strings:   1.300000   0.000000   1.300000 (  1.297051)
#Symbols:   0.510000   0.000000   0.510000 (  0.510538)
#Strings var:   0.860000   0.000000   0.860000 (  0.861275)
#Symbols var:   0.540000   0.000000   0.540000 (  0.541402)


eregon commented Oct 5, 2017

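The benchmarks above also measure the cost of allocating the Strings: unless frozen string literals are enabled, each evaluation of a String literal creates a new object (the var variants in the previous comment avoid this).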
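
To make the allocation point concrete, here is a minimal sketch that counts allocated objects around each loop. The helper name `allocations` is made up for this illustration, and the sketch assumes the file does not enable `frozen_string_literal`.

```ruby
# Hypothetical helper: number of objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Assumption: no `# frozen_string_literal: true` magic comment is in effect,
# so every evaluation of the literal "foo" builds a fresh String object.
string_allocs = allocations { 1_000.times { "foo".hash } }

# A Symbol literal always refers to the same interned object.
symbol_allocs = allocations { 1_000.times { :foo.hash } }

puts "String literal loop: #{string_allocs} allocations"
puts "Symbol literal loop: #{symbol_allocs} allocations"
```

The String loop should report at least one allocation per iteration, while the Symbol loop should report close to none.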

But if we vary the String length, we see that String#hash is O(n) while Symbol#hash is O(1), so the speedup can be made arbitrarily large.
Of course, this is no magic: the time we gain on Symbol#hash is time we already paid when creating those Symbols and interning them (i.e. roughly the 14 - 4 = 10 seconds that elapse before the benchmark starts in the length = 10,000 run).

require 'benchmark'

len = Integer(ARGV[0])
n = 1_000_000
base = "a" * len
# Append the index so that every String (and thus every Symbol) is distinct.
STRINGS = Array.new(n) { |i| base + i.to_s }
SYMBOLS = STRINGS.map(&:to_sym)

Benchmark.bm do |x|
  x.report("Symbol#hash") do
    SYMBOLS.each { |s| s.hash }
  end
  x.report("String#hash") do
    STRINGS.each { |s| s.hash }
  end
end
$ ruby -v string_symbol_hash.rb 10
ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-linux]
       user     system      total        real
Symbol#hash  0.040000   0.000000   0.040000 (  0.040390)
String#hash  0.060000   0.000000   0.060000 (  0.061827)
0.49s user 0.02s system 99% cpu 0.511 total

$ ruby -v string_symbol_hash.rb 100
       user     system      total        real
Symbol#hash  0.050000   0.000000   0.050000 (  0.042057)
String#hash  0.100000   0.000000   0.100000 (  0.101787)
0.65s user 0.05s system 99% cpu 0.712 total

$ ruby string_symbol_hash.rb 1000
       user     system      total        real
Symbol#hash  0.050000   0.000000   0.050000 (  0.044034)
String#hash  0.460000   0.000000   0.460000 (  0.468324)
1.61s user 0.25s system 99% cpu 1.868 total

$ ruby string_symbol_hash.rb 10000
       user     system      total        real
Symbol#hash  0.040000   0.000000   0.040000 (  0.040466)
String#hash  4.240000   0.000000   4.240000 (  4.247260)
11.74s user 2.21s system 99% cpu 13.991 total
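
The setup cost mentioned above can also be timed directly instead of being inferred from the total run time. The following is a rough sketch, not the script used for the numbers above: the values of `len` and `n` are arbitrary and kept small so that it runs quickly.

```ruby
require 'benchmark'

# Arbitrary illustration sizes (assumptions, much smaller than the runs above).
len = 1_000
n = 10_000
strings = Array.new(n) { |i| ("a" * len) + i.to_s }

# Interning has to read every String in full, so this step depends on the length.
symbols = nil
intern_time = Benchmark.realtime { symbols = strings.map(&:to_sym) }

# Symbol#hash does not look at the characters; String#hash walks all of them.
symbol_hash_time = Benchmark.realtime { symbols.each(&:hash) }
string_hash_time = Benchmark.realtime { strings.each(&:hash) }

puts format("to_sym:      %.6f s", intern_time)
puts format("Symbol#hash: %.6f s", symbol_hash_time)
puts format("String#hash: %.6f s", string_hash_time)
```

Raising `len` should grow the `to_sym` and String#hash times while leaving Symbol#hash roughly flat, though the exact figures depend on the Ruby version and machine.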
