@fxn
Last active June 8, 2024 22:12

Ruby: The future of frozen string literals

What is a literal?

In programming languages, literals are textual representations of values in the source code. This is a syntactical concept.

Some examples:

7     # integer literal
'foo' # string literal
[]    # array literal

In contrast,

Math::PI
String.new
Array.new

are not literals.

In the case of Math::PI, while it may store a fixed number, syntactically that is a constant path, not a literal.

Literals and object allocations

Everything is an object in Ruby. Does Ruby create a new object each time it evaluates a literal?

It depends. There are three possibilities:

1. You get the same object for the same literal everywhere

This happens with nil, true, false, symbols, small integers (fixnums), and others:

p 7.object_id # => 15
p 7.object_id # => 15

As the example illustrates, 7 always evaluates to the same object.
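The same check can be made with equal?, which compares object identity rather than value. A minimal sketch covering a few of the literals listed above (the method name from_a_method is only for illustration):

```ruby
# Each pair of literals below evaluates to one shared object.
p nil.equal?(nil)   # => true
p true.equal?(true) # => true
p :foo.equal?(:foo) # => true
p 7.equal?(7)       # => true

# The location of the literal does not matter either.
def from_a_method = :foo.object_id
p :foo.object_id == from_a_method # => true
```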

2. You get the same object for the same literal in the same spot

This happens with literals for regular expressions or rational numbers, for example. Check this out:

def m1 = 0.5r
def m2 = 0.5r

2.times { p m1.object_id } # prints the same object ID twice
2.times { p m2.object_id } # prints the same object ID twice, but a different one

0.5r is the same literal for the fraction 1/2 in both methods. You get the same object every time you invoke m1. You also get the same object every time you invoke m2. But those two objects are different, because the literals are located in different places.
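The text mentions regular expressions as another example. Here is a small sketch of the same experiment with a regexp literal, assuming a literal without interpolation (the method names r1 and r2 are illustrative):

```ruby
def r1 = /foo/
def r2 = /foo/

# Same literal in the same spot: one object, however often r1 is called.
p r1.equal?(r1) # => true

# Same literal in two spots: compare the identities yourself.
p r1.equal?(r2)

# Whatever their identities, the two regexps are equal by value.
p r1 == r2 # => true
```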

3. You always get different objects

This happens for example with arrays, hashes, or strings (by default):

2.times { p ''.object_id } # prints different object IDs
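Arrays and hashes make the reason for this behavior easy to see: each evaluation has to produce a fresh object, otherwise a mutation made by one caller would leak into the next. A minimal sketch (the method names are illustrative):

```ruby
def new_array = []
def new_hash  = {}

# Same literal, same spot, but a new object on every evaluation.
p new_array.equal?(new_array) # => false
p new_hash.equal?(new_hash)   # => false

# Mutating one result does not affect the next evaluation.
list = new_array
list << 1
p new_array # => []
```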

However, in practice, we'd often prefer string literals to behave as in (1). Is that possible?

The magic comment

Yes, Ruby 2.3 introduced this magic comment:

# frozen_string_literal: true

If a file has it at the top, string literals in that file evaluate to frozen (immutable) string instances:

# frozen_string_literal: true

s = 'foo'
s.frozen?       # => true
s.equal?('foo') # => true

Additionally, string literals behave as in (1) now, as the last line shows. That is, any 'foo', anywhere, evaluates to the same object.

This is important because it reduces allocations and, therefore, reduces the time spent in garbage collection. No big deal if the string is used to initialize a constant, since that literal is evaluated only once, but it can matter for literals inside method definitions, which are evaluated on every call.

Impact depends on the application but, in general terms, this is more performant. For instance, the Lobsters benchmark is about 5% slower with frozen string literals disabled.

Let me underline that this optimization applies only to frozen string literals, not to arbitrary frozen strings:

# frozen_string_literal: true

s = String.new('foo').freeze
t = String.new('foo').freeze

s.equal?(t) # false because while 'foo' is a literal, String.new('foo') is not
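For readers new to frozen strings, here is a minimal sketch of what immutability means in practice. The explicit .freeze is an assumption of this sketch: it stands in for the magic comment so that the snippet behaves the same in any file.

```ruby
s = 'foo'.freeze # stands in for a literal in a file with the magic comment

begin
  s << 'bar' # in-place mutation of a frozen string
rescue FrozenError => e
  puts "#{e.class} raised"
end

t = s + 'bar' # non-mutating: returns a new, mutable string
u = s.dup     # mutable copy of the frozen string
u << 'bar'

p s # => "foo"
p t # => "foobar"
p u # => "foobar"
```

Methods that return a new string keep working unchanged. Only in-place mutation of the literal itself needs attention.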

The vision

The Ruby community has fully embraced this feature, and modern codebases normally have that magic comment in all their files, to the point that we would like to have (1) by default, without the need for the magic comment.

That possibility was discussed for Ruby 3, but Matz considered that the ecosystem was just not ready (see #11473).

The goal is to make the switch in Ruby 4.

Ruby 3.4

Ruby 3.4 is going to ship with a new feature that will help make the transition.

Ruby committer (and Rails Core Team member) Jean Boussier is championing this effort. To me, that is admirable: this epic needs determination.

In Ruby 3.4, by default, if a file does not have the magic comment and a string object that was instantiated with a literal gets mutated, Ruby still allows the mutation, but it now issues a warning:

s = 'foo'
s << 'bar' # warning: literal string will be frozen in the future

The mutation does not need to happen in the same file. It can happen elsewhere, for example in a method that receives the string as an argument.
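A sketch of that situation, where the literal is written in one place and mutated in another. The helper append_suffix! is hypothetical and stands in for code that lives in another file or gem:

```ruby
# Hypothetical helper, imagine it defined in a different file or gem.
def append_suffix!(str)
  str << '.bak'
end

name = 'report' # the literal is written here...
begin
  append_suffix!(name) # ...and mutated elsewhere. Allowed in Ruby 3.4,
                       # with a warning if deprecation warnings are enabled.
rescue FrozenError
  # Reached once string literals are frozen, e.g. with
  # --enable-frozen-string-literal.
  puts 'the literal was frozen'
end

# One way to make the intent explicit: pass a mutable copy.
safe = +'report'
append_suffix!(safe)
p safe # => "report.bak"
```

The unary plus returns a mutable string, so the second call keeps working regardless of how literals are treated.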

Deprecation warnings have to be enabled to see them. For example, by passing -W:deprecated to ruby, or by setting Warning[:deprecated] = true. It is worth noting that nowadays minitest has deprecation warnings enabled. RSpec does not have them enabled, though there is a pull request for it. In any case, you can just add Warning[:deprecated] = true to spec/spec_helper.rb.
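If you would rather collect the warnings than read them from the test output, one option is to hook into Warning.warn. This is only a sketch: the module name DeprecationRecorder is made up, and it assumes the frozen string literal warning reaches Warning.warn with category :deprecated, as Ruby's deprecation warnings generally do.

```ruby
Warning[:deprecated] = true

# Hypothetical recorder: remembers every deprecation warning it sees.
module DeprecationRecorder
  MESSAGES = []

  def warn(message, category: nil, **kwargs)
    MESSAGES << message if category == :deprecated
    super # still print the warning as usual
  end
end

Warning.extend(DeprecationRecorder)

# Any deprecation warning is now recorded, for example one emitted by hand:
warn('this API is deprecated', category: :deprecated)
p DeprecationRecorder::MESSAGES
```

A test suite could inspect MESSAGES at the end of the run and fail if it is not empty.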

You can tell ruby to raise an error instead of warning with --enable-frozen-string-literal. With that option, string literals are frozen by default globally, without magic comments (unless you opt out manually with # frozen_string_literal: false).

As a curiosity, in the current 3.4.0-preview1, s.frozen? returns true even though the string is mutable. This was the subject of discussion and has been revised: in 3.4.0 it will return false.

Can I delete the magic comments in Ruby 3.4?

In general, no.

By default, if you delete the magic comments in Ruby 3.4, the optimizations you enabled with the comment are disabled. As we saw, strings are not frozen, and string objects are not reused.

You could get frozen string literals by passing --enable-frozen-string-literal to ruby, but since that has a global effect, right now that can be risky in production due to transitive dependencies.

On the other hand, gems supporting Ruby < 4 may want to leave the magic comment in place for now. If they remove the comment, clients running in those Rubies without --enable-frozen-string-literal will lose the optimizations. Furthermore, string literals in your gem would all of a sudden evaluate to mutable objects, which is in itself a logic concern if the code relied on them being immutable.

How to help?

In order to be able to have frozen string literals by default in the future, gems have to be ready for the switch, as much as possible.

This is going to be a community effort 💪.

To help in this transition, you can enable warnings in CI and note which gems issue warnings. Then, report them to the gem maintainers.

The flag --debug-frozen-string-literal helps, because it reports the locations of both the allocation and the mutation.

Basic GitHub Actions configuration would be something like:

- run: "RUBYOPT='-W:deprecated --debug-frozen-string-literal' bundle exec rake"

Once warnings are clean, you can keep an eye on this by enabling errors:

- run: "RUBYOPT='--enable-frozen-string-literal --debug-frozen-string-literal' bundle exec rake"

Thanks

We have polished this post together with Jean Boussier, thanks man.

@mezbahalam

thank you! @fxn

@hack3rvaillant

Nice post. Thank you!

@dougc84

dougc84 commented May 23, 2024

The one question that has yet to be answered for me is why would you want to do this globally?

String concatenation and pushing to an existing string is a common pattern in almost every 3rd or 4th gen language. I understand the performance benefit, but this seems to remove a common feature and make development more tricky just to save 5%?

@fxn
Author

fxn commented May 23, 2024

@dougc84 regarding concatenation,

'a' + 'b'

gives you a mutable string, because the expression is not a literal. The operands are literals, but not their addition.

Sometimes you want a string buffer to push to it, yes, but statistically that use case is less frequent, so we optimize for the common and most performant case. Still, the options are simple:

# Options to get new and mutable strings with frozen string literals enabled.
+''
''.dup
String.new
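A short sketch that exercises these points. The explicit .freeze calls are an assumption of the sketch: they mimic frozen string literals so that it runs the same with or without the magic comment.

```ruby
a = 'a'.freeze
b = 'b'.freeze

# The operands are frozen, but their concatenation is a new, mutable string.
joined = a + b
p joined.frozen? # => false
joined << 'c'
p joined         # => "abc"

# The three options above all give a string you can push to.
buffers = [+'', ''.dup, String.new]
buffers.each do |buffer|
  buffer << 'chunk'
  p buffer.frozen? # => false
end

# One difference worth knowing: String.new without arguments starts out
# with binary encoding.
p String.new.encoding == Encoding::BINARY # => true
```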

About other languages, it depends. In some languages all strings are immutable, Java is an example.

@yoniamir

@dougc84 if you think about all the compute power that ruby takes globally, 5% is a LOT!

@Mth0158

Mth0158 commented May 23, 2024

What a change! I am currently writing articles about these kinds of performance topics, it feels so right to have this change finally in 3.4.

@maxence33

maxence33 commented May 23, 2024

Strings getting closer to symbols. May I ask if we really still need symbols anymore ..

@americos

Great post Xavier, probably one of the best explanations on Ruby literals I have read. Thanks.

@alexandrule

Thanks @fxn

@drgcms

drgcms commented May 28, 2024

If I understand correctly, there is no performance advantage if the source file doesn't contain # frozen_string_literal: true magic comment.

@fxn
Author

fxn commented May 28, 2024

@drgcms in the general case, it does have a performance impact.

If frozen string literals are not enabled, methods like

def hidden?(basename)
  basename.start_with?(".")
end

that have a string literal in them create a new string object in each call. That means 1) we are doing the work of creating a new object in each call, and 2) the garbage collector has to get rid of them. If you add the magic comment, none of that happens.

You can see the impact by yourself with this script, for example:

require "benchmark"

def hidden?(basename)
  basename.start_with?(".")
end

puts Benchmark.measure {
  i = 0
  while i < 100_000_000
    i += 1
    hidden?(".foo")
  end
}

If you add the magic comment, you'll see the script runs about twice as fast.
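Timings vary between machines, so counting allocations can be a useful complement. The sketch below is one possible way to do it: the helper allocations and the two method names are made up, and eval is used only so that both modes can be compared within a single script.

```ruby
# Number of objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Same method body, compiled once with mutable and once with frozen literals.
eval(<<~RUBY)
  # frozen_string_literal: false
  def hidden_mutable?(basename) = basename.start_with?(".")
RUBY

eval(<<~RUBY)
  # frozen_string_literal: true
  def hidden_frozen?(basename) = basename.start_with?(".")
RUBY

name = '.foo'
mutable = allocations { 1_000.times { hidden_mutable?(name) } }
frozen  = allocations { 1_000.times { hidden_frozen?(name) } }

puts "mutable literals: #{mutable} objects allocated"
puts "frozen literals:  #{frozen} objects allocated"
```

With mutable literals you should see at least one allocation per call, and close to none with frozen ones.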

@chaadow

chaadow commented May 29, 2024

@fxn Thanks for sharing this!

This helped me clean some of my application code as well as opening PRs on other gems.

However I have an issue with this:

- run: "RUBYOPT='-W:deprecated --debug-frozen-string-literal' bundle exec rake"

This seems to work only for a script that contains the pragma # frozen_string_literal: true. Here are three cases:
[screenshot]

From my testing:

  • -W:deprecated alone behaves the same as '-W:deprecated --debug-frozen-string-literal'
  • the --debug-frozen-string-literal option only adds the location, BUT it still needs frozen_string_literal: true, and it does not issue any warning

here is another screenshot, with the same '-W:deprecated --debug-frozen-string-literal' but with no frozen_string_literal: true pragma

[screenshot]

==> Nothing happens :/

here is the script used for reference ( Using ruby 3.3.1 )

# frozen_string_literal: true

def modify_string
  str = "immutable"
  str << " change"
end

modify_string

@fxn
Author

fxn commented May 29, 2024

@chaadow ya, the warning is a new feature in the forthcoming Ruby 3.4. If you install 3.4.0-preview1 you'll be able to experiment with it.

@chaadow

chaadow commented Jun 8, 2024

@fxn Thanks. so I went ahead and installed 3.4.0-preview1

  • when I add --debug-frozen-string-literal the warning is never shown
  • without it, it works ( screenshot below)

Also on ruby's website there is no mention of the --debug-frozen-string-literal flag. Maybe it was removed recently?

[screenshot]