@josevalim
Last active July 22, 2024 11:44

Why variables in Elixir are re-bindable

Short answer

In my experience with Erlang, I want to re-bind a variable much more often than I want to match against it.

I also enjoy the ^var approach in Elixir because it makes it very explicit that I want a match to happen, and that the variable was previously defined.
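To make the distinction concrete, here is a minimal sketch of re-binding versus an explicit match with the pin operator. The module and function names (RebindVsPin, rebind, pin_match) are made up for illustration and are not part of any library.

```elixir
defmodule RebindVsPin do
  # Re-binding: the second `x =` gives the name a new value.
  def rebind do
    x = 1
    x = x + 1
    x
  end

  # Pinning: `^expected` matches against the existing value
  # of `expected` instead of binding a new variable.
  def pin_match(expected, value) do
    case value do
      ^expected -> :matched
      _other -> :no_match
    end
  end
end

IO.inspect(RebindVsPin.rebind())
IO.inspect(RebindVsPin.pin_match(1, 1))
IO.inspect(RebindVsPin.pin_match(1, 2))
```

Without the `^`, the first clause of the case would simply bind a fresh variable and always succeed.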

Furthermore, macros increase the chances of variable names clashing, which would be a disaster if we didn't allow re-binding.

Long answer

Consider an example where you have some data and want to pass it through a series of transformations in Erlang:

f(X) ->
  X1 = foo(X),
  X2 = bar(X1),
  baz(X2).

Now, imagine that you want to add a step in between. You have to rename every variable that comes after the new step:

f(X) ->
  X1 = foo(X),
  X2 = fab(X1),
  X3 = bar(X2),
  baz(X3).

One solution to this problem is to nest the calls:

f(X) ->
  baz(bar(fab(foo(X)))).

But I find this approach very unreadable, as you need to read your code inside out (starting from the innermost call).

Elixir solves this in two ways. Re-binding is one of them and the |> operator is the other:

# Rebinding...
def f(x) do
  x = foo(x)
  x = fab(x)
  x = bar(x)
  baz(x)
end

# Pipe...
def f(x) do
  x |> foo() |> fab() |> bar() |> baz()
end

Notice how |> avoids the re-binding altogether while still making visible the series of transformations the data goes through.
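As a runnable sketch of the two styles side by side, the module below fills in foo, fab, bar and baz with placeholder arithmetic. Those bodies are assumptions made only so the example executes; the original functions are not specified.

```elixir
defmodule Pipeline do
  # Hypothetical stand-ins for the transformations named above.
  def foo(x), do: x + 1
  def fab(x), do: x * 2
  def bar(x), do: x - 3
  def baz(x), do: Integer.to_string(x)

  # Re-binding: each step reuses the name `x`.
  def rebinding(x) do
    x = foo(x)
    x = fab(x)
    x = bar(x)
    baz(x)
  end

  # Pipe: no intermediate names at all.
  def piped(x) do
    x |> foo() |> fab() |> bar() |> baz()
  end
end

IO.inspect(Pipeline.rebinding(5) == Pipeline.piped(5))
```

In either version, adding a new step means adding one line or one pipe segment, with nothing downstream to rename.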

Another issue in Erlang is the accidental match: you try to assign to a variable that, by mistake, already exists. These are not frequent in Erlang, but they would be far more common in Elixir. To see an example, let's talk about how Elixir compiles to Erlang AST.

A couple of versions ago, a variable was represented in Elixir AST (aka a quoted expression) as:

{ :var_name, 1, nil }

Here 1 is the line number where the variable was defined. This translates to Erlang AST as:

{ :var, 1, :var_name }

At some point, we decided that it would be best if the second element of Elixir's tuple was a container for metadata and not only line numbers. So now we have:

{ :var_name, [line: 1], nil }
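You can look at this three-element shape yourself. The sketch below uses quote and Macro.var/2 from the standard library; the module name AstPeek is made up for illustration. The exact metadata and context you see depend on the Elixir version and on where the quote is written, so the example only relies on the overall shape.

```elixir
defmodule AstPeek do
  # Quoting a bare variable name yields {name, metadata, context}.
  def quoted_var do
    quote do: var_name
  end

  # Macro.var/2 builds the same kind of tuple by hand,
  # here with empty metadata and a nil context.
  def hand_built do
    Macro.var(:var_name, nil)
  end
end

{name, meta, context} = AstPeek.quoted_var()
IO.inspect({name, meta, context})
IO.inspect(AstPeek.hand_built())
```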

The code that translates this to Erlang AST would look like this:

translate({ VarName, MetaData, nil }) ->
  { var, ?line(MetaData), VarName }.

Notice we created a macro ?line(MetaData) to extract the line number from the MetaData simply because we need this convenience throughout almost all of Elixir's compiler code. This macro was first implemented as follows:

-define(line(Opts),
  case lists:keyfind(line, 1, Opts) of
    { line, Line } when is_integer(Line) -> Line;
    false -> 0
  end).

There is a huge issue with this macro: I can only use it once per function clause! If I happen to need it twice, the Line variable is bound the first time, and the second use of ?line attempts a match against it instead.

That's how I found out that Erlang macros (more like templates) are not hygienic. I have since then moved the implementation to a module that the macro calls directly.

Now, imagine this problem in Elixir, where macros are very common! Even though we have macro hygiene in place, not supporting re-binding would trigger many accidental matches. Approaches like gensym could help solve the problem, but they put the burden on the developer (and I am not sure how efficiently they could be implemented in the Erlang VM). I think most Lisps allow rebinding in LET (and in general you use fewer variables in Lisps) (citation required).
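For comparison, here is a rough Elixir counterpart of the ?line macro, written as a sketch rather than as the compiler's actual code. The module names (Meta, UsesMeta) and the use of Keyword.get/3 are assumptions for illustration. The macro binds a variable called line internally and is expanded twice in one function that also has its own line variable:

```elixir
defmodule Meta do
  # Binds `line` inside the expansion, much like the Erlang
  # macro bound Line.
  defmacro line(opts) do
    quote do
      line = Keyword.get(unquote(opts), :line, 0)
      line
    end
  end
end

defmodule UsesMeta do
  require Meta

  def demo(a, b) do
    line = 42                # the caller's own `line`
    first = Meta.line(a)     # first expansion binds the macro's `line`
    second = Meta.line(b)    # second expansion binds it again, no match
    {line, first, second}
  end
end

IO.inspect(UsesMeta.demo([line: 1], [line: 7]))
```

Hygiene keeps the macro's line apart from the caller's, and because `=` re-binds instead of matching, the second expansion cannot fail against the value left by the first.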

Notice the current approach employed by Elixir's compiler does not affect the immutability guarantees (as you can't change the binding of a function).

Sorry for the long answer. :)

@josevalim
Author

Since this may be a concern, one more clarification: while Elixir macros allow a macro to inject a variable into the user code, this behaviour is not encouraged and hygiene ensures by default a variable used in the macro is not going to conflict with a user variable.

@SeanTAllen

I've never considered gensym to be a burden on me as a developer. I'm interested in why you consider gensym to be a burden.

@josevalim
Author

I personally prefer to just define the variables and let the compiler figure it out for me. From this perspective, giving "hints" with gensym is an extra burden, even more so if we consider that the "hygiene" problem is not restricted to variables, but also applies to imports and aliases:

    defmacro sample do
      quote do
        if this?, do: that
      end
    end

Is the if above the if that the sample macro knows of (i.e. the if imported automatically from the Kernel module), or whatever if is available when the macro is expanded? In Elixir, it is always the former by default, unless the macro did not know of any if when the quote was generated. This is also part of the overall hygiene/resolution mechanism.
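The same resolution rule can be seen with aliases, which are easier to demonstrate than overriding if. In this sketch every module name (Shout, Fake.String, Caller) is hypothetical. The macro refers to String, and the caller aliases String to something else entirely:

```elixir
defmodule Shout do
  # `String` here is resolved where the quote is written.
  defmacro shout(s) do
    quote do
      String.upcase(unquote(s))
    end
  end
end

defmodule Fake.String do
  def upcase(_), do: :fake
end

defmodule Caller do
  require Shout
  alias Fake.String

  def run(s) do
    # The macro still reaches the real String module;
    # the direct call goes through the caller's alias.
    {Shout.shout(s), String.upcase(s)}
  end
end

IO.inspect(Caller.run("hi"))
```

The macro author did not have to mark String in any special way to get this behaviour, which is the "let the compiler figure it out" preference described above.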

There is also the question of whether we could efficiently implement gensym in the Erlang VM. In the text above I detail how a variable is represented by an atom in both the Elixir and Erlang compilers. Since atoms are not garbage collected, excessive use of gensym could exhaust the atom table (it is a concern; it may not be a problem after all... nonetheless I have updated the text to mention it).

However, I am aware that not everyone prefers hygiene and that this is an old discussion. :)
