blatyo/proposal.md

## proposal.md

      
TL;DR Add field punning to maps in Elixir

Background

What is Field Punning?
Field Punning in Elixir
Package Implementations and Shortcomings


Proposal

Creating Maps
Creating Structs
Updating Maps
Updating Structs
Pattern Matching Maps
Pattern Matching Structs


Objections and Responses

Background

What is Field Punning?

Field punning allows a programmer to use a variable name as the key in a map literal or pattern. For example,
in JavaScript:
const {foo, bar} = require('common-names');
is equivalent to:
const {foo: foo, bar: bar} = require('common-names');
Here, the variable name foo is used as the key, and the value stored under that key is assigned to the variable.
The purpose of field punning is to reduce redundant information when reading and writing code.
It has the added benefit of reducing code verbosity. In Elixir where pattern matching is common, this
would lead to complex pattern matches being simpler to read. This is especially true in function heads
which can get long and are less readable when split over multiple lines.
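To make the redundancy concrete, here is a small sketch in current Elixir of a function head that matches several keys. The module name and the field names are hypothetical and chosen only for illustration. Every key has to be written twice: once as the key and once as the variable it is bound to.

```elixir
defmodule Greeter do
  # Hypothetical example: each key is repeated as the variable it binds.
  def greet(%{first_name: first_name, last_name: last_name, title: title}) do
    "Hello, #{title} #{first_name} #{last_name}"
  end
end

IO.puts(Greeter.greet(%{first_name: "Ada", last_name: "Lovelace", title: "Countess"}))
```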
Field Punning in Elixir

Field punning in Elixir is a very common request (1, 2, 3, 4, 5). I created a spreadsheet to help me summarize the differing viewpoints. In my review of previous conversations, more people supported adding this feature to the language than opposed it. Even people who opposed the feature agreed that much of their own code could take advantage of it. Those who opposed it expressed the following concerns:

1. It can be implemented as a package.
2. Atom and string key confusion.
3. Proposed syntax similar to tuples.
4. The variable name becomes meaningful.
5. It is less explicit. Seems like magic.
6. Create and update of maps is cryptic with field punning.

I will address concern 1 next. I will attempt to address concerns 2 and 3 in the proposal by using different syntax than what has previously been suggested. The rest I will address last.
Package Implementations and Shortcomings


short_maps
shorter_maps
shorthand
synex

short_maps and shorter_maps both use sigils to express a pattern match of a map with field punning. The problem with the sigil approach is that the macro is passed a string. For example, ~m(foo bar) would pass "foo bar" as a string to the ~m macro. That means any feature supported by the sigil has to be manually parsed and then converted into the desired code. So:
~m(%Foo bar ^baz)a = some_map
would require parsing %Foo, bar, ^, and baz and turning that into
%Foo{bar: bar, baz: ^baz} = some_map
short_maps does all that, but stops short of implementing nesting and allowing non-punned fields. shorter_maps takes things a bit further and implements more features such as nesting, non-punned fields, and map updates. The result is two libraries that reimplement a subset of the Elixir syntax, but can't use the AST. This makes the implementation difficult and full of pitfalls and caveats.
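The sigil limitation can be seen in a minimal sketch. The following is not the code of short_maps or shorter_maps; SigilSketch and SigilSketchDemo are hypothetical modules, and the sigil only handles space-separated variable names with atom keys. The point is that the macro receives plain text and has to split it and build the map AST by hand.

```elixir
defmodule SigilSketch do
  # Hypothetical ~m sigil. The first argument is the raw text between the
  # delimiters, so the names have to be split and converted manually.
  defmacro sigil_m({:<<>>, _meta, [text]}, _modifiers) when is_binary(text) do
    pairs =
      for name <- String.split(text) do
        key = String.to_atom(name)
        # A nil context makes the variable refer to the caller's scope.
        {key, Macro.var(key, nil)}
      end

    # Build the AST of %{foo: foo, bar: bar}.
    {:%{}, [], pairs}
  end
end

defmodule SigilSketchDemo do
  import SigilSketch

  # Used as an expression, the sigil builds a map from the variables.
  def build do
    foo = 1
    bar = 2
    ~m(foo bar)
  end

  # Used as a pattern, the sigil binds the variables.
  def match(map) do
    ~m(foo bar) = map
    {foo, bar}
  end
end
```

Supporting struct names, pins, nesting, or string keys would mean extending the hand-written parsing step for each feature.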
shorthand and synex do not use sigils and can take advantage of the Elixir AST. The caveat with shorthand is that in order to provide syntax like m(foo, bar) it has to generate the m/1, m/2, m/3, m/N versions of the macros because each variable passed is a separate argument in Elixir. So, if you wanted to create a map of N+1 elements, you would get an error. The workaround in the library is to wrap the elements in a list. So, you would have something like:
m([foo, bar]) = my_map
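For comparison, a macro that takes a list receives AST nodes instead of text. The sketch below is an assumption about how such a macro could look, not the shorthand library's implementation; ListMacroSketch and ListMacroDemo are hypothetical names. Because the list is a single argument, one macro clause covers any number of variables.

```elixir
defmodule ListMacroSketch do
  # Hypothetical m/1. The caller's variable nodes are reused as-is, so no
  # string parsing is needed.
  defmacro m(vars) when is_list(vars) do
    pairs =
      Enum.map(vars, fn
        {name, _meta, context} = var when is_atom(name) and is_atom(context) ->
          {name, var}

        other ->
          raise ArgumentError, "expected a variable, got: #{Macro.to_string(other)}"
      end)

    # Build the AST of %{foo: foo, bar: bar}.
    {:%{}, [], pairs}
  end
end

defmodule ListMacroDemo do
  import ListMacroSketch

  def sum(my_map) do
    m([foo, bar]) = my_map
    foo + bar
  end
end
```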
synex has the same constraints as shorthand. One benefit is that it is able to utilize more of the standard Elixir syntax, by requiring things to be wrapped in a map. For example:
map = keys(%{map | a, b, c: 100, d: 200})
keys(%{^a, ^c}) = %{a: 1, b: 2, c: 3}

All the libraries, in my opinion, suffer from not looking like a map. This makes it harder to understand what is going on. Sigils look slightly more like the syntax you might expect, but have to reimplement parts of Elixir's parser. Macros are able to take advantage of the AST, but make pattern matching much harder to understand, because the pattern looks like a function call and it is not clear what the macro does. All libraries also have the disadvantage that you must import them everywhere you wish to use this feature. Additionally, I believe (I haven't tested this yet) that ExUnit can't display useful error messages when any of these libraries is used in an assertion.
Proposal

I propose we add two modifiers to the map syntax, a and s, for atom and string keys respectively. The presence of the modifier specifies the type of key used for field puns.
x = 1
y = 2

%{x, y}a #=> %{x: 1, y: 2}
%{x, y}s #=> %{"x" => 1, "y" => 2}
When no modifier is specified, a is assumed. However, using punning without a modifier will produce a warning suggesting a modifier be specified. The warning would mention both modifiers in case string keys were intended. This would help people who just try to use field punning to discover the correct usage in Elixir. The code formatter would fix this warning for the user by adding a when no modifier is specified. Atom keys are chosen as the default because they are the dominant key type in Elixir.
When field punning is not being used, but a modifier is specified, Elixir will emit a warning saying that the modifier is not necessary. The code formatter will remove the modifier.
Field puns can be used together with normal key value pairs. However, field puns must come first. This mirrors the existing rule that keyword pairs must come last in lists and function arguments.
map = %{x, y: 2}a #=> %{x: 1, y: 2}

# This error could likely be improved to be more helpful.
map = %{y: 2, x}a #=> ** (SyntaxError) syntax error before: x
This change will require the Elixir syntax to be changed to parse the modifiers, and the %/2 and %{}/1 kernel special forms to be updated to support field punning and the modifiers. Additionally, the code formatter would be updated to handle the new syntax.
The following sections go into more detail about specific map constructs in Elixir and how they will be adapted to support field punning.
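As a way to see where the parser change is needed, the sketch below asks the current parser whether it accepts a few sources. ParserProbe is a hypothetical helper; Code.string_to_quoted/1 is the standard function that returns {:ok, ast} or {:error, reason}. I would expect today's Elixir to accept the first source and reject the other two, but the results may vary between versions, so the block prints them rather than asserting them.

```elixir
defmodule ParserProbe do
  # Returns true when the current parser accepts the source, false otherwise.
  def parses?(source) do
    match?({:ok, _ast}, Code.string_to_quoted(source))
  end
end

# A regular map literal.
IO.inspect(ParserProbe.parses?("%{x: x, y: y}"))

# A trailing modifier, which the proposal would add to the grammar.
IO.inspect(ParserProbe.parses?("%{x, y}a"))

# A bare variable after a keyword pair, the ordering the proposal keeps invalid.
IO.inspect(ParserProbe.parses?("%{y: 2, x}"))
```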
Creating Maps

Existing map declarations will work the same with no warnings or modifications required.
x = 1
y = 2
# No punning
%{x: x, y: y} #=> %{x: 1, y: 2}
In order to declare a map with atom keys using punning, you should add the a modifier. If you do not, the code will still work, but will produce a warning suggesting that a modifier be added.
# Punning with atom keys
%{x, y}a #=> %{x: 1, y: 2}

# Punning with atom keys (default)
%{x, y, z: 3} #=> Warning: field punning requires ...
              #=> %{x: 1, y: 2, z: 3}
The a modifier does not prevent you from specifying key value pairs with string keys.
%{x, y, "z" => 3}a #=> %{x: 1, y: 2, "z" => 3}
In order to declare a map with string keys, the s modifier must be specified. Otherwise, the default behavior described above occurs.
# Punning with string keys
%{x, y, "z" => 3}s #=> %{"x" => 1, "y" => 2, "z" => 3}

The s modifier does not prevent you from specifying key value pairs with atom keys.
# Punning with string keys
%{x, y, z: 3}s #=> %{"x" => 1, "y" => 2, z: 3}

Maps with punning can be nested. The modifier on a map only applies to that level of nesting. For example, using the a modifier on the top level map does not make it apply to nested maps. Each nested map that uses punning must specify its own modifier to avoid the missing modifier warning. Likewise, there is no requirement that nested maps use the same modifier as their parent.
%{x, nested: %{y}a}a #=> %{x: 1, nested: %{y: 2}}

%{x, nested: %{y}}a #=> Warning about nested map not having modifier
                    #=> %{x: 1, nested: %{y: 2}}

%{x, nested: %{y}s}a #=> %{x: 1, nested: %{"y" => 2}}
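For reference, this is how the punned literals in this section would be spelled out in current Elixir, assuming the proposed semantics described above. CreateSketch is a hypothetical module used only to keep the example self-contained.

```elixir
defmodule CreateSketch do
  # What %{x, y}a is proposed to mean.
  def atom_keys(x, y), do: %{x: x, y: y}

  # What %{x, y}s is proposed to mean.
  def string_keys(x, y), do: %{"x" => x, "y" => y}

  # What %{x, y, z: 3}s is proposed to mean: punned keys are strings,
  # explicit pairs keep the key type they were written with.
  def mixed(x, y), do: %{"x" => x, "y" => y, z: 3}

  # What %{x, nested: %{y}s}a is proposed to mean: each level uses its
  # own modifier.
  def nested(x, y), do: %{x: x, nested: %{"y" => y}}
end
```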
Creating Structs

Existing struct declarations will work the same with no warnings or modifications required. The a modifier will be the default with structs and Elixir will not emit a warning when punning is used in structs. If the a modifier is specified, Elixir will emit a warning saying the a modifier is not necessary. The code formatter will remove the a modifier if it is present. This is intended to help people who convert a map to a struct. The s modifier will not be valid for structs, since structs can only have atom keys.
%Point{x, y} #=> %Point{x: 1, y: 2}

%Point{x, y}a #=> Warning: ...
              #=> %Point{x: 1, y: 2}

%Point{x, y}s #=> ** (ArgumentError) ...
The properties described for nested maps apply to structs as well, and either may be nested inside the other.
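The struct examples assume a Point struct that the proposal does not define. A minimal sketch with an assumed definition shows what the punned form would stand for in current Elixir, and why only atom keys make sense for structs. StructSketch is a hypothetical module.

```elixir
defmodule Point do
  # Assumed definition; the proposal uses Point without showing it.
  defstruct [:x, :y]
end

defmodule StructSketch do
  # What %Point{x, y} is proposed to mean.
  def build(x, y), do: %Point{x: x, y: y}

  # Struct fields are always atoms, so there is nothing for an s modifier
  # to pun on.
  def field_names, do: %Point{} |> Map.from_struct() |> Map.keys()
end
```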
Updating Maps

Maps can be updated by using the map update syntax %{map | key: value}. Field punning will also be possible in map updates. When field puns are used, you should specify a modifier as described above and the field puns must precede any key value pairs.
point = %{x: 1, y: 2}
point2 = %{"x" => 1, "y" => 2}
x = 3

%{point | x} #=> Warning: ...
             #=> %{x: 3, y: 2}

%{point | x}a #=> %{x: 3, y: 2}
%{point2 | x}s #=> %{"x" => 3, "y" => 2}

%{point | x, y: 4}a #=> %{x: 3, y: 4}
%{point2 | x, "y" => 4}s #=> %{"x" => 3, "y" => 4}

%{point | y: 4, x}a #=> ** (SyntaxError) syntax error before: x
%{point2 | "y" => 4, x}s #=> ** (SyntaxError) syntax error before: x
If the key is not already present in the map, it will fail with the same error message that exists today.
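The existing behavior that punned updates would inherit can be sketched as follows. UpdateSketch is a hypothetical module; the update syntax and KeyError are current Elixir.

```elixir
defmodule UpdateSketch do
  # What %{point | x}a is proposed to mean.
  def put_x(point, x), do: %{point | x: x}

  # What %{point2 | x}s is proposed to mean.
  def put_x_string(point, x), do: %{point | "x" => x}

  # The update syntax raises KeyError when the key is absent; a punned
  # update would fail the same way.
  def try_put_x(point, x) do
    {:ok, put_x(point, x)}
  rescue
    error in KeyError -> {:error, error.key}
  end
end
```

Calling UpdateSketch.try_put_x/2 with a map that only has string keys takes the rescue branch, because the atom key :x is not present.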
Updating Structs

When using the map update syntax for structs, the rules for maps also apply. Because structs can only have atom keys, you should use the a modifier. Map update will be unaware that it is updating a struct, so it will not error on an invalid s modifier being present. Instead, the update will fail with a KeyError because the string key is not present in the struct.
point = %Point{x: 1, y: 2}
x = 3

%{point | x} #=> Warning: ...
             #=> %Point{x: 3, y: 2}

%{point | x}a #=> %Point{x: 3, y: 2}
%{point | x}s #=> ** (KeyError) key "x" not found in: ...

%{point | x, y: 4}a #=> %Point{x: 3, y: 4}

%{point | y: 4, x}a #=> ** (SyntaxError) syntax error before: x
%{point | "y" => 4, x}s #=> ** (SyntaxError) syntax error before: x
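The KeyError for string keys is what the map update syntax already does with structs today. The sketch below assumes a Point definition, which the proposal does not show, and uses a hypothetical StructUpdateSketch module.

```elixir
defmodule Point do
  # Assumed definition; the proposal uses Point without showing it.
  defstruct [:x, :y]
end

defmodule StructUpdateSketch do
  # What %{point | x}a is proposed to mean when point is a struct.
  def put_x(point, x), do: %{point | x: x}

  # What %{point | x}s is proposed to mean. The map update does not know
  # about the struct, so it fails only because "x" is not an existing key.
  def put_x_string(point, x) do
    {:ok, %{point | "x" => x}}
  rescue
    error in KeyError -> {:error, error.key}
  end
end
```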
When using the struct update syntax on a struct, no modifier needs to be present. The a modifier will issue a warning that the a modifier is unnecessary and will be removed by the code formatter. The s modifier will produce an error.
point = %Point{x: 1, y: 2}
x = 3

%Point{point | x} #=> %Point{x: 3, y: 2}

%Point{point | x}a #=> Warning: ...
                   #=> %Point{x: 3, y: 2}
%Point{point | x}s #=> ** (ArgumentError) ...

%Point{point | x, y: 4} #=> %Point{x: 3, y: 4}

%Point{point | y: 4, x}a #=> ** (SyntaxError) syntax error before: x
%Point{point | "y" => 4, x}s #=> ** (SyntaxError) syntax error before: x
Pattern Matching Maps

In order to support field puns, the a and s map modifiers may also be used anywhere a pattern match may occur (left of =, function heads, case, etc.). In addition, the pin operator (^) will be allowed with field puns. When a field pun is pinned, the variable name supplies the key and the variable's current value is matched against the value stored under that key. Pinning a variable so that its value is used as the key (as in %{^key => value}) cannot be expressed with a field pun and still requires the key value syntax.
x = 1
point = %{x: 1, y: 2}

%{^x, y} = point #=> Warning: ...
                 #=> x is matched and y is bound to 2

%{^x, y}a = point #=> x is matched and y is bound to 2
When a pinned field pun variable is not present in the scope, it will raise an error.
%{^z}a = %{z: 1} #=> unknown variable ^z. No variable "z" has been defined before the current pattern
When a match error occurs with a pinned field pun, it will raise the same error used today for key value pairs.
z = 2
%{^z}a = %{z: 1} #=> ** (MatchError) no match of right hand side value: %{z: 1}

%{^z} = %{z: 1} #=> Warning: ...
                #=> ** (MatchError) no match of right hand side value: %{z: 1}
When the key type in the pattern does not match the key type in the map, the match fails with the same MatchError that is raised today for key value pairs.
z = 1
%{^z}s = %{z: 1} #=> ** (MatchError) no match of right hand side value: %{z: 1}
Ideally in the future Elixir could add a more helpful error message about string and atom key mismatches for both field puns and key value pairs.
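The pinned puns in this section correspond to the following patterns in current Elixir, assuming the semantics described above. MatchSketch and its function names are hypothetical. The last function shows the case that has no punned form: pinning the key itself.

```elixir
defmodule MatchSketch do
  # What `%{^x, y}a = point` is proposed to mean: key :x with the value
  # pinned to x, and y bound from key :y.
  def y_when_x_matches(point, x) do
    case point do
      %{x: ^x, y: y} -> {:ok, y}
      _ -> :no_match
    end
  end

  # What `%{^z}s = map` is proposed to mean: key "z" with the value pinned
  # to z. A map with the atom key :z does not match.
  def z_matches_string_key?(map, z), do: match?(%{"z" => ^z}, map)

  # Pinning the key itself keeps the key value syntax.
  def fetch(map, key) do
    %{^key => value} = map
    value
  end
end
```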
Pattern Matching Structs

Structs will behave similarly to maps, but the a modifier will not be necessary.
x = 1
point = %Point{x: 1, y: 2}

%Point{^x, y} = point #=> x is matched and y is bound to 2

%Point{^x, y}a = point #=> Warning: ...
                       #=> x is matched and y is bound to 2

# Assuming y has not been bound yet (for example, in a fresh scope)
%Point{^y} = point #=> unknown variable ^y. No variable "y" has been defined before the current pattern

x = 2
%Point{^x} = point #=> ** (MatchError) no match of right hand side value: %Point{x: 1, y: 2}
Objections and Responses

TODO