Elm 0.19 removes the syntax for user-defined operators, so it is no longer possible to define things like `@@@|>` or `<#>`.
This document presents the reasoning behind this decision.
It is extremely difficult to come up with good operators. I think all of the successful ones have two things in common:
- They have some visual relation to their meaning. For example, `</>` and `<?>` from the URL parsing library exactly mimic the `/` and `?` symbols in URLs. And `|>` indicates directionality really clearly. The math operators like `+` and `/` are directly related to the math operators taught in every school in the world.
- They use the visually simple symbols. I have never seen `#` or `%` or `@` or `$` used to create an excellent operator. I think it is partly that they are so busy visually, mixing curves and lines haphazardly. It is also partly that they do not have super strong cultural meanings as infix operators in general. E.g. `$4` means four dollars, but what is `3$4`?
Between these two things, there are not actually very many viable possibilities, and all of them are taken. Visual bracketing like `</>` or `|+|` expands things a bit, allowing reuse of symbols that already have cultural meaning, but at that point, you are basically constrained to making operators for vector math or something.
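To make the "visual relation" point concrete, here is a minimal sketch of a route parser, assuming the URL package as published for Elm 0.19 under the name `elm/url`. The `Route` type, the module name, and the URL shapes are made up for illustration; only the `Url.Parser` functions come from the package.

```elm
module Route exposing (Route(..), fromUrl)

import Url exposing (Url)
import Url.Parser exposing ((</>), (<?>), Parser, int, map, oneOf, parse, s, string)
import Url.Parser.Query as Query


-- Hypothetical routes, invented for this example.
type Route
    = Topic String
    | Comment String Int
    | Search (Maybe String)


routeParser : Parser (Route -> a) a
routeParser =
    oneOf
        [ -- matches paths shaped like /topic/pottery
          map Topic (s "topic" </> string)

        -- matches paths shaped like /user/tom/comment/42
        , map Comment (s "user" </> string </> s "comment" </> int)

        -- matches /search with an optional ?q=... query parameter
        , map Search (s "search" <?> Query.string "q")
        ]


fromUrl : Url -> Maybe Route
fromUrl url =
    parse routeParser url
```

Each `</>` sits where a `/` would appear in the path, and `<?>` sits where the `?` would introduce the query string, which is the visual relation described above.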
Elm has had this feature for a while, so I studied how it has been used so far in packages.
As of this writing, there are 1031 packages published for Elm 0.18 and 66 of them have ever defined operators. Core developers who agree with this choice account for about 15 of those, so there are 51 unaccounted for. That means that under 5% of packages are affected.
I want to highlight some of the usage trends with that 5%:
- Haskell Operators - Some authors really like the following operators: `>>=`, `<$>`, `<*>`, `<*`, `*>`, `>=>`, etc. This seems to be one of the more popular uses.
- Invented Operators - A fairly small fraction of authors invent very elaborate operators. For example, one package contains `|-~->`, `|-~>`, `|=~->`, `|=~>`, `|~->`, `|~>`, and many others. Another has `@@@|>`, `?|>`, `!+>`, and many others. This is pretty uncommon.
- Math Operators - Some packages are for math. Vector math. Matrix math. They generally use operators like `|+|` or `|*|` that match the cultural norms. This is not as common as I would have hoped actually! More math!
- Parser Operators - Parsers in the ML-family of languages can be really lovely. All of the parsing packages I have seen in Elm have special syntax of some sort.
I may be missing some scenarios, but those were the ones that stood out to me.
Each of these categories stems from a reasonable design goal. I will try to outline the design goal, and then point a path towards achieving that goal in a nicer way:
1. "I want it to be easier to chain tasks." Haskell has this special syntax called `do`-notation for sequencing tasks. It is pretty neat once you know it, but it is also a major barrier to entry. I struggled with it for about six months at least. Haskell also has a set of operators like `>>=` that is connected to this special syntax, so I suspect folks settle for having that as a personal compromise. Now, I think other languages have actually accomplished the root goal of "chaining tasks is easy" in nicer ways. For example, F# has computation expressions that are a bit more flexible and do not require integration with a type class mechanism. And C# introduced `async`/`await` syntax that gives the same capabilities, but integrates with the language way more cleanly. And Idris generalized that with bang-notation. So I feel that `andThen` vs `>>=` is missing the bigger picture here.
2. "I want to write fewer characters." I think this explains the "invented operators" case, but the line between "concise" and "cryptic" is a matter of taste. Is APL concise or cryptic? Is Ruby concise or cryptic? Is Haskell concise or cryptic? Depends who you ask! In Elm, having explicit and readable code is a major design goal. `elm-format` helps with that. Using qualified values like `List.map` and `Set.map` helps with that. So even if you have the `|-~->` operator, Elm is not really designed for minimizing character count and will clash with your root goals in other ways.
3. "I want to do math!" I really like this goal. I think languages like Julia have done an excellent job at overloading `+` and `*` in a reasonable way. Their approach is really lovely, but we would have to lose Elm's type system to match them. Point being, rather than making `|+|` and `|*|` as a stopgap, perhaps it is possible to think about the broader question in a comprehensive way. Should there be a way to overload `+` for vector and matrix math? How would that work? How would you multiply a vector by a scalar? Perhaps the best design is to restore user-defined operators for bracketed math operations like `|+|` and `|-|` with certain types? Or maybe it can just be done in a really nice way with a library. Worth exploring!
4. "I want to parse!" Writing parsers can be tricky, and operators are one way to help make things easier. Most parsing packages replicate the Haskell operators, and all of the logic in case (1) applies. Separate from that, it seems like `</>` and `<?>` have been quite successful in `elm-lang/url`, and it seems that `|.` and `|=` have been quite successful in `elm-lang/parser`. These operators are getting special cased like `+` and `-`. What is the deal with that?
Well, many languages special case parsers (e.g. regex in Perl, JS, and Ruby) with very specific costs and benefits. The cost is that if regex cannot handle your scenario, it is very annoying. The benefit is that there is specific knowledge that transfers between different codebases and languages. I think Haskell is the best example of a language that does not special case parsers, which also comes with specific costs and benefits. The cost is that everyone has to pick between `parsec` for okay error messages and `attoparsec` for better perf. Any project I ever created ended up using both `parsec` and `attoparsec` through transitive dependencies. (That was frustrating when I was trying to get `elm` binaries smaller, but it is a much bigger deal for JS bundles where size is super important!) Even though the APIs for `parsec` and `attoparsec` are pretty much identical at a high level, code does not transfer between them because the details are slightly different. On the other hand, the benefit is that you can get a parser tailored for your exact performance or error message needs, and if there is some new insight, someone can make a new parser library around it.
The design of `elm-lang/parser` uses the same performance insights from `attoparsec` and improves upon the error message quality of `parsec`. It appears to be possible to have both under one API. I also think it makes the most sense for the ecosystem to have one option that distills the best known path. Exploration can still happen (I do my exploration in Haskell because it is great for that) and insights can be brought back without fragmenting the Elm ecosystem.
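For readers who have not seen the `|.` and `|=` operators mentioned above, here is a small sketch in the style of the parser package's own introductory example, assuming the package as published for Elm 0.19 under the name `elm/parser`. The module name and the `parsePoint` helper are made up for illustration.

```elm
module PointParser exposing (Point, parsePoint)

import Parser exposing ((|.), (|=), Parser, float, spaces, succeed, symbol)


type alias Point =
    { x : Float
    , y : Float
    }


-- Intended for input shaped like "( 3, 4 )".
-- Lines starting with |= keep the parsed value and pass it to Point.
-- Lines starting with |. parse something and then ignore the result.
point : Parser Point
point =
    succeed Point
        |. symbol "("
        |. spaces
        |= float
        |. spaces
        |. symbol ","
        |. spaces
        |= float
        |. spaces
        |. symbol ")"


-- Hypothetical helper: run the parser on a string.
parsePoint : String -> Result (List Parser.DeadEnd) Point
parsePoint input =
    Parser.run point input
```

Only two operators are involved, and the difference between them ("keep" versus "ignore") is the whole vocabulary a reader needs to follow the pipeline.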
The broader message here is that we have some fairly specific design problems, and user-defined operators are often stopgap measures. I think it is important to think about languages on a timescale of decades, and by looking at each case directly, I think we can end up with something nicer in the long run.
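As a rough illustration of the "chain tasks" and "do math" cases, here is a sketch that uses only named functions and the built-in `|>`. It assumes the `elm/core` and `elm/time` packages for Elm 0.19. The module name, the `Vec2` type, and the functions `currentHour`, `add`, `scale`, and `midpoint` are hypothetical, and this is one possible library style rather than a settled design.

```elm
module Sketch exposing (Vec2, add, currentHour, midpoint, scale)

import Task exposing (Task)
import Time


-- Chaining tasks: first get the time zone, then the current time,
-- then combine the two. Task.andThen plays the role of >>= here.
currentHour : Task x Int
currentHour =
    Time.here
        |> Task.andThen
            (\zone ->
                Time.now
                    |> Task.map (\posix -> Time.toHour zone posix)
            )


-- Vector math with named functions instead of |+| and |*|.
type alias Vec2 =
    { x : Float
    , y : Float
    }


add : Vec2 -> Vec2 -> Vec2
add a b =
    { x = a.x + b.x, y = a.y + b.y }


scale : Float -> Vec2 -> Vec2
scale factor v =
    { x = factor * v.x, y = factor * v.y }


-- Add the two vectors, then halve the result.
midpoint : Vec2 -> Vec2 -> Vec2
midpoint a b =
    add a b
        |> scale 0.5
```

Neither half answers the bigger questions raised above (dedicated syntax for chaining, overloading `+`), but both stay readable without any custom operator.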
I know some folks also define operators in their application code. Some people really love custom operators. Some people really hate them. We have found with `elm-format` that just making a choice is an effective way to help teams minimize these debates and focus more on the application.
This case is a bit borderline for me though, especially if you are working on your own. One thing I learned from discovering The Elm Architecture is that it is really lovely to be able to show up in any codebase and know what is going on. I think custom operators detract from that enough that they are not worth it for the whole ecosystem, even if they are great for specific individuals.
But why was this feature added in the first place? As far as I can remember, this is how I implemented `+` and `-` while I was working on my thesis. This was a naturally exploratory time, and features did not undergo as much scrutiny as they do now.
When assessing features from that time, I ask myself, "If someone proposed adding this feature today, would it get in?" I cannot see user-defined operators getting in. All of the considerations in this document seem to point to there being more specific problems that would benefit from more specific designs.
I know not everyone agrees with this choice, but I hope this document clarifies some of the thinking behind it.
First, I love how you extensively help new users, keeping the language pure at the same time. I support this because I am a new user and because I am interested in learning how to solve problems functionally. Haskell is not easy enough to use for me – especially the API docs are not helpful. I love how you tell people to provide concrete practical code examples. How you ask them to use one word for one concept. I love that after the installation there is just an elm.exe, an uninstaller, an icon and two scripts for adding/removing it to the PATH. I love that `elm init` brings me to a site where everything is explained ('this file does this, that file does that, put your files here'), especially all the (political but very true) statements after 'How do I structure my directories?'. I hate setting up things as well as premature modularisation. It is a big plus to keep things simple for the beginners. The only solution to complexity is simplicity (Well, and grouping dependent things together in folders [1] :-) ).

However, regarding this issue (I want to have shortcuts and infix notation for frequently used functions, but I also want to make it easy for new users to read the code), the following looks like a better solution to me:
How to define operators
Illegal syntax (legal in 0.18), just the operator but no readable function name provided:
Legal syntax, the readable function name must come first:
Illegal syntax, the readable function name must be used during the definition. It helps new readers to understand the code:
Usage
This code ...
... formatted with `elm format <file>` just keeps the core operators:

With `elm format <file> --use-operators` it will use all available operators:

Usage on websites
On websites like Elm packages, in the top right of code examples: `[x] use operators` (when disabled (the default), it formats the source without `--use-operators`). Alternatively, show the code formatted without `--use-operators` when hovering over an operator.

Python has it
Python (created by Guido van Usability) has operator overloading in the language. One may not like Python (I do) but one has to appreciate that the Pythonists (or at least their VIPs) try to keep the language simple to use and easy to learn. So if operator overloading is allowed in Python it must have a point. The use case given by Valentin Shirokov above is one of those points. I cannot think of a more elegant way to do it, except using macros. Also, despite the fact that this feature is possible, it is not widely used in Python libraries. No one complains about Python being difficult to read because of operator overloading; at best they complain because it could happen, but it doesn't. If you want to restrict usage of cryptic operators – very understandable, see below, this in my opinion includes some operators you use in the core language – just allow a restricted set of operators to be created/overloaded, like those seen in Python plus my exceptions from below, plus `.` (function composition), plus (to make Valentin happy) `?`, whatever this operator means.

Personal, language independent observations about operators
Don't claim that operators like `|>` `<|` `</>` `<?>` are readable. I won't believe you. To me no operator having more than one character, which has one of `<>?|^´§%*~@` in it, is readable. Exceptions are `==>` (from this follows, evaluates to), `>>` or `<<` (shove into), `>>>` (print), `<<<` (user input), `<type name>`, `::` (has type) and the usual boolean, comparison and assignment operators. I won't even include `->` or `<-` (returns thing) in this list of exceptions, it is too generic. It just points to the direction but contains no other information (Edit: actually they fit well into Haskell style type descriptions). Further, please, everybody, always, everywhere, avoid the usage of the following operators: `$` (except for currency), `?` (except maybe for not implemented), `~` (except maybe for cast? I don't even like it in boolean logic). Further, avoid the usage of `%` in string formatting, I hate it. Use `{}` instead. Further, I don't like `|`s at line start and `\` or `/` at line end. This should, if possible, be done with indentation and with line break and indent consuming separators and keywords like `,`, `and`, `or`. Further, brackets are underrated. They are the ultimate tool for documenting a context. I love them and I put them on single lines (f you, schemers :-) ). I would never use a `$` when I can use brackets. `bar $ foo $ a` instead of `bar(foo(a))` is not more readable. `a >> foo >> bar` however is.

[1] use version 4.9 of that editor, it is the last which supports indentation sensitive languages.