Skip to content

Instantly share code, notes, and snippets.

@josevalim
Last active August 21, 2021 10:44
Show Gist options
  • Star 1 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save josevalim/da8f1630e5f515dc2b05aefdc5d01af7 to your computer and use it in GitHub Desktop.
Save josevalim/da8f1630e5f515dc2b05aefdc5d01af7 to your computer and use it in GitHub Desktop.
Ranges with steps

Ranges with steps

Update #1: Use x..y//z instead of x..y..z.

This is a proposal to address some of the limitations we have in Elixir ranges today. They are:

  • It is not possible to have ranges with custom steps
  • It is not possible to have empty ranges
  • Users may accidentally forget to check the range boundaries

The first limitation is clear: today our ranges are increasing (step of 1) or decreasing (step of -1), but we cannot set arbitrary steps as in most other languages with range. For example, we can't have a range from 1 to 9 by 2 (i.e. 1, 3, 5, 7, 9).

The second limitation is that, due to how we currently infer the direction of ranges, it is not possible to have empty ranges. Personally, I find this the biggest limitation of ranges. For example, take the function Macro.generate_arguments(n, context) in Elixir. This is often used by macro implementations, such as defdelegate, when it has to generate a list of n arguments. One might try to implement this function as follows:

def generate_arguments(n, context) do
  for i <- 1..n, do: Macro.var(:"arg#{n}", context)
end

However, because n may be zero, the above won't work: for n = 0, it will return a list with two elements! To workaround this issue, the current implementation works like this:

def generate_arguments(n, context) do
  tl(for i <- 0..n, do: Macro.var(:"arg#{n}", context))
end

In other words, we have to start the range from 0 and always discard the first element which is unclear and wasteful.

Finally, another issue that may arise with ranges is that implementations may forget to check the range boundaries. For example, imagine you were to implement range_to_list/1:

def range_to_list(x..y), do: range_to_list(x, y)
defp range_to_list(y, y), do: [y]
defp range_to_list(x, y), do: [x | range_to_list(x + 1, y)]

While the implementation above looks correct at first glance, it will loop forever if a decreasing range is given.

Solution

The proposed solution is to support steps in Elixir ranges by adding ..// as a ternary operator:

start..stop//step

Where //step is optional. The idea to use a separate operator for the step, instead of start..stop..step is to make a visual distinction between the parts. Also, you think the step of a to split the range, which then makes sense to use a division-like operator.

The ternary operator solves the three problems above:

It is not possible to have ranges with steps

Now you can write 1..9//2 (from 1 to 9 by 2).

It is not possible to have empty ranges

This can be addressed by explicitly passing the step to be 1, instead of letting Elixir infer it. The generate_arguments function may now be implemented as:

def generate_arguments(n, context) do
  for i <- 1..n//1, do: Macro.var(:"arg#{n}", context)
end

For n = 0, it will construct 1..0//1, an empty range.

Note 1..0//1 is distinct from 1..0: the latter is equal to 1..0//-1, a decreasing range of two elements: 1 and 0. To avoid confusion, we plan to deprecate inferred decreasing ranges in the future.

Users may accidentally forget to check the range boundaries

If we introduce ranges with step and the ternary operator, we can forbid users to write x..y in patterns. Doing so will emit a warning and request them to write x..y//z instead, forcing them to explicitly consider the step, even if they match on the step to be 1. In my opinion, this is the biggest reason to add the ternary operator: to provide a convenient and correct way for users to match on ranges with steps.

Implementation

The implementation happens in three steps:

  1. Add ..// as a ternary operator. x..y//z will have the AST of {:..//, meta, [x, y, z]}, consequently :..// will be a valid atom and both &..///3 and &Kernel."..//"/3 will be valid captures

  2. Add the :step to range structs and implement def x..y//z do in Kernel

  3. Update the formatter to handle x..y//z accordingly

  4. Add deprecations. To follow Elixir's deprecation policy, the deprecation warnings shall only be emitted 4 Elixir versions after ranges with steps are added (most likely on v1.16):

    • Deprecate x..y as a shortcut for a decreasing range in favor of x..y//-1. The reason for this deprecation is because a non-empty range is more common than a decreasing range, so we want to optimize for that. Furthermore, having a step with a default of 1 is clearer than having a step that varies based on the arguments. Of course, we can only effectively change the defaults on Elixir v2.0, which is still not scheduled or planned.

    • Deprecate x..y in patterns, require x..y//z instead. This will become an error on Elixir v2.0.

    • Deprecate x..y in guards unless the arguments are literals (i.e. 1..3 is fine, but not 1..y or x..1 or x..y). This is necessary because x..y may be a decreasing range and there is no way we can warn about said cases in guards, so we need to restrict at the syntax level. For non-literals, you should either remove the range or use an explicit step. On Elixir v2.0, x..y in guards will always mean a range with step of 1.

@josevalim
Copy link
Author

Floats ranges. Since the step is now explicit it removes a source of ambiguity that I suspect was one reason they weren't previsouly supported.

Unfortunately the issue with floats is the precision. For example, if I do 0..2000//2.314, as we add the steps, the precision issues will accumulate. You could say this is fine... but then it becomes cumbersome for things like 231.4 in 0..2000 which should return true, but will never actually be in the result of Enum.to_list(0..2000//2.314) due to precision.

Arbitrary struct for for the range and the step (with range.first and range.last required to be the same struct

Perhaps. The question is which protocol we would require those data structures to implement. Some things, like Range.disjoint?, may be be far from trivial to make generic, unless we require all of the structs to "quack" like integers. So basically we would unpack the integers to iterate and pack them back as we iterate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment