As general concept, iteration can be defined as the process of doing some repeated action over something in order to generate a sequence of outcomes. At a more detailed and programmatic level, iteration is the process of visiting all the elements of an iterable object using another object known as an iterator.
Simply put, an iterable is anything that can be iterated over, meaning we can
visit each of its elements one by one in a sequential manner. For example,
lists, arrays, hashes, etc. in Raku are iterables. In general, a Raku type can
be thought as iterable if it composes the
Iterable
role and implements the
iterator
method
which is supposed to return an Iterator
object.
For instance, arrays are iterable data structures because the
Array class inherits from the
List class which in turn composes the
Iterable
role. If you'd like to know if a given object implements a given
role, you'd need to ask if the object conforms to that role with the does
method. Thus, for the Iterable
role:
my @nums = 1, 2, 3;
@nums.^name; #=> Array
@nums.does(Iterable) #=> True
In fact, we don't need to instantiate the type object (e.g., List
, Array
,
Int
, etc.) to check if it composes the role, we can ask it about it directly:
List.does(Iterable); #=> True
Array.does(Iterable); #=> True
Int.does(Iterable); #=> False
Str.does(Iterable); #=> False
As you can see, unlike in other programming languages, Raku Str
ings
cannot be iterated over (e.g., with a for
loop); the Str
type doesn't
compose the Iterable
role. However, they can be iterated over by using
the comb
routine which searches the
input string for matches and returns a Seq of
non-overlapping matches limited to a number of matches. For instance:
say "foobar".comb.raku; # OUTPUT: "("f", "o", "o", "b", "a", "r").Seq"
say "foobar".comb(2).raku; # OUTPUT: "("fo", "ob", "ar").Seq"
say "foobar".comb(2, 1).raku; # OUTPUT: "("fo",).Seq"
say "foobar".comb(/o/).raku; # OUTPUT: "("o", "o").Seq"
.say for "foobar".comb; # OUTPUT: "foobar"
An iterator is an object that enables us to iterate over an Iterable
object.
In Raku, a type is called an iterator if it composes the
Iterator
role and implements
the pull-one
method
which is supposed to either return the next value in the collection of values
or return the sentinel value IterationEnd
to indicate that all values have
been exhausted and nothing else can be produced from the collection.
We know arrays are iterables, but are they also iterators? The answer is no.
This is clear from the fact that the Array
class doesn't compose the
Iterator
role and therefore neither does an object of its type:
my @nums = 1, 2, 3;
@nums.does(Iterator); #=> False
Well then, what does compose the Iterator
role? We know that Array
composes the Iterable
role and thus implements the iterator
method which
is supposed to return an Iterator
. Therefore, calling the iterator
method on
an iterable should provide us with an iterator. This method must be invoked on
an object instance instead of a type object so we'll use the
object instance @nums
of Array
from before:
my @nums = 1, 2, 3;
my $iter-nums = @nums.iterator;
$iter-nums.does(Iterator); #=> True
As we can see, calling the iterator
method on an iterable object gives us
back an Iterator
object. What can it be used for?
An iterable object only tells us that it can be iterated over
(e.g., with a for
loop). In hindsight, an iterable is an object capable of
returning an iterator but it doesn't specifically know how to move from one
value to the next. This is what the iterator object is for; it's an
object that returns a single value at a given time and then increments the
internal pointer to the next value. This action of moving from one value to
the next and letting us know when no more values can be consumed from the
collection underlies the process of iterating over each element of the
iterable object.
Flow control constructs such as the for
loop are able to iterate
through the data of iterable objects sequentially by calling the iterator
method on the objects. Usually the users don't need to tell iterables when
to behave as iterables because the method is invoked implicitly in the
appropriate context
-
by statements such as a
for
loop which calls.iterator
on the variable it precedes, and then run a block once for every item. -
when the created object is assigned to a
Positional
variable such as@
-sigiled variables.
From the documentation:
Users usually don't have to care about iterators, their usage is hidden behind iteration APIs such as
for @list { }
,map
,grep
,head
,tail
,skip
and list indexing with.[$idx]
.
my @nums = 1, 2, 3;
for @nums -> $num { say $num }
# OUTPUT: "123"
In the example above, the for
loop iterates over each element of the array
and the block is executed at each turn of the loop. However, could we consume
the iterable on our own, without a built-in loop? The answer is yes and
we just need to get the iterator from the iterable and then repeatedly call
the pull-one
method on the iterator ourselves. For example:
my @nums = 1, 2, 3;
my $iter-nums = @nums.iterator;
say $iter-nums.pull-one; #=> "1"
say $iter-nums.pull-one; #=> "2"
say $iter-nums.pull-one; #=> "3"
We've already consumed the three elements in the array. What happens if we
invoke the pull-one
method one more time?
my @nums = 1, 2, 3;
my $iter-nums = @nums.iterator;
say $iter-nums.pull-one; #=> "1"
say $iter-nums.pull-one; #=> "2"
say $iter-nums.pull-one; #=> "3"
say $iter-nums.pull-one.raku; #=> IterationEnd
As you can see the sentinel value IterationEnd
is returned after another
pull-one
invocation once the last value has been returned, thus to visit
the entire iterable there will always be one more invocation of pull-one
than there is data in the iterable. While it's the case that .pull-one
can be invoked on many iterators once they've been exhausted, the design
dictates that once an iterator has produced the IterationEnd
sentinel,
.pull-one
should not be called anymore. Any subsequent calls may produce
undefined results, or even throw an exception. On the one hand, this should
make iterators easier to implement. On the other hand, it can make them
more difficult to implement.
This behavior is unlike in other programming languages that raise their
respective exceptions once the iterable has been exhausted. For example,
Python throws a
StopIteration
exception
whenever an iterable is consumed.
In Raku, if we'd like to test if we've consumed all the elements in an iterable,
we must do an identity comparison of the sentinel value IterationEnd
(using =:=
) with the output of
the pull-one
method directly:
my @nums = 1, 2, 3;
my $iter-nums = @nums.iterator;
say $iter-nums.pull-one;
say $iter-nums.pull-one;
say $iter-nums.pull-one;
if $iter-nums.pull-one =:= IterationEnd {
say "All the iterable's elements have been consumed."
}
# OUTPUT: "123All the iterable's elements have been consumed."
Instead of comparing against pull-one
's return value directly, we could
bind its return value to a variable and compare it to IterationEnd
instead.
Note that an assignment won't work; just a binding will do. It's also worth
noting that the only defined use for the sentinel value is the identity
comparison and any other behavior is undefined and implementation dependent.
We already know how to check if we've consumed all the elements from an iterable
so we could emulate printing out an iterable's elements with a for
loop by
using a while
loop instead. For example:
my @nums = 1, 2, 3;
my $iter-nums = @nums.iterator;
loop {
my $item := $iter-nums.pull-one;
last if $item =:= IterationEnd;
say $item;
}
We loop indefinitely, printing out values along the way, until we
reach IterationEnd
in which case we exit the loop with last
. By this point,
we've consumed the iterable's elements and if we wish to do the same once more,
we'd need to get a new iterator. The takeaway is that Raku iterators
can’t be reset, meaning they return IterationEnd
once they’re exhausted.
As mentioned above, any subsequent calls to an exhausted iterator isn't part
of the contract and thus it may produce undefined results.
Regarding the relationship between iterables and iterators, the Raku documentation states:
While
Iterable
provides the high-level interface loops will be working with,Iterator
provides the lower-level functions that will be called in every iteration of the loop.
The summary for this section is that iterating as a protocol and as the action of visiting all items in a complex object or data structure is composed of two parts: 1) an iterable (the object being iterated over) and 2) the iterator (the pointer-like object that moves over the iterable).
In the sections below, we delve more deeply into the Iterable
and Iterator
roles, how to compose them into user-defined classes, and how to implement
the two main methods that are of interest to us, namely the iterator
and pull-one
methods, and that the composed classes must implement.
The primary purpose of the Iterable
role is to promise that an iterator
method is available in the object or data structure that composes it. As mentioned
previously, this method is used:
-
by statements such as a
for
loop which calls.iterator
on the variable it precedes, and then run a block once for every item. -
when the creator object is assigned to a
Positional
variable such as@
-sigiled variables.
A simple example of composing the Iterable
role into a class and implementing
the iterator
method looks as follows:
subset Strand of Str where { .match(/^^ <[ACGT]>+ $$/) && .chars %% 3 }
class DNA does Iterable {
has Strand $.chain;
method iterator {
$!chain.comb(3).iterator;
}
};
my $dna-chain := DNA.new(chain => 'ACGTACGTT');
for $dna-chain -> $codons {
say $codons.join('');
}
# OUTPUT: "ACGTACGTT"
Here we have the DNA
class that mixes the Iterable
role and offer a new
way of iterating over what is essentially a string constrained to the four
DNA letters, and whose length is a multiple of 3. In the looping part, the
for
actually hooks to the iterator
method and then the letters are
printed in groups of 3.
In this example, we bind the DNA
instance object to the variable
$dna-chain
. We could've assigned the newly created object to a Positional
variable in which case the iterator
would be called on the iterable object
and its consumed values would be assigned to the variable. For instance:
my @codons = DNA.new(chain => 'ACGTACGTT');
for @codons -> $codon {
say $codon.join('');
}
# OUTPUT: "ACGTACGTT"
Note that @codons
isn't a DNA
object; it's just a plain array that contains
the consumed values from the iterable object DNA.new(chain => 'ACGTACGTT')
.
What would happen if we assign a DNA
object directly to a scalar variable?
Could we still loop over that variable with the built-in for
loop?
my $chain = DNA.new(chain => 'ACGTACGTT');
.say for $chain;
# OUTPUT: "<anon|34>.new"
Well, that's not what we expected. What if we invoke the iterator
method
on the scalar variable?
my $chain = DNA.new(chain => 'ACGTACGTT');
.say for $chain.iterator;
# OUTPUT: "<anon|34>.new"
We get the same output. Previously we mentioned that the iterator
method
is invoked by statements such as for
loop and other iterating constructs
such as assigning to a Positional
variable. So the question is
when does for
call the iterator
method? From
raiph on StackOverflow:
As I understand, if the (single) argument to an iterating feature is a
Scalar
container, then it uses the value in the container and does not call.iterator
. Otherwise, it calls.iterator
on it, evaluating it first if it's an expression or routine call.
Hence in our example with my $chain = DNA.new(chain => 'ACGTACGTT');
,
we rightfully creates a Scalar
container. When we try to iterate over it
with for
, the scalar returns its only item, the object itself, and
no iterator
method is called on it. Why does binding instead of assignment
work?
my $binded-chain := DNA.new(chain => 'ACGTACGTT');
my $assign-chain = DNA.new(chain => 'ACGTACGTT');
say $binded-chain.VAR.WHAT; (DNA)
say $assign-chain.VAR.WHAT; (Scalar)
By binding, we bind the value directly to the variable and no container is
used to interface between the value and the variable. This isn't what happen
with assignment; instead of directly binding the value to the variable,
a container (a Scalar
object) is created for you and that is bound to
the lexpad entry of the variable. The takeaway is that you must be
mindful to what you assign your iterables to. More about
containers and
containers in Perl 6.
Nonetheless, we could still manually iterate over the DNA
object stored
in the scalar variable. It's just a matter of getting our hands on an
iterator by calling .iterator
on the object and then accessing each value
by pulling one with the pull-one
method. Thus:
my $chain = DNA.new(chain => 'ACGTACGTT');
my $iter = $chain.iterator;
loop {
my $codon := $iter.pull-one; # must bind for identity comparison
last if $codon =:= IterationEnd;
say $codon.join('');
}
# OUTPUT: "ACGTACGTT"
Certainly this isn't the most Raku-y code but it helps us to see that, even after
being containerized, the iterable's elements are still accessible, albeit
not through the elegance of a for
loop.
Lastly we know that the iterator
method is invoked when assigning to
a Positional
variable so we could postfix it with []
and let the
for
loop know that the scalar value is indeed iterable:
my $chain = DNA.new(chain => 'ACGTACGTT');
for $chain[] -> $codon {
say $codon.join('');
}
# OUTPUT: "ACGTACGTT"
As mentioned above, the Iterator
role provides the lower-level subroutines
that will be called in every iteration of the loop. One of these is the
pull-one
method, which either returns the next element from the collection,
or the sentinel value IterationEnd
if no more elements are available
(i.e., when the collection is exhausted/consumed). In addition to this method,
this role provides a few other methods to allow for a finer operation
of iteration in several contexts such as adding or eliminating items,
consuming them all at once, skipping over them to access other items, etc.
Except for .pull-one
which must be implemented in the composed class,
the Iterator
role provide default implementations, in terms of pull-one
,
for these methods.
A simple example of composing the Iterator
role into a class and implementing
the pull-one
method looks as follows:
# Countdown from a positive integer N to 0.
class CountDown does Iterator {
has $.current;
method new( $current where { $_ >= 0 } ) {
self.bless(:$current)
}
method pull-one( --> Mu ) {
my $result = $!current--;
if $result == -1 {
# return the sentinel value after the last value is consumed.
return IterationEnd
}
elsif $result <= -2 {
# calling .pull-one again after it returns IterationEnd
# is undefined so we choose to throw an exception.
die "Countdown already consumed."
}
else {
return $result;
}
}
}
sub loop-over( Iterable:D $sequence ) {
my $iterator = $sequence.iterator;
loop {
my $item := $iterator.pull-one;
last if $item =:= IterationEnd;
say $item;
}
}
loop-over Seq.new(CountDown.new(5));
# OUTPUT: "543210"
The CountDown
class composes the Iterator
role and implements a simple
pull-one
method that will iterate over a collection of positive descending
integers until it reaches zero. The call to pull-one
after all the collection's
elements are consumed returns the sentinel value IterationEnd
. Normally any calls
afterwards is undefined but here we choose to throw an exception. A for
loop won't
be able to iterate over CountDown
object since it doesn't implement the
Iterable
role. Therefore, we use the helper loop-over
subroutine to
do so. As you might've noticed, the subroutine's signature requires an
Iterable
so whenever we pass a CountDown
object to loop-over
we create a Seq
from the object in order to make it iterable as shown
in the example.
Even when we compose the Iterator
role and implement
the pull-one
method for finer control on how an object's values must
be looped over, we still might need compose the Iterable
role and
implement the iterator
method because this is what advertises to constructs
such as the for
loop that the object in question is iterable. Thus, we
can get rid of the helper subroutine and do away with creating an iterable
with Seq
by simply composing the Iterable
role into CountDown
as follows
(note that we're using simple logic inside the pull-one
method so we moved
from regular if/elsif/else
conditions to simpler statement modifier if
s):
# Countdown from a positive integer N to 0.
class CountDown does Iterable does Iterator {
has $.current;
method new( $current where { $_ >= 0 } ) {
self.bless(:$current)
}
method iterator() { self }
method pull-one( --> Mu ) {
my $result = $!current--;
return IterationEnd if $result == -1;
die "Countdown already consumed." if $result <= -2;
return $result if $result >= 0; # this `if` isn't necessary, only
# used to make things look balanced
}
}
for CountDown.new(5) -> $count {
say $count
}
# OUTPUT: "543210"
my @counts = CountDown.new(5);
say @counts.join(', ');
# OUTPUT: "5, 4, 3, 2, 1, 0"
There's not doubt this looks a lot simpler. As you might've noticed, we implement
the iterator
method from the Iterable
role by simply returning self
,
the object itself. The reason for this is that an iterating construct can
call the pull-one
method on it to access every element in turn. This is
possible because the object will be both Iterable
and an Iterator
.
The relationship between an iterable and an iterator is a close one, and the following bears some repetition:
An iterable describes a collection of items and advertises that these items can be iterated over, i.e., they can be visited one at a time until all of them are exhausted. An iterator is an entity which does what an iterable advertises can be done on it. More specifically, an iterator tracks which item to get at a given time, moves from the current to the next until the last one, and then stops.
An infinite iterator is an iterator capable of producing an infinite number of iterations. For example:
class CountUp does Iterable does Iterator {
has $!current = 0;
method iterator { self }
method pull-one( --> Mu ) {
my $val = $!current;
$!current += 1;
return $val;
}
}
The pull-one
method doesn't have a condition that must be met in order
to return IterationEnd
, thus the iterator continues forever.
Using a CountUp
object with a for
loop isn't an option; it won't ever
stop, at least not before it runs out of resources. However, we can still
consume values from the iterable object by pulling them out manually
with .pull-one
:
my $x = CountUp.new;
my $it = $x.iterator;
say $it.pull-one; #=> 0
say $it.pull-one; #=> 1
say $it.pull-one; #=> 2
...
say $it.pull-one; #=> 99
say $it.pull-one; #=> 100
...
This could continue forever and it shows that with an iterator that iterates
over an infinite set, we can have an infinite number of items without having
to store all of them in memory; we just need to produce them as we
need them. However, we don't need to bother implementing this ourselves since
Raku already offers elegant ways of producing iterable, lazy sequence of values
with the Seq
class.
s/This isn't what happen/This isn't what happens/
Re: However, we don't need to bother implementing this ourselves since Raku already offers elegant ways of producing iterable, lazy sequence of values with the Seq class.
I'm not sure that makes sense in this context. A lazy
Iterator
is just an instantiated Iterator that returnsTrue
with itsis-lazy
method. As long as Raku knows an Iterator is lazy, it won't try to be eager and hang because it tries to produce all possible values. So adding a:is enough to make an Iterator appear lazy, and thus prevent hanging.