Created
May 30, 2009 01:03
While writing some specifications for RubySpec, I've encountered an issue that
I'd appreciate clarification on. What is the expected behaviour of an iterator
if it is modified inside of the block it yields to?
For example, the following enters an infinite loop because it appends to the
Array being iterated over for each iteration. Here, the number of iterations
can be increased from within the block.
>> a = (1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> a.each { |e| a << 0 }
# Infinite loop
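To observe the growth without hanging the session, the same loop can be capped. This is only a sketch: the `iterations` counter and the limit of 25 are arbitrary assumptions added for illustration.

```ruby
a = (1..10).to_a
iterations = 0

a.each do |e|
  iterations += 1
  break if iterations >= 25   # arbitrary cap so the example terminates
  a << 0                      # each pass appends one more element to visit
end

puts iterations   # more iterations than the 10 original elements
puts a.size
```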
Conversely, calling a.reject!{true} inside the iterator block truncates the
Array, halting iteration:
>> a = (1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> a.each { |e| a.reject!{true}; p e }
1
=> []
Prepending to an Array increases the number of iterations:
>> @prepend = true
=> true
>> a=(1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> a.each {|e| a.unshift(0) if @prepend; @prepend = false; p e}
1
1
2
3
4
5
6
7
8
9
10
=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
When iterating over an Array, deleting the element that is being yielded
reduces the number of iterations, because every second element is skipped:
>> a=(1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> a.each { |e| a.delete(e); p e }
1
3
5
7
9
=> [2, 4, 6, 8, 10]
However, with a Hash:
>> h={:foo=>:bar, :glark => :quark}
=> {:foo=>:bar, :glark=>:quark}
>> h.each_key{|k| h.delete(k); p k}
:foo
:glark
=> {}
In the following example, assigning to a[-1] changes the final element yielded:
>> a=(1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> a.each {|e| a[-1] = 0; p e }
1
2
3
4
5
6
7
8
9
0
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
In this example, assigning a new value to the variable that refers to the
object being iterated over, while inside the block, doesn't affect the
iteration.
>> a = (1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> a.each { |e| a = 0; p e }
1
2
3
4
5
6
7
8
9
10
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Similarly, changing the 0th element of the Array doesn't have any effect:
>> a=(1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> a.each { |e| a[0] = 0; p e }
1
2
3
4
5
6
7
8
9
10
=> [0, 2, 3, 4, 5, 6, 7, 8, 9, 10]
The principles I derive from this are:
The abstraction of iterating over Arrays as collections of objects with #each
is broken somewhat because it is revealed that it's the Array's index that is
used in the iteration. So, if one assigns to an index of the Array that has
already been iterated over, the new object will not be yielded to the block.
However, assigning to an index that has not yet been reached will yield the
new object. The reliance on indices is most apparent when considering the
#unshift example, because it results in the original 0th element being yielded
twice, and the new 0th element not being yielded at all.
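The index-based behaviour described above can be modelled with a small sketch. `each_by_index` is a hypothetical helper written for illustration, not Ruby's actual implementation of Array#each; it assumes the iteration simply re-reads the Array's length and the element at the current index on every step.

```ruby
# Hypothetical model of an index-driven #each (an assumption, not MRI's code):
# the length is re-checked and the element re-read on every step.
def each_by_index(array)
  i = 0
  while i < array.length
    yield array[i]
    i += 1
  end
  array
end

# Prepending once shifts everything right, so index 1 now holds the old
# 0th element: it is yielded twice and the new 0th element is never seen.
a = [1, 2, 3]
prepend = true
seen_unshift = []
each_by_index(a) do |e|
  a.unshift(0) if prepend
  prepend = false
  seen_unshift << e
end

# Deleting the yielded element shifts the rest left, so the element that
# slides into the current index is skipped when the index advances.
b = [1, 2, 3, 4, 5]
seen_delete = []
each_by_index(b) do |e|
  b.delete(e)
  seen_delete << e
end

p seen_unshift   # [1, 1, 2, 3]
p seen_delete    # [1, 3, 5]
p b              # [2, 4]
```

Under this model the #unshift and #delete transcripts are both explained by the index advancing independently of how the elements move.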
In both cases, truncating the iterator reduces the number of iterations, and
appending to the iterator increases the same. In both cases, changing the type
of the iterator from inside the block has no effect on the iteration.
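One way to read the last point is that assigning to the variable only rebinds the name, while #each keeps working on the object it was originally called on. A minimal sketch, where `original` and `seen` are names introduced purely for illustration:

```ruby
a = [1, 2, 3]
original = a          # second reference to the same Array object
seen = []

# Rebinding the local variable a does not touch the Array that #each
# is already iterating over.
a.each { |e| a = 0; seen << e }

p seen       # [1, 2, 3]
p a          # 0
p original   # [1, 2, 3]
```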
However, these principles don't hold when considering other types of
iterators. For example, using String#each_char, and appending to the String
from inside the block, does not affect the iteration whatsoever. Similarly,
assigning to an index of the String has no effect on the iteration.
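For comparison, an Array shows the same unaffected iteration when the block mutates the original while #each runs over a copy. This is a sketch of that idea using #dup, and makes no claim about how String#each_char is implemented; `seen` is a name introduced for illustration.

```ruby
a = [1, 2, 3]
seen = []

# Iterate over a shallow copy; mutations of a inside the block change a,
# but not the sequence of elements being yielded.
a.dup.each do |e|
  a << 0
  seen << e
end

p seen   # [1, 2, 3]
p a      # [1, 2, 3, 0, 0, 0]
```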
There are clearly more complex examples, but I think these are enough to
explain the situation. I'm wondering whether there are any general principles
at work here. What's the Right Thing to do in cases where the iterator is
modified from inside the block it yields to? Or is the answer simply that the
current behaviour is by definition correct? :-)