@runpaint
Created May 30, 2009 01:03
While writing some specifications for RubySpec, I've encountered an issue that
I'd appreciate clarification on. What is the expected behaviour of an iterator
if it is modified inside of the block it yields to?
For example, the following enters an infinite loop because it appends to the
Array being iterated over on every iteration. Here, the number of iterations
can be increased from within the block.
>> a = (1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> a.each { |e| a << 0 }
# Infinite loop
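To watch the growth without hanging the session, here is a minimal sketch that caps the loop with a counter. The counter name and the cap of 20 are my own illustrative choices, not part of the original example.

```ruby
a = (1..3).to_a
count = 0

a.each do |e|
  a << 0                 # grow the Array on every pass, as in the example above
  count += 1
  break if count >= 20   # hypothetical cap so the loop terminates
end

puts count     # number of times the block ran
puts a.length  # original 3 elements plus one append per pass
```

The block runs far more often than the three elements the Array started with, which suggests #each re-checks the length on each pass.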
Conversely, calling a.reject!{true} inside the iterator block truncates the
Array, halting iteration:
>> a = (1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> a.each { |e| a.reject!{true}; p e }
1
=> []
Prepending to an Array increases the number of iterations:
>> @prepend = true
=> true
>> a=(1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> a.each {|e| a.unshift(0) if @prepend; @prepend = false; p e}
1
1
2
3
4
5
6
7
8
9
10
=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
When iterating over an Array, deleting the element that is being yielded
reduces the number of iterations:
>> a=(1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> a.each { |e| a.delete(e); p e }
1
3
5
7
9
=> [2, 4, 6, 8, 10]
However, with a Hash, deleting the key that is being yielded does not skip any
keys:
>> h={:foo=>:bar, :glark => :quark}
=> {:foo=>:bar, :glark=>:quark}
>> h.each_key{|k| h.delete(k); p k}
:foo
:glark
=> {}
In the following example, assigning to a[-1] changes the final element yielded:
>> a=(1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> a.each {|e| a[-1] = 0; p e }
1
2
3
4
5
6
7
8
9
0
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
In this example, assigning to the object being iterated over while inside the
block doesn't affect the iteration.
>> a = (1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> a.each { |e| a = 0; p e }
1
2
3
4
5
6
7
8
9
10
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
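One way to read this result: `a = 0` only rebinds the local variable, while #each keeps working on the Array object it was originally called on. A small sketch that makes this visible, using names (`original`, `seen`, `result`) that are mine and purely illustrative:

```ruby
a = (1..3).to_a
original = a   # second reference to the same Array object
seen = []

# The receiver of #each is resolved before the block ever runs,
# so rebinding `a` inside the block cannot redirect the iteration.
result = a.each { |e| a = 0; seen << e }

p seen                      # elements yielded to the block
p a                         # the variable now refers to the Integer
p result.equal?(original)   # #each returned the original Array object
```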
Similarly, changing the 0th element of the Array doesn't have any effect:
>> a=(1..10).to_a
=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>> a.each { |e| a[0] = 0; p e }
1
2
3
4
5
6
7
8
9
10
=> [0, 2, 3, 4, 5, 6, 7, 8, 9, 10]
The principles I derive from this are:
The abstraction of iterating over Arrays as collections of objects with #each
is somewhat broken, because it reveals that it's the Array's index that
drives the iteration. So, if one assigns to an index of the Array that has
already been iterated over, the new object will not be yielded to the block.
Conversely, assigning to an index that has not yet been reached will yield the
new object. The reliance on indices is most apparent in the #unshift example,
because it results in the original 0th element being yielded twice, and the
new 0th element not being yielded at all.
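The index-driven behaviour can be reproduced with a hand-written loop. This is only a sketch of the idea under the assumption that the length is re-read on every pass. `each_by_index` is a hypothetical helper of mine, not the interpreter's actual implementation.

```ruby
# Hypothetical helper: iterate by index, re-checking the length each time.
def each_by_index(array)
  i = 0
  while i < array.length   # length is re-read, so growth/shrinkage matters
    yield array[i]         # whatever sits at index i *now* is yielded
    i += 1
  end
  array
end

# Prepending once shifts every element right by one, so index 1 holds the
# element that was already yielded from index 0.
a = (1..10).to_a
prepend = true
yielded_unshift = []
each_by_index(a) do |e|
  a.unshift(0) if prepend
  prepend = false
  yielded_unshift << e
end

# Deleting the current element shifts the rest left, so every other
# element is skipped.
b = (1..10).to_a
yielded_delete = []
each_by_index(b) { |e| b.delete(e); yielded_delete << e }

p yielded_unshift
p yielded_delete
```

Under these assumptions, the helper yields the same sequences as the #unshift and #delete transcripts above, which is consistent with Array#each walking indices rather than a snapshot of the elements.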
In both cases, truncating the iterator reduces the number of iterations, and
appending to the iterator increases the same. In both cases, changing the type
of the iterator from inside the block has no effect on the iteration.
However, these principles don't hold when considering other types of
iterators. For example, using String#each_char, and appending to the String
from inside the block, does not affect the iteration whatsoever. Similarly,
assigning to an index of the String has no effect on the iteration.
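For comparison with the Array case, here is a short sketch of the String behaviour just described. It assumes an interpreter where String#each_char behaves as stated above. The variable names are illustrative.

```ruby
s = "abc"
seen = []

s.each_char do |c|
  s << "x"    # append to the String while it is being iterated
  seen << c
end

p seen   # characters yielded to the block
p s      # the String after the appends
```

If the description above holds, only the original characters are yielded even though the String itself has grown.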
There are clearly more complex examples, but I think these are enough to
explain the situation. I'm wondering whether there are any general principles
at work here. What's the Right Thing to do in cases where the iterator is
modified from inside the block it yields to? Or is the answer simply that the
current behaviour is by definition correct? :-)