Enumerable
. Debatably one of, if not the, most powerful features in Ruby. As a majority of your time in programming is dealing with collections of items it's no surprise how frequently you'll see it used.
Foundational
Some knowledge required of functions in Ruby. This post focuses on foundational and fundamental knowledge for Ruby programmers.
Prerequisite Reading:
- Understanding Ruby - Blocks, Procs, and Lambdas
- Understanding Ruby - to_proc and Function Interfaces
- Understanding Ruby - Triple Equals
- Understanding Ruby - Comparable
Enumerable
is an interface module that contains several methods for working with collections. Many Ruby classes that look like collections implement the Enumerable
interface. Chances are if it has an each
method it supports Enumerable
, and because of that it's quite ubiquitous in Ruby.
So how are we going to cover such a large piece of the language? Categorically, and of course after we show how you can implement one of your own.
Note: This idea was partially inspired by Lamar Burdette's recent work on Ruby documentation, but takes its own direction.
To start with, how do we implement Enumerable
ourselves? Via an each
method and including the module, much like Comparable
from the last post. We'll be reexploring our Card
class from that article as well as making a Hand
to contain those cards.
Let's start with our Card
class from last time:
class Card
  include Comparable

  SUITS        = %w(S H D C).freeze
  RANKS        = %w(2 3 4 5 6 7 8 9 10 J Q K A).freeze
  RANKS_SCORES = RANKS.each_with_index.to_h

  attr_reader :suit, :rank

  def initialize(suit, rank)
    @suit = suit
    @rank = rank
  end

  def self.from_str(s) = new(s[0], s[1..])

  def to_s() = "#{@suit}#{@rank}"

  # Cards are ordered by rank alone here (an assumption; suits are ignored).
  def precedence() = RANKS_SCORES[@rank]

  def <=>(other) = precedence <=> other.precedence
end
There's one new method here for convenience that gives us a Card
from a String
, letting us do this:
Card.from_str('SA')
That gets to be handy when we want an entire hand in a second.
Now let's take a look at a Hand class that might contain these cards:
class Hand
  include Enumerable

  attr_reader :cards

  def initialize(*cards)
    @cards = cards.sort
  end

  def self.from_str(s) = new(*s.split(/[ ,]+/).map { Card.from_str(_1) })

  def to_s() = @cards.map(&:to_s).join(', ')

  def each(&fn) = @cards.each { |card| fn.call(card) }
end
Starting with Enumerable
features, we define an each
method at the bottom which takes a Block Function and calls it with each card from the cards in our Hand
.
Next we have a utility function like Card
had which allows us to make a Hand
from a String
, because otherwise that's a lot of typing:
royal_flush = Hand.from_str('S10, SJ, SQ, SK, SA')
With the above Enumerable code we can now use any of the Enumerable
methods against it:
royal_flush.reject { |c| c <= Card.from_str('SQ') }.join(', ')
# => "SK, SA"
Nifty! Now with that down let's take a look at all of the shiny fun things in Enumerable
. We'll be using more generic examples from here on out.
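To see how little a class needs in order to get all of this, here is a minimal sketch with a made-up Countdown class (the name and behavior are purely illustrative, not from the Card example). The only real requirement is each; returning an Enumerator when no block is given is an optional nicety assumed here.

```ruby
# Hypothetical example class: counts down from a starting number to 1.
class Countdown
  include Enumerable

  def initialize(from)
    @from = from
  end

  # Enumerable only needs `each`. Returning an Enumerator when no block
  # is given keeps chaining (e.g. `each.with_index`) working.
  def each
    return to_enum(:each) unless block_given?

    @from.downto(1) { |n| yield n }
  end
end

countdown = Countdown.new(5)

countdown.map { |n| n * 2 }   # => [10, 8, 6, 4, 2]
countdown.select(&:even?)     # => [4, 2]
countdown.include?(3)         # => true
countdown.first(2)            # => [5, 4]
countdown.sort                # => [1, 2, 3, 4, 5]
```

None of map, select, include?, first, or sort were written by hand; they all come from the module and are built on top of each.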
Ruby has many aliases, like collect
is an alias for map
. As I prefer map
I will be using that for examples. When you see a /
in the header in other sections, the first item will be the preference I will draw from, but you could use the other name to the same effect.
You might see #method_name
or .method_name
mentioned in Ruby on occasion. This means Instance Method and Class Method respectively. You might also see it like Enumerable#map
, which means map
is an Instance Method of Enumerable
.
map
expresses the idea of transforming a collection using a function, or, going by the English word, a way to get from point A to point B. Amusingly, in some functional programming languages this is expressed as A -> B
, wherein ->
is the function.
For us it might be used something like this:
[1, 2, 3].map { |v| v * 2 }
# => [2, 4, 6]
In which the function is to double every element of a collection, giving us back a brand new collection in which all elements are doubles of the original.
Using the syntax for Symbol#to_proc
we can also use map
to extract values out of objects:
people.map(&:name)
If we had an Array
of people we could use map
to get all of their names using this shorthand.
map
is great for transforming collections and pulling things out of a collection.
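As a small runnable version of the people example, assuming a made-up Person struct and sample data (neither is from the original), the shorthand and the explicit block give the same result:

```ruby
# `Person` and the sample data are invented for illustration.
Person = Struct.new(:name, :age)

people = [
  Person.new('Alice', 42),
  Person.new('Bob', 31)
]

# The shorthand and the explicit block are equivalent:
names_short = people.map(&:name)                    # => ["Alice", "Bob"]
names_long  = people.map { |person| person.name }   # => ["Alice", "Bob"]

# map always returns a new Array; the original is untouched.
ages_next_year = people.map { |person| person.age + 1 }  # => [43, 32]
```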
flat_map
will both map
a collection and afterwards flatten
it:
hands = [
Hand.from_str('S2, S3, S4'),
Hand.from_str('S3, S4, S5'),
Hand.from_str('S4, S5')
]
hands.flat_map(&:cards).map(&:to_s).join(', ')
# => "S2, S3, S4, S3, S4, S5, S4, S5"
flat_map
is great when you want to extract something like an Array
from items and combine them all into one Array
. It's also great for generating products, but remember that Ruby also has the Array#product
method which works better unless you have something more involved to do.
It's for when you want one Array
rather than Array
s of Array
s.
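A quick sketch of both points, using made-up sample data: flat_map behaves like map followed by a one-level flatten, and for plain pairs Array#product gives the same answer with less ceremony.

```ruby
nested = [[1, 2], [3], [4, 5]]

# flat_map is map followed by a one-level flatten:
via_flat_map = nested.flat_map { |xs| xs.map { |x| x * 10 } }
# => [10, 20, 30, 40, 50]
via_flatten = nested.map { |xs| xs.map { |x| x * 10 } }.flatten(1)
# => [10, 20, 30, 40, 50]

# Generating pairs with flat_map versus Array#product:
suits = %w(S H)
ranks = %w(K A)

pairs_flat_map = suits.flat_map { |s| ranks.map { |r| [s, r] } }
# => [["S", "K"], ["S", "A"], ["H", "K"], ["H", "A"]]
pairs_product = suits.product(ranks)
# => [["S", "K"], ["S", "A"], ["H", "K"], ["H", "A"]]
```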
filter_map
is interesting in that it combines the idea of filter
and the idea of map
. If the function passed to filter_map
returns something falsy (false
or nil
) it won't be present in the returned collection:
[1, 2, 3].filter_map { |v| v * 2 if v.odd? }
# => [2, 6]
In this case 2
will be ignored. filter_map
is great if you find yourself using map
, returning nil
, and using compact
at the end to drop nil
values.
This method is great when you want to both filter down a collection and do something with those values.
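To make the comparison with map and compact concrete, here is a small sketch with invented sample data. It also shows one subtle difference: filter_map drops false as well as nil, while compact only drops nil.

```ruby
words = ['apple', '', 'kiwi', nil, 'fig']  # sample data, made up

# The older two-step version:
two_step = words.map { |w| w.upcase if w && !w.empty? }.compact
# => ["APPLE", "KIWI", "FIG"]

# The same result in one pass (filter_map needs Ruby 2.7+):
one_pass = words.filter_map { |w| w.upcase if w && !w.empty? }
# => ["APPLE", "KIWI", "FIG"]

# false results are dropped too, which compact would keep:
only_truthy  = [1, 2, 3].filter_map { |v| v.even? }   # => [true]
false_stays  = [1, 2, 3].map { |v| v.even? }.compact  # => [false, true, false]
```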
all?
is a predicate method, meaning it's boolean or truthy in nature. For all?
it checks that all items in a collection meet a certain condition:
[1, 2, 3].all? { |v| v.even? }
# => false
We can also use shorthand here:
[1, 2, 3].all?(&:even?)
...and interestingly it also accepts a pattern, or rather something that responds to ===
:
[1, 2, 3].all?(Numeric)
# => true
all?
will also stop searching if it finds any element which does not match the condition.
An interesting behavior is that it will return true
on empty collections:
[].all?
# => true
all?
is great when you want to check if all of a collection's items meet a condition, or perhaps many.
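The early exit is easy to observe by recording which elements the block actually sees. This is only a sketch; the checked array exists purely for the demonstration.

```ruby
checked = []

result = [2, 4, 5, 6, 8].all? do |v|
  checked << v   # record every element the block is called with
  v.even?
end

result   # => false
checked  # => [2, 4, 5]
```

The search stops at 5, the first element that fails the condition, so 6 and 8 are never visited.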
any?
is very similar to all?
except in that it checks if any of the items in a collection match the condition:
[1, 'a', :b].any?(Numeric)
# => true
Interestingly as soon as it finds a value that matches it will stop searching. After all, why bother? It found what it wanted, and it's way more efficient to say return true
rather than go through the rest.
With an empty collection any?
will return false
as there are no elements in it:
[].any?
# => false
any?
is great for checking if anything in a collection matches a condition.
none?
can be thought of as the opposite of all?
, or maybe even as not any?
. It checks that none of the elements in a collection match a certain condition:
[1, 'a', :b].none?(Float)
# => true
none?
will return true
on an empty collection:
[].none?
# => true
Be careful, as this behavior is very similar to all?
which also returns true
.
none?
can be great for ensuring that nothing in a collection matches a negative set of rules, like simple validations.
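As a rough sketch of that validation idea, assume a handful of made-up rules where each lambda returns true when the input is bad. The rule list and the valid lambda are hypothetical names for this example only.

```ruby
# Hypothetical rules: each one returns true when the username is a problem.
problems = [
  ->(s) { s.empty? },
  ->(s) { s.length > 20 },
  ->(s) { s.match?(/\s/) }
]

# A username is valid when none of the problem rules match it.
valid = ->(username) { problems.none? { |rule| rule.call(username) } }

valid.call('lemur')         # => true
valid.call('lively lemur')  # => false (contains whitespace)
valid.call('')              # => false (empty)
```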
one?
is very much like any?
except that it will search the entire collection to make sure there's one and only one element that matches the condition:
[1, :a, 2].one?(Symbol)
# => true
[1, :a, 2].one?(Numeric)
# => false
It has some interesting behavior when used without an argument on empty or single element collections:
[].one?
# => false
[1].one?
# => true
one?
is great when you want to ensure one and only one element of a collection matches a condition. I have not quite had a chance to use this myself, but can see how it would be handy.
include?
checks if a collection includes a value:
[1, 2, 3].include?(2)
# => true
It has an alias in member?
.
include?
will compare all elements via ==
to see if any match the one we're looking for.
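Because the comparison is ==, values that are equal across types will match while look-alikes will not. A short sketch:

```ruby
int_vs_float  = [1, 2, 3].include?(2.0)  # => true, because 2 == 2.0
int_vs_string = [1, 2, 3].include?('2')  # => false, because 2 != '2'

# member? performs the same check under another name:
via_member = [1, 2, 3].member?(2)        # => true
```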
find
is how you find one element in a collection:
[1, 2, 3].find { |v| v == 2 }
# => 2
[1, 2, 3].find { |v| v == 5 }
# => nil
Oddly it takes a single argument, something that responds to call
, as a default:
[1, 2, 3].find(-> { 1 }) { |v| v == 5 }
# => 1
I honestly do not understand this myself as you cannot give it a value like this:
[1, 2, 3].find(1) { |v| v == 5 }
# NoMethodError (undefined method `call' for 1:Integer)
There is currently a bug tracker issue open against this, but it hasn't seen updates in a fair amount of time not including my recent question on it.
find
is useful for finding a single value in a collection and returning it as soon as it finds it, rather than using something like select.first
which would iterate all elements.
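The difference from select.first shows up when you record which elements each block visits. A sketch, with helper arrays that exist only for the demonstration:

```ruby
seen_by_find = []
found = [1, 2, 3, 4, 5].find { |v| seen_by_find << v; v.even? }
# found        => 2
# seen_by_find => [1, 2]

seen_by_select = []
selected = [1, 2, 3, 4, 5].select { |v| seen_by_select << v; v.even? }.first
# selected       => 2
# seen_by_select => [1, 2, 3, 4, 5]

# The callable default only runs when nothing matches:
fallback = [1, 2, 3].find(-> { :none }) { |v| v > 5 }
# => :none
```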
find_index
is very similar to find
except that it finds the index of the item rather than returning the actual item:
[1, 2, 3].find_index { |v| v == 2 }
# => 1
Interestingly it takes an argument rather than a Block Function for a value to search for:
[1, 2, 3].find_index(3)
# => 2
...which makes a bit more sense than a default argument like in the case of find
, but to change those would break all types of potential code.
I have not found a direct use for find_index
at this point, and cases where I would use it I tend to reach for slicing and partitioning methods instead.
select
is a method with a lot of aliases in find_all
and filter
. If you come from JavaScript, filter
might be more comfortable, and with the introduction of filter_map
it may see more popularity. select
is more common in general usage.
select
is used to get all elements in a collection that match a condition:
[1, 2, 3, 4, 5].select(&:even?)
# => [2, 4]
Currently it uses a Block Function to check each element.
select
is typically used and great for filtering lists by a positive condition.
reject
, however, is great for negative conditions like everything except Numeric
entries:
[1, 'a', 2, :b, 3, []].reject { |v| v.is_a?(Numeric) }
# => ["a", :b, []]
Though in this particular case I would likely use grep_v
instead which we'll cover in a moment. grep
right below will have some additional insights on this distinction.
Oftentimes Ruby methods will have a dual that does the opposite. select
and reject
, all?
and none?
, the list goes on. Chances are there's an opposite method out there.
reject
has many of the same uses as select
except that it inverts the condition and instead rejects elements which match a condition.
grep
is interesting in that it's based on Unix's grep
command, but in Ruby it takes something that responds to ===
as an argument:
[1, :a, 2, :b].grep(Symbol)
# => [:a, :b]
It behaves similarly to select
, and there are tickets out to consider adding the ===
behavior to select
, similarly with reject
and grep_v
.
Where it differs is that its block does something different:
[1, :a, 2, :b].grep(Numeric) { |v| v + 1 }
# => [2, 3]
It acts very much like map
for any elements which matched the condition.
grep
behaves mildly similarly to filter_map
except that every element in the block has already been filtered via ===
. When you need more power for conditional checking if an element belongs in the new list use filter_map
, otherwise grep
makes a lot of sense.
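Since anything that responds to === works as a pattern, grep is not limited to classes. A few illustrative cases with made-up data:

```ruby
by_regex = %w(apple banana cherry).grep(/an/)    # => ["banana"]
by_range = [1, 5, 10, 15].grep(5..10)            # => [5, 10]
by_class = [1, 'a', 2.5, :b].grep(Integer)       # => [1]
by_proc  = [1, 5, 10, 15].grep(->(v) { v > 5 })  # => [10, 15]

# With a block, matches are transformed as they are collected:
versions = %w(v1 beta v2).grep(/\Av\d+\z/) { |s| s.delete('v').to_i }
# => [1, 2]
```

Regexp, Range, Class, and Proc all define === in their own way, which is what makes each of these patterns work.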
grep_v
is the dual of grep
, similar to select
and reject
. grep_v
behaves similarly to reject
except it uses grep
's style:
[1, :a, 2, :b].grep_v(Symbol)
# => [1, 2]
[1, :a, 2, :b].grep_v(Symbol) { |v| v + 1 }
# => [2, 3]
Just as with reject
it makes sense in cases where you want the opposite data from grep
but still want the same condition.
TODO
TODO
uniq
will get all unique items in a collection:
[1, 2, 3, 1, 1, 2].uniq
# => [1, 2, 3]
It also takes a block to let you decide exactly what criteria you want the new collection to be unique by:
(1..10).uniq { |v| v % 5 }
# => [1, 2, 3, 4, 5]
Which can be very useful for unique sizes, names, or other criteria. In the above example we're doing something a bit unique in searching for remainders from modulo which can be very useful in certain algorithmic problems.
uniq
is great when you want to get a unique collection of elements, but if you find yourself using uniq
a lot you may want to consider using a Set
instead, which we'll cover in a later article.
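One detail worth seeing in action: when the block form finds duplicates, the first element seen for each key is the one that is kept. A small sketch with invented data:

```ruby
fruits = %w(apple avocado banana blueberry cherry)

# Unique by first letter; the earliest word for each letter wins.
by_first_letter = fruits.uniq { |w| w[0] }
# => ["apple", "banana", "cherry"]

# Unique by length works the same way.
by_length = fruits.uniq(&:length)
# => ["apple", "avocado", "banana", "blueberry"]
```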
All of these methods require that the underlying class implements <=>
, or you will see errors.
sort
will sort collections:
[4, 2, 1].sort
# => [1, 2, 4]
Remembering back to Comparable
it can also take a Block Function with a comparator Rocketship Operator (<=>
):
[1, 2, 4].sort { |a, b| b <=> a }
# => [4, 2, 1]
It follows the same rules of deciding precedence by the return of the comparator, and that has to be an Integer
of one of -1, 0, 1
in value.
sort_by
is interesting in that it implements the comparator behind the scenes and uses a single attribute instead:
%w(a fresh lively lemur jumps over a tea kettle).sort_by(&:length)
# => ["a", "a", "tea", "over", "jumps", "fresh", "lemur", "lively", "kettle"]
The docs do note that this can be noticeably slower when the keys are simple, because nothing nice ever comes for free, which is why knowing Comparable
is quite useful for defining how sort
behaves by default.
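Because Array itself is comparable element by element, sort_by can also take an Array as the key to break ties in a predictable way. A sketch using the same word list:

```ruby
words = %w(a fresh lively lemur jumps over a tea kettle)

# Sort by length, then alphabetically among words of equal length:
by_length_then_alpha = words.sort_by { |w| [w.length, w] }
# => ["a", "a", "tea", "over", "fresh", "jumps", "lemur", "kettle", "lively"]

# The same ordering written as a sort comparator:
with_comparator = words.sort { |a, b| [a.length, a] <=> [b.length, b] }

# Negating a numeric key reverses the order:
longest_first = words.sort_by { |w| -w.length }
longest_first.first.length  # => 6
```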
All of these methods require that the underlying class implements <=>
, or you will see errors.
max
gets the item with the maximum value in a collection:
[1, 2, 3].max
# => 3
When provided a Block Function it works much like sort
and its comparator:
[1, 2, 3].max { |a, b| b <=> a }
# => 1
Though if you want to do something like that it probably makes more sense to use min
.
It can also take a number to get multiple max numbers:
[1, 2, 3, 4, 5].max(3)
# => [5, 4, 3]
Those numbers will be ordered from largest to smallest, and this can also take a Block Function the same as above.
max_by
is very much like sort_by
except that it gives the maximum element rather than sorting:
%w(a fresh lively lemur jumps over a tea kettle).max_by(&:length)
# => "lively"
This can also take a number:
%w(a fresh lively lemur jumps over a tea kettle).max_by(2, &:length)
# => ["kettle", "lively"]
sort
mentions this, but elements that return 0
when compared may come back in an unexpected order; in this case, the reverse of the order in which they were found.
min
is the inverse of max
, and works much the same way:
[1, 2, 3].min
# => 1
See max
for accepting a number and a Block Function.
The same can be said of min_by
versus max_by
:
%w(a fresh lively lemur jumps over a tea kettle).min_by(&:length)
# => "a"
See max_by
for more options.
minmax
, however, is a bit different. It returns both the minimum and maximum element in a list as a pair:
[1, 2, 3, 4, 5].minmax
# => [1, 5]
As with the min
methods these behave the same.
minmax_by
follows the same conventions as well:
%w(a fresh lively lemur jumps over a tea kettle).minmax_by(&:length)
# => ["a", "lively"]
See max_by
for more options.
count
counts the number of elements in a collection that match a condition. When supplied no arguments it returns how many elements are in the collection:
[1, 2, 3, 4, 5].count
# => 5
When given a number it searches for that number and gives a count of how many occurrences it found:
[1, 1, 2, 2, 3, 3, 3].count(3)
# => 3
...and finally when given a Block Function it returns back how many elements matched a condition inside of it:
[1, 1, 2, 2, 3, 3, 3].count(&:odd?)
# => 5
count
is one of those methods which comes in handy all the time, and I frequently use it along with the other two methods in this section.
sum
gets the sum of the elements in a collection:
[1, 2, 3].sum
# => 6
It also takes an initial value to start summing from:
[1, 2, 3].sum(1)
# => 7
It can also take a Block Function to define how to sum each element:
[1, 2, 3].sum { |v| v * 2 }
# => 12
Do note that's not a product. We'd need reduce
in a moment for that.
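As a brief sketch of that difference, here is a product written with reduce next to sum, plus what the initial value does for non-numeric elements:

```ruby
total   = [1, 2, 3, 4].sum                            # => 10
product = [1, 2, 3, 4].reduce(1) { |acc, v| acc * v } # => 24
short   = [1, 2, 3, 4].reduce(:*)                     # => 24

# The initial value also decides what kind of thing is being summed:
joined    = %w(a b c).sum('')      # => "abc"
flattened = [[1, 2], [3]].sum([])  # => [1, 2, 3]
```

Without an initial value, sum starts from 0, so summing Strings or Arrays only works once you hand it a matching starting point.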
tally
is admittedly a point of pride for me, as a group of friends and I at RailsCamp had a hand in naming it.
It used to be that you had to do this to get counts indexed by a key:
%w(a fresh lively lemur jumps over a tea kettle)
.group_by(&:itself)
.map { |k, vs| [k, vs.size] }
.to_h
# => {"a"=>2, "fresh"=>1, "lively"=>1, "lemur"=>1, "jumps"=>1, "over"=>1, "tea"=>1, "kettle"=>1}
# or
%w(a fresh lively lemur jumps over a tea kettle)
.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }
# => {"a"=>2, "fresh"=>1, "lively"=>1, "lemur"=>1, "jumps"=>1, "over"=>1, "tea"=>1, "kettle"=>1}
Now with tally
you can do this:
%w(a fresh lively lemur jumps over a tea kettle).tally
# => {"a"=>2, "fresh"=>1, "lively"=>1, "lemur"=>1, "jumps"=>1, "over"=>1, "tea"=>1, "kettle"=>1}
That's much more pleasant. Consider combining it with map
to do even more interesting things:
%w(a fresh lively lemur jumps over a tea kettle)
.map(&:size)
.tally
# => {1=>2, 5=>3, 6=>2, 4=>1, 3=>1}
tally
is one of my most used functions next to map
, filter
, and reduce
. It's really handy to get a quick look at an overview of the data you're looking at in a concise way.
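Since tally returns a plain Hash, the other methods in this post apply to its result as well. A small sketch building on the word-length example:

```ruby
words = %w(a fresh lively lemur jumps over a tea kettle)

sizes = words.map(&:size).tally
# => {1=>2, 5=>3, 6=>2, 4=>1, 3=>1}

# Most common word length, as a [size, count] pair:
most_common = sizes.max_by { |_size, count| count }
# => [5, 3]

# Counts from most to least frequent, with size as a tie-breaker:
ranked = sizes.sort_by { |size, count| [-count, size] }
# => [[5, 3], [1, 2], [6, 2], [3, 1], [4, 1]]
```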
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
TODO
zip
allows us to combine two or more collections into one:
a = [1, 2, 3]
b = [2, 3, 4]
c = [3, 4, 5]
a.zip(b, c)
# => [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
zip
can be useful for merging multiple collections into one, especially when you have things like keys and values as separate variables you need to put together.
It can also take a Block Function which specifies how to zip values:
a = [1, 2, 3]
b = [2, 3, 4]
c = [3, 4, 5]
a.zip(b, c) { |x, y, z| [z, y - x] }
# => nil
Oddly this returns nil
and you have to use an outside array to capture these values. I cannot say I understand this as I might have expected this to behave like map
, but it is as it is. Given that I would suggest avoiding this syntax, as it may be confusing.
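Two patterns that tend to be clearer than the block form, sketched with sample data: pairing keys with values on the way to a Hash, and chaining map after zip when you want the transformed values back.

```ruby
keys   = %i(a b c)
values = [1, 2, 3]

pairs  = keys.zip(values)       # => [[:a, 1], [:b, 2], [:c, 3]]
lookup = keys.zip(values).to_h  # => {:a=>1, :b=>2, :c=>3}

# Shorter arguments are padded with nil:
padded = [1, 2, 3].zip([:x])    # => [[1, :x], [2, nil], [3, nil]]

# Instead of the block form, zip first and then map:
a = [1, 2, 3]
b = [2, 3, 4]
c = [3, 4, 5]
combined = a.zip(b, c).map { |x, y, z| [z, y - x] }
# => [[3, 1], [4, 1], [5, 1]]
```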
first
allows you to get the first few items of a collection:
[1, 2, 3, 4, 5].first(3)
# => [1, 2, 3]
...or if you use it with no arguments it returns the first element:
[1, 2, 3, 4, 5].first
# => 1
Interestingly if you want the first element but still want an Array
returned there's no rule against using 1
for the number of elements:
[1, 2, 3, 4, 5].first(1)
# => [1]
This has been handy for me in the past.
first
is used, as the name suggests, to get the first elements out of a collection. Array
itself also implements last
, being first
's dual method, but that's not Enumerable
specifically.
drop
is an interesting inversion of first
in that it gives you back a new Array
without the first few elements:
[1, 2, 3, 4, 5].drop(3)
# => [4, 5]
drop
can be handy if you want to ignore the first few elements of a collection.
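Taken together, first and drop split a collection at a given point. A sketch showing that relationship, along with a couple of edge cases:

```ruby
items = [1, 2, 3, 4, 5]

head = items.first(2)  # => [1, 2]
tail = items.drop(2)   # => [3, 4, 5]
head + tail == items   # => true

# Asking for more than exists is safe in both directions:
everything = items.first(10)  # => [1, 2, 3, 4, 5]
nothing    = items.drop(10)   # => []

# Both work on other collections too, not only Array:
from_range = (1..Float::INFINITY).first(3)   # => [1, 2, 3]
from_hash  = { a: 1, b: 2, c: 3 }.drop(1)    # => [[:b, 2], [:c, 3]]
```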
On the surface each_entry
looks like an alias for each
. Almost, but not quite:
class Foo
  include Enumerable

  def each
    yield 1
    yield 1, 2
    yield
  end
end
Foo.new.each_entry { |o| p o }
# STDOUT: 1
# STDOUT: [1, 2]
# STDOUT: nil
# => Foo<>
Running the doc site's example through plain each instead shows a few differences:
Foo.new.each { |o| p o }
# STDOUT: 1
# STDOUT: 1
# STDOUT: nil
# => nil
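Notice two things. With plain each the second yield only hands the block its first value, and the return value is nil
rather than the calling object Foo
. It's nil because our each
simply returns the value of its last yield (p nil returns nil), whereas each_entry returns the calling object, just as each
for Array
returns the base object.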
I have not found a use for each_entry
, but will endeavor to find what it might be used for. In the meantime I would encourage the use of each
instead.
each_with_index
, as the name implies, is each
with an index along for the ride:
[1, 2, 3].each_with_index { |v, i| p v + i }
# STDOUT: 1
# STDOUT: 3
# STDOUT: 5
# => [1, 2, 3]
While that can be useful on its own it returns an Enumerator
when not given a Block Function, meaning we can chain onto it:
[1, 2, 3].each_with_index.map { |v, i| v + i }
# => [1, 3, 5]
each_with_index
is useful when we want to iterate along with the index, especially if we need that index for something.
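When the index should start somewhere other than zero, Enumerator#with_index accepts an offset. A short sketch:

```ruby
zero_based = %w(a b c).each_with_index.map { |v, i| "#{i}: #{v}" }
# => ["0: a", "1: b", "2: c"]

# Enumerator#with_index takes a starting offset:
one_based = %w(a b c).each.with_index(1).map { |v, i| "#{i}: #{v}" }
# => ["1: a", "2: b", "3: c"]

# map.with_index skips the intermediate `each`:
direct = %w(a b c).map.with_index(1) { |v, i| "#{i}: #{v}" }
# => ["1: a", "2: b", "3: c"]
```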
reverse_each
iterates in the reverse order:
[1, 2, 3].reverse_each { |v| p v }
# STDOUT: 3
# STDOUT: 2
# STDOUT: 1
# => [1, 2, 3]
Since it returns an Enumerator
when you don't pass a Block Function, like most of these methods do, you can also chain it:
[1, 2, 3].reverse_each.map { |v| v * 2 }
# => [6, 4, 2]
reverse_each
is useful when you want to go backwards, and there will be cases where you want to do that.
to_a
will coerce a collection into an Array
. This isn't very useful for an Array
, sure, but remember that other collection types have Enumerable
like Hash
and Set
:
require 'set'
Set[1, 2, 3].to_a
# => [1, 2, 3]
{ a: 1, b: 2, c: 3 }.to_a
# => [[:a, 1], [:b, 2], [:c, 3]]
Its use is very much "what's on the box". If you need something to be an Array
use to_a.
to_h
is very similar, it coerces a collection to a Hash
:
[[:a, 1], [:b, 2], [:c, 3]].to_h
# => {:a=>1, :b=>2, :c=>3}
One interesting thing you can do is to use something like each_with
methods to do nifty things:
%i(a b c).each_with_index.to_h
# => {:a=>0, :b=>1, :c=>2}
%i(a b c).each_with_object(:default).to_h
# => {:a=>:default, :b=>:default, :c=>:default}
While that alone is interesting, there's a pattern you may recognize from Ruby in days past:
%i(a b c).map { |v| [v, 0] }.to_h
# => {:a=>0, :b=>0, :c=>0}
# Remember, `map` returns an `Array`
{ a: 1, b: 2, c: 3 }.map { |k, v| [k, v + 1] }.to_h
# => {:a=>2, :b=>3, :c=>4}
Guess what else to_h
does? It takes a Block Function:
%i(a b c).to_h { |v| [v, 0] }
# => {:a=>0, :b=>0, :c=>0}
{ a: 1, b: 2, c: 3 }.to_h { |k, v| [k, v + 1] }
# => {:a=>2, :b=>3, :c=>4}
...which is pretty nifty. Granted Hash
has transform_values
now as well, which is probably a better idea for that specific transformation, but it gets the point across.
to_h
is great, much like to_a
, when you need a Hash
. It has some more power behind it for getting there, and a lot of neat things you can do with it as a result.
Now lazy
, that's a subject for an entirely separate post, and that's just what we're going to do here. Rest assured that post will be coming out soon as well, but for now you can just call me.lazy
.
TODO