Article analysing and simplifying the use of EventMachine and Fibers
*~
#*
.#*
scratch*
*.gem
.bundle
Gemfile.lock
pkg/*
require './fiber_helper'
@async = false
@waiting = []
@callbacks = []
def io(id)
@waiting << id
block = proc do |id|
sleep 0.1
puts "IO start: #{id}"
sleep 1
puts "IO done: #{id}"
@waiting.delete(id)
end
if @async
Thread.new do
block.call(id)
end
else
block.call(id)
end
end
def run
count = 0
yield if block_given?
until @callbacks.empty? && @waiting.empty?
count += 1
puts "Tick: #{count}"
callbacks = @callbacks.dup
@callbacks = []
callbacks.each do |callback|
result = callback.call(count) if callback.respond_to?(:call)
@callbacks << callback if result.nil? # assumed intent: re-queue a callback that returned nil
end
sleep 0.2
puts "Remaining: #{@callbacks.length} #{@waiting.length}"
end
end
run {
@async = true
Fiber.new {
count = 1
puts "Result: #{io(count)}"
count += 1
puts "Result: #{io(count)}"
count += 1
puts "Result: #{io(count)}"
count += 1
}.resume
}
require 'fiber'
module Kernel
alias :orig_puts :puts
def puts(*args)
prefix = "#{Fiber.current} - #{caller(1)[0]} - "
out = args.dup
out.collect! { |arg| prefix + "#{arg}" }
orig_puts *out
end
end
class Fiber
class << self
alias :orig_yield :yield
def yield(*args)
orig_puts "#{Fiber.current} - #{caller(1)[0]} - Yielding: #{args}"
result = orig_yield *args
orig_puts "#{Fiber.current} - #{caller(1)[0]} - Yielded: #{args}"
result
end
end
alias :orig_resume :resume
def resume(*args)
orig_puts "#{Fiber.current} - #{caller(1)[0]} - Resuming: #{args}"
result = orig_resume *args
orig_puts "#{Fiber.current} - #{caller(1)[0]} - Resumed: #{args}"
result
end
end
require 'fiber'
@callbacks = []
def run
count = 0
yield if block_given?
until @callbacks.empty?
count += 1
puts "Tick: #{count}"
callbacks = @callbacks.dup
@callbacks = []
callbacks.each do |callback|
callback.call(count) if callback.respond_to?(:call)
end
sleep 0.5
puts "Remaining: #{@callbacks.length}"
end
end
def one(num)
puts "One processing: #{num}"
num
end
def two(num)
puts "Two processing: #{num}"
num
end
def three(num)
puts "Three processing: #{num}"
num
end
def next_tick
@callbacks << lambda { |x| yield x }
end
run {
next_tick do |num|
puts "One: #{one(num)}"
end
next_tick do |num|
puts "Two: #{two(num)}"
end
next_tick do |num|
puts "Three: #{three(num)}"
end
}
run {
puts "Hello World"
}
run {
next_tick do |num|
puts "One: #{one(num)}"
next_tick do |num|
puts "Two: #{two(num)}"
next_tick do |num|
puts "Three: #{three(num)}"
end
end
end
}
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="content-type" content="text/html;charset=utf-8">
<title>reactor.rb</title>
<link rel="stylesheet" href="http://jashkenas.github.com/docco/resources/docco.css">
</head>
<body>
<div id='container'>
<div id="background"></div>
<table cellspacing=0 cellpadding=0>
<thead>
<tr>
<th class=docs><h1>reactor.rb</h1></th>
<th class=code></th>
</tr>
</thead>
<tbody>
<tr id='section-1'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-1">&#182;</a>
</div>
<p>It is easy to find examples of code using <a href="https://github.com/eventmachine/eventmachine">Event Machine</a> with
<a href="http://www.ruby-doc.org/core-1.9.3/Fiber.html">Fibers</a> in Ruby web apps (e.g. <a href="http://www.igvita.com/2009/05/13/fibers-cooperative-scheduling-in-ruby">see this article by Ilya
Grigorik</a>, or the <a href="https://github.com/igrigorik/em-http-request/blob/master/examples/fibered-http.rb">fibered example from
em-http-request</a>). People do this a lot to get the benefit of
scalability that comes from using Event Machine in an I/O intensive
application, without the ugly programming model that you get with lots
of nested callbacks. See below for some examples of what I mean by
that.</p>
<p>Event Machine is a reactor loop. It spins round and round waiting for
asynchronous I/O to update or complete. It is often used when there
is a lot of database traffic to and from a web application, or
increasingly these days, if you need to access web-based APIs from
inside application code. The (huge) benefit is that many independent
requests to these external services can be handled concurrently.</p>
<p>There are special asynchronous versions of, in particular, network I/O
that you need to use in conjunction with Event Machine to get the
benefit of the reactor. These are provided for you nicely wrapped in
a load of &quot;protocol&quot; libraries, e.g. <a href="https://github.com/eventmachine/em-http-request">Event Machine Http
Request</a>. The aim of
this article is to strip away the complications of the Event Machine
and helper APIs and just look at the bare bones of what is happening,
and how a reactor loop interacts with Fibers.</p>
<p>Fibers are a Ruby language feature. One way to think of them is that
they are like threads which you schedule yourself, instead of letting
the virtual machine handle scheduling pre-emptively. They can be used
in combination with the reactor in Event Machine to give the illusion
of serialization - calling methods in the order that you need them.
For the small overhead of using the Fiber API you can order the calls
to the asynchronous APIs naturally, and not worry about the fact that
they do not return their results immediately.</p>
</td>
<td class=code>
<div class='highlight'><pre></pre></div>
</td>
</tr>
<tr id='section-2'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-2">&#182;</a>
</div>
<p>First we pull in the fiber library and enhance it in a helper to
print out some extra debugging information. Replace this with
<code>require &#39;fiber&#39;</code> if you don&#39;t have the helper file.</p>
</td>
<td class=code>
<div class='highlight'><pre><span class="nb">require</span> <span class="s1">&#39;./fiber_helper&#39;</span></pre></div>
</td>
</tr>
<tr id='section-3'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-3">&#182;</a>
</div>
<p>Simple assert method (keeps dependencies to a minimum):</p>
</td>
<td class=code>
<div class='highlight'><pre><span class="k">def</span> <span class="nf">assert</span>
<span class="k">raise</span> <span class="s2">&quot;Assertion failed !&quot;</span> <span class="k">unless</span> <span class="k">yield</span>
<span class="k">end</span></pre></div>
</td>
</tr>
<tr id='section-4'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-4">&#182;</a>
</div>
<p>This is an array of callbacks for the reactor to run:</p>
<p>When it runs out of stuff to do it will terminate.</p>
</td>
<td class=code>
<div class='highlight'><pre><span class="vi">@callbacks</span> <span class="o">=</span> <span class="o">[]</span></pre></div>
</td>
</tr>
<tr id='section-5'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-5">&#182;</a>
</div>
<p>This is just a running counter of the number of iterations of the
reactor loop. It is passed into the callbacks, so if they want to
they can use it to &quot;time&quot; their work.</p>
</td>
<td class=code>
<div class='highlight'><pre><span class="vi">@count</span> <span class="o">=</span> <span class="mi">0</span></pre></div>
</td>
</tr>
<tr id='section-6'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-6">&#182;</a>
</div>
<p>This is the definition of the reactor. It is one method that accepts
a block to run, like a simplified version of
<a href="http://eventmachine.rubyforge.org/EventMachine.html#M000461">EM.run</a>. If the block schedules more work by
appending callbacks to <code>@callbacks</code> then the loop runs them; if
those callbacks in turn create more work, that work is
scheduled for the next iteration.</p>
<p>This simulates the Event Machine reactor loop, in the sense that
everything you pass in is executed asynchronously (just like in Event
Machine), but also simplifies the reactor because the work is always
executed in the next tick, instead of possibly having to wait for I/O
to complete. This makes it easy to reason about without introducing
any additional dependencies that might complicate things. You should
be able to get the same results with <code>EM.run</code>.</p>
</td>
<td class=code>
<div class='highlight'><pre><span class="k">def</span> <span class="nf">run</span></pre></div>
</td>
</tr>
<tr id='section-7'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-7">&#182;</a>
</div>
<p>Reset the counter</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="vi">@count</span> <span class="o">=</span> <span class="mi">0</span></pre></div>
</td>
</tr>
<tr id='section-8'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-8">&#182;</a>
</div>
<p>If there is a block passed in to <code>run</code>, call it here first so it gets a
chance to schedule some work</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="k">yield</span> <span class="k">if</span> <span class="nb">block_given?</span></pre></div>
</td>
</tr>
<tr id='section-9'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-9">&#182;</a>
</div>
<p>Here is the main loop. It keeps going until there is no more work to do.</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="k">until</span> <span class="vi">@callbacks</span><span class="o">.</span><span class="n">empty?</span>
<span class="vi">@count</span> <span class="o">+=</span> <span class="mi">1</span></pre></div>
</td>
</tr>
<tr id='section-10'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-10">&#182;</a>
</div>
<p>Borrowing from Event Machine we refer to an iteration of the
reactor as a &quot;tick&quot;. Thus <code>@count</code> is counting ticks.</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="nb">puts</span> <span class="s2">&quot;Tick: </span><span class="si">#{</span><span class="vi">@count</span><span class="si">}</span><span class="s2">&quot;</span>
<span class="n">work</span> <span class="o">=</span> <span class="vi">@callbacks</span><span class="o">.</span><span class="n">dup</span></pre></div>
</td>
</tr>
<tr id='section-11'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-11">&#182;</a>
</div>
<p>Reset <code>@callbacks</code> getting ready for next tick</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="vi">@callbacks</span> <span class="o">=</span> <span class="o">[]</span></pre></div>
</td>
</tr>
<tr id='section-12'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-12">&#182;</a>
</div>
<p>Call the callbacks and give them a chance to schedule more work</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="n">work</span><span class="o">.</span><span class="n">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">callback</span><span class="o">|</span>
<span class="n">callback</span><span class="o">.</span><span class="n">call</span><span class="p">(</span><span class="vi">@count</span><span class="p">)</span> <span class="k">if</span> <span class="n">callback</span><span class="o">.</span><span class="n">respond_to?</span><span class="p">(</span><span class="ss">:call</span><span class="p">)</span>
<span class="k">end</span></pre></div>
</td>
</tr>
<tr id='section-13'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-13">&#182;</a>
</div>
<p>Slow down the loop, so you can see it ticking (it&#39;s just a demo
- the Event Machine would loop back and start work on the next
tick immediately)</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="nb">sleep</span> <span class="mi">0</span><span class="o">.</span><span class="mi">5</span>
<span class="k">end</span>
<span class="k">end</span></pre></div>
</td>
</tr>
<tr id='section-14'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-14">&#182;</a>
</div>
<p>A convenience method to schedule some work. Pass in a block to have
it executed &quot;later&quot; (actually it will be called on the next tick of
the reactor loop).</p>
<p>N.B. Event Machine itself has a <code>next_tick</code> method that does the same
thing.</p>
</td>
<td class=code>
<div class='highlight'><pre><span class="k">def</span> <span class="nf">later</span>
<span class="nb">puts</span> <span class="s2">&quot;Appending&quot;</span></pre></div>
</td>
</tr>
<tr id='section-15'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-15">&#182;</a>
</div>
<p>Schedule some work, just yielding to the block given to <code>later</code>.
The argument <code>|x|</code> will be the current tick count.</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="vi">@callbacks</span> <span class="o">&lt;&lt;</span> <span class="nb">lambda</span> <span class="k">do</span> <span class="o">|</span><span class="n">x</span><span class="o">|</span>
<span class="nb">puts</span> <span class="s2">&quot;Executing: </span><span class="si">#{</span><span class="n">x</span><span class="si">}</span><span class="s2">&quot;</span>
<span class="k">yield</span> <span class="n">x</span> <span class="k">if</span> <span class="nb">block_given?</span>
<span class="k">end</span>
<span class="k">end</span></pre></div>
</td>
</tr>
<tr id='section-16'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-16">&#182;</a>
</div>
<p>A &quot;business&quot; method, just prepends <code>#{prefix}-</code> to the input</p>
</td>
<td class=code>
<div class='highlight'><pre><span class="k">def</span> <span class="nf">prepend</span><span class="p">(</span><span class="n">prefix</span><span class="p">,</span> <span class="n">input</span><span class="p">)</span>
<span class="nb">puts</span> <span class="s2">&quot;Processing: [</span><span class="si">#{</span><span class="n">prefix</span><span class="si">}</span><span class="s2">,</span><span class="si">#{</span><span class="n">input</span><span class="si">}</span><span class="s2">]&quot;</span>
<span class="s2">&quot;</span><span class="si">#{</span><span class="n">prefix</span><span class="si">}</span><span class="s2">-</span><span class="si">#{</span><span class="n">input</span><span class="si">}</span><span class="s2">&quot;</span>
<span class="k">end</span></pre></div>
</td>
</tr>
<tr id='section-17'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-17">&#182;</a>
</div>
<p>This is an example of correct, but ugly, use of the reactor loop to do
asynchronous work. Each invocation of <code>later</code> returns immediately,
handing control back to the reactor. To execute code sequentially, it
has to be nested in the callbacks, resulting in code that is hard to
read and also hard to extract into helper methods. There are three
nested calls to <code>later</code> here, so the work gets done within three ticks
of the reactor.</p>
</td>
<td class=code>
<div class='highlight'><pre><span class="n">run</span> <span class="p">{</span>
<span class="n">later</span> <span class="p">{</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">prepend</span><span class="p">(</span><span class="s1">&#39;one&#39;</span><span class="p">,</span> <span class="s1">&#39;done&#39;</span><span class="p">)</span></pre></div>
</td>
</tr>
<tr id='section-18'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-18">&#182;</a>
</div>
<p>We can&#39;t simply end the block and return control here because we
want to accumulate the result and the work has not been done
yet. We need another, nested, callback to make use of the
result.</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="n">later</span> <span class="p">{</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">prepend</span><span class="p">(</span><span class="s1">&#39;two&#39;</span><span class="p">,</span> <span class="n">result</span><span class="p">)</span>
<span class="n">later</span> <span class="p">{</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">prepend</span><span class="p">(</span><span class="s1">&#39;three&#39;</span><span class="p">,</span> <span class="n">result</span><span class="p">)</span>
<span class="nb">puts</span> <span class="s2">&quot;Result: </span><span class="si">#{</span><span class="n">result</span><span class="si">}</span><span class="s2">&quot;</span></pre></div>
</td>
</tr>
<tr id='section-19'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-19">&#182;</a>
</div>
<p>At the end of the chain we have prepended three times:</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="n">assert</span> <span class="p">{</span> <span class="n">result</span> <span class="o">==</span> <span class="s2">&quot;three-two-one-done&quot;</span> <span class="p">}</span></pre></div>
</td>
</tr>
<tr id='section-20'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-20">&#182;</a>
</div>
<p>We finished on the third tick:</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="n">assert</span> <span class="p">{</span> <span class="vi">@count</span> <span class="o">==</span> <span class="mi">3</span> <span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">}</span></pre></div>
</td>
</tr>
<tr id='section-21'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-21">&#182;</a>
</div>
<p>A convenience function that shows the typical Fiber idiom with a
reactor loop like Event Machine. You pass it a block to execute later,
and it wraps it in a call to the Fiber APIs so that it can be treated
as a synchronous call, even though it is happening asynchronously.</p>
<p>The trick is composed of three parts:</p>
<ol>
<li><p>Use of <code>Fiber::yield</code> to return immediately to the program flow in
the original caller of <code>Fiber#resume</code>.</p></li>
<li><p>Use of <code>Fiber#resume</code> with an argument to return a value back to
the yield point.</p></li>
<li><p>Taking advantage of that argument coming back to yield a result to
the caller of <code>fiberwise</code>.</p></li>
</ol>
<p>To use this trick the reactor loop has to be started inside a Fiber
(see example below), otherwise the caller will see an error (&quot;can&#39;t
yield from root fiber (FiberError)&quot;).</p>
</td>
<td class=code>
<div class='highlight'><pre><span class="k">def</span> <span class="nf">fiberwise</span>
<span class="n">fibre</span> <span class="o">=</span> <span class="no">Fiber</span><span class="o">.</span><span class="n">current</span>
<span class="n">later</span> <span class="k">do</span> <span class="o">|</span><span class="n">it</span><span class="o">|</span>
<span class="n">result</span> <span class="o">=</span> <span class="k">yield</span> <span class="n">it</span></pre></div>
</td>
</tr>
<tr id='section-22'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-22">&#182;</a>
</div>
<p>This is the second part of the trick: resume the Fiber and pass
back the result to the yield point.</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="n">fibre</span><span class="o">.</span><span class="n">resume</span> <span class="n">result</span>
<span class="k">end</span></pre></div>
</td>
</tr>
<tr id='section-23'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-23">&#182;</a>
</div>
<p>The first time we hit this line is the first part of the trick: it
returns control to the original resumer of the Fiber (the reactor
loop), but crucially <em>does not</em> return from the <code>fiberwise</code>
method.</p>
<p>The second time we hit this line is the third part of the trick:
it happens after the <code>fibre.resume</code> call above, which is called
from the <code>later</code> block, and it is at that point that the caller of
<code>fiberwise</code> gets its result.</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="no">Fiber</span><span class="o">.</span><span class="n">yield</span>
<span class="k">end</span></pre></div>
</td>
</tr>
<tr id='section-24'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-24">&#182;</a>
</div>
<p>Using the <code>fiberwise</code> method to schedule work in a Fiber. To use it
the reactor has to be started with <code>Fiber.new { ... }.resume</code>.</p>
</td>
<td class=code>
<div class='highlight'><pre><span class="n">run</span> <span class="p">{</span>
<span class="no">Fiber</span><span class="o">.</span><span class="n">new</span> <span class="p">{</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">fiberwise</span> <span class="p">{</span></pre></div>
</td>
</tr>
<tr id='section-25'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-25">&#182;</a>
</div>
<p>Do some work later in a Fiber, the result of which is to be
assigned to a variable <code>result</code>.</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="n">prepend</span><span class="p">(</span><span class="s1">&#39;one&#39;</span><span class="p">,</span> <span class="s1">&#39;done&#39;</span><span class="p">)</span></pre></div>
</td>
</tr>
<tr id='section-26'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-26">&#182;</a>
</div>
<p>This is a bit complicated, but it&#39;s the crucial difference of
using Fibers:</p>
<ul>
<li><p>At this point the Fiber has yielded, returning control to
the first call to <code>Fiber#resume</code>, which drops us into the
reactor loop.</p></li>
<li><p>The loop goes round and the work scheduled above is
executed, and when that is finished <code>fiberwise</code> also calls
<code>Fiber#resume</code>, this time with an argument (the result of the
nested block).</p></li>
<li><p>Control returns to the yield point and returns the result to
the caller of <code>fiberwise</code>.</p></li>
</ul>
</td>
<td class=code>
<div class='highlight'><pre> <span class="p">}</span></pre></div>
</td>
</tr>
<tr id='section-27'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-27">&#182;</a>
</div>
<p>The first tick has now completed before we schedule any more
work, so we can accumulate the result sequentially instead of
in nested callbacks.</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="n">result</span> <span class="o">=</span> <span class="n">fiberwise</span> <span class="p">{</span>
<span class="n">prepend</span><span class="p">(</span><span class="s1">&#39;two&#39;</span><span class="p">,</span> <span class="n">result</span><span class="p">)</span>
<span class="p">}</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">fiberwise</span> <span class="p">{</span>
<span class="n">prepend</span><span class="p">(</span><span class="s1">&#39;three&#39;</span><span class="p">,</span> <span class="n">result</span><span class="p">)</span>
<span class="p">}</span>
<span class="nb">puts</span> <span class="s2">&quot;Result: </span><span class="si">#{</span><span class="n">result</span><span class="si">}</span><span class="s2">&quot;</span></pre></div>
</td>
</tr>
<tr id='section-28'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-28">&#182;</a>
</div>
<p>At the end of the chain we have prepended three times:</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="n">assert</span> <span class="p">{</span> <span class="n">result</span> <span class="o">==</span> <span class="s2">&quot;three-two-one-done&quot;</span> <span class="p">}</span></pre></div>
</td>
</tr>
<tr id='section-29'>
<td class=docs>
<div class="pilwrap">
<a class="pilcrow" href="#section-29">&#182;</a>
</div>
<p>We finished on the third tick:</p>
</td>
<td class=code>
<div class='highlight'><pre> <span class="n">assert</span> <span class="p">{</span> <span class="vi">@count</span> <span class="o">==</span> <span class="mi">3</span> <span class="p">}</span>
<span class="p">}</span><span class="o">.</span><span class="n">resume</span>
<span class="p">}</span></pre></div>
</td>
</tr>
</table>
</div>
</body>
=begin
It is easy to find examples of code using [Event Machine][em] with
[Fibers][fiber] in Ruby web apps (e.g. [see this article by Ilya
Grigorik][ilya], or the [fibered example from
em-http-request][emhttp]). People do this a lot to get the benefit of
scalability that comes from using Event Machine in an I/O intensive
application, without the ugly programming model that you get with lots
of nested callbacks. See below for some examples of what I mean by
that.
Event Machine is a reactor loop. It spins round and round waiting for
asynchronous I/O to update or complete. It is often used when there
is a lot of database traffic to and from a web application, or
increasingly these days, if you need to access web-based APIs from
inside application code. The (huge) benefit is that many independent
requests to these external services can be handled concurrently.
There are special asynchronous versions of, in particular, network I/O
that you need to use in conjunction with Event Machine to get the
benefit of the reactor. These are provided for you nicely wrapped in
a load of "protocol" libraries, e.g. [Event Machine Http
Request](https://github.com/eventmachine/em-http-request). The aim of
this article is to strip away the complications of the Event Machine
and helper APIs and just look at the bare bones of what is happening,
and how a reactor loop interacts with Fibers.
Fibers are a Ruby language feature. One way to think of them is that
they are like threads which you schedule yourself, instead of letting
the virtual machine handle scheduling pre-emptively. They can be used
in combination with the reactor in Event Machine to give the illusion
of serialization - calling methods in the order that you need them.
For the small overhead of using the Fiber API you can order the calls
to the asynchronous APIs naturally, and not worry about the fact that
they do not return their results immediately.
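
If you have not used the Fiber API before, this minimal sketch shows the
"schedule it yourself" idea on its own, with no reactor involved. The
`counting_fiber` helper is a name made up for this illustration and is not
used anywhere else in the article.

```ruby
# Hypothetical helper: builds a Fiber that hands back three values,
# pausing after each of the first two.
def counting_fiber
  Fiber.new do
    Fiber.yield 1 # pause here; the caller's resume returns 1
    Fiber.yield 2 # pause again; the caller's resume returns 2
    3             # the block's final value is returned by the last resume
  end
end

fiber = counting_fiber
puts fiber.resume # runs up to the first Fiber.yield
puts fiber.resume # continues up to the second Fiber.yield
puts fiber.resume # runs to the end of the block
```

Nothing inside the Fiber runs until the caller asks for it with `resume`,
which is exactly the hook a reactor loop needs.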
You can get the source code for this article from
[Gist](https://gist.github.com/2112117).
[em]: https://github.com/eventmachine/eventmachine
[emhttp]: https://github.com/igrigorik/em-http-request/blob/master/examples/fibered-http.rb
[fiber]: http://www.ruby-doc.org/core-1.9.3/Fiber.html
[ilya]: http://www.igvita.com/2009/05/13/fibers-cooperative-scheduling-in-ruby
=end
# First we pull in the fiber library and enhance it in a helper to
# print out some extra debugging information. Replace this with
# `require 'fiber'` if you don't have the helper file.
require './fiber_helper'
# Simple assert method (keeps dependencies to a minimum):
def assert
raise "Assertion failed !" unless yield
end
# This is an array of callbacks for the reactor to run:
#
# When it runs out of stuff to do it will terminate.
@callbacks = []
# This is just a running counter of the number of iterations of the
# reactor loop. It is passed into the callbacks, so if they want to
# they can use it to "time" their work.
@count = 0
=begin
This is the definition of the reactor. It is one method that accepts
a block to run, like a simplified version of
[EM.run][emrun]. If the block schedules more work by
appending callbacks to `@callbacks` then the loop runs them; if
those callbacks in turn create more work, that work is
scheduled for the next iteration.
This simulates the Event Machine reactor loop, in the sense that
everything you pass in is executed asynchronously (just like in Event
Machine), but also simplifies the reactor because the work is always
executed in the next tick, instead of possibly having to wait for I/O
to complete. This makes it easy to reason about without introducing
any additional dependencies that might complicate things. You should
be able to get the same results with `EM.run`.
[emrun]: http://eventmachine.rubyforge.org/EventMachine.html#M000461
=end
def run
# Reset the counter
@count = 0
# If there is a block passed in to `run`, call it here first so it gets a
# chance to schedule some work
yield if block_given?
# Here is the main loop. It keeps going until there is no more work to do.
until @callbacks.empty?
@count += 1
# Borrowing from Event Machine we refer to an iteration of the
# reactor as a "tick". Thus `@count` is counting ticks.
puts "Tick: #{@count}"
work = @callbacks.dup
# Reset `@callbacks` getting ready for next tick
@callbacks = []
# Call the callbacks and give them a chance to schedule more work
work.each do |callback|
callback.call(@count) if callback.respond_to?(:call)
end
# Slow down the loop, so you can see it ticking (it's just a demo
# - the Event Machine would loop back and start work on the next
# tick immediately)
sleep 0.5
end
end
=begin
A convenience method to schedule some work. Pass in a block to have
it executed "later" (actually it will be called on the next tick of
the reactor loop).
N.B. Event Machine itself has a `next_tick` method that does the same
thing.
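
To see why work scheduled from inside a callback only runs on the following
tick, here is a small self-contained sketch. It uses a local `queue` as a
stand-in for `@callbacks`; the `tick_log` method and its local names are
invented for this illustration.

```ruby
# Hypothetical demo: returns a log of what ran, and on which tick.
def tick_log
  queue = [] # pending callbacks, playing the role of @callbacks
  log = []

  # Schedule an outer callback which, when run, schedules an inner one.
  queue << lambda do |tick|
    log << "outer ran on tick #{tick}"
    queue << lambda { |t| log << "inner ran on tick #{t}" }
  end
  log << "scheduled" # scheduling returns at once; nothing has run yet

  tick = 0
  until queue.empty?
    tick += 1
    work = queue.dup # snapshot this tick's work...
    queue.clear      # ...so anything scheduled now waits for the next tick
    work.each { |callback| callback.call(tick) }
  end
  log
end

puts tick_log
```

The log comes out as "scheduled", then the outer callback on tick 1, then
the inner callback on tick 2.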
=end
def later
puts "Appending"
# Schedule some work, just yielding to the block given to `later`.
# The argument `|x|` will be the current tick count.
@callbacks << lambda do |x|
puts "Executing: #{x}"
yield x if block_given?
end
end
# A "business" method, just prepends `#{prefix}-` to the input
def prepend(prefix, input)
puts "Processing: [#{prefix},#{input}]"
"#{prefix}-#{input}"
end
=begin
This is an example of correct, but ugly, use of the reactor loop to do
asynchronous work. Each invocation of `later` returns immediately,
handing control back to the reactor. To execute code sequentially, it
has to be nested in the callbacks, resulting in code that is hard to
read and also hard to extract into helper methods. There are three
nested calls to `later` here, so the work gets done within three ticks
of the reactor.
=end
run {
later {
result = prepend('one', 'done')
# We can't simply end the block and return control here because we
# want to accumulate the result and the work has not been done
# yet. We need another, nested, callback to make use of the
# result.
later {
result = prepend('two', result)
later {
result = prepend('three', result)
puts "Result: #{result}"
# At the end of the chain we have prepended three times:
assert { result == "three-two-one-done" }
# We finished on the third tick:
assert { @count == 3 }
}
}
}
}
=begin
A convenience function that shows the typical Fiber idiom with a
reactor loop like Event Machine. You pass it a block to execute later,
and it wraps it in a call to the Fiber APIs so that it can be treated
as a synchronous call, even though it is happening asynchronously.
The trick is composed of three parts:
1. Use of `Fiber::yield` to return immediately to the program flow in
the original caller of `Fiber#resume`.
2. Use of `Fiber#resume` with an argument to return a value back to
the yield point.
3. Taking advantage of that argument coming back to yield a result to
the caller of `fiberwise`.
To use this trick the reactor loop has to be started inside a Fiber
(see example below), otherwise the caller will see an error ("can't
yield from root fiber (FiberError)").
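
The three parts can be seen in isolation in the sketch below. It is not the
article's implementation: `await_value` is a made-up stand-in for
`fiberwise`, the `queue` argument stands in for `@callbacks`, and a single
`shift.call` plays the role of one tick of the reactor.

```ruby
require 'fiber' # only needed for Fiber.current on older Rubies

# Hypothetical stand-in for `fiberwise`, taking the queue explicitly.
def await_value(queue)
  fiber = Fiber.current
  # Part 2 happens inside this callback: resume the fiber with the result.
  queue << lambda { fiber.resume(yield) }
  # Part 1: suspend the fiber. Part 3: when it is resumed, the argument
  # passed to resume becomes the return value of Fiber.yield, and so of
  # this method.
  Fiber.yield
end

def run_trick_demo
  queue = []
  trace = []
  Fiber.new do
    trace << :before
    trace << await_value(queue) { 6 * 7 }
  end.resume
  trace << :suspended # the fiber has yielded, so control is back here
  queue.shift.call    # one "tick": run the work, which resumes the fiber
  trace
end

p run_trick_demo # prints [:before, :suspended, 42]
```

The `:suspended` entry lands between the other two, which shows that the
caller of `await_value` really did pause until the scheduled work had run.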
=end
def fiberwise
fibre = Fiber.current
later do |it|
result = yield it
# This is the second part of the trick: resume the Fiber and pass
# back the result to the yield point.
fibre.resume result
end
# The first time we hit this line is the first part of the trick: it
# returns control to the original resumer of the Fiber (the reactor
# loop), but crucially _does not_ return from the `fiberwise`
# method.
#
# The second time we hit this line is the third part of the trick:
# it happens after the `fibre.resume` call above, which is called
# from the `later` block, and it is at that point that the caller of
# `fiberwise` gets its result.
Fiber.yield
end
=begin
Using the `fiberwise` method to schedule work in a Fiber. To use it
the reactor has to be started with `Fiber.new { ... }.resume`.
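
A quick way to convince yourself that the wrapping Fiber is needed is the
sketch below. The `try_yield` method is invented for this illustration. It
rescues the error class rather than matching on the message, because the
exact wording of the message differs between Ruby versions.

```ruby
# Hypothetical probe: tries to yield and reports what happened.
def try_yield
  Fiber.yield :paused
  :finished
rescue FiberError
  :fiber_error
end

# Called directly there is no enclosing Fiber to yield from.
p try_yield # prints :fiber_error

# Called inside a Fiber the yield works and the method can be resumed.
fiber = Fiber.new { try_yield }
p fiber.resume # prints :paused
p fiber.resume # prints :finished
```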
=end
run {
Fiber.new {
result = fiberwise {
# Do some work later in a Fiber, the result of which is to be
# assigned to a variable `result`.
prepend('one', 'done')
# This is a bit complicated, but it's the crucial difference of
# using Fibers:
#
# * At this point the Fiber has yielded, returning control to
# the first call to `Fiber#resume`, which drops us into the
# reactor loop.
#
# * The loop goes round and the work scheduled above is
# executed, and when that is finished `fiberwise` also calls
# `Fiber#resume`, this time with an argument (the result of the
# nested block).
#
# * Control returns to the yield point and returns the result to
# the caller of `fiberwise`.
}
# The first tick has now completed before we schedule any more
# work, so we can accumulate the result sequentially instead of
# in nested callbacks.
result = fiberwise {
prepend('two', result)
}
result = fiberwise {
prepend('three', result)
}
puts "Result: #{result}"
# At the end of the chain we have prepended three times:
assert { result == "three-two-one-done" }
# We finished on the third tick:
assert { @count == 3 }
}.resume
}