@lisp-ceo
Created February 15, 2016 23:23

Ruby Memory Model
Indented italic paragraphs are used for notes.
This document is a work in progress. The intentions of this effort and document are: to summarize the behavior of Ruby in concurrent and parallel environments, to initiate the discussion, to identify problems in the document, to find flaws in Ruby implementations if any, to suggest what has to be enhanced in Ruby itself, and to cooperate on adding the enhancements to Ruby implementations.
It is not the intention of this effort to introduce high-level concurrency abstractions like actors to the language, but rather to improve low-level concurrency support so that many more concurrency abstractions can be added through gems.
For readers and reviewers: Thank you for taking the time to read this document and provide feedback or ask questions; both are very valuable. The document is locked against direct editing, but please comment by selecting text and clicking the “Insert comment” button, or you can suggest changes by directly editing the text. You can choose to be notified about all comments by navigating to: Comments > Notifications > All. Even though you can comment without signing in to Google, please do sign in. It will make the discussion easier to navigate, and it will allow you to be contacted for further questions if needed.
Ruby memory model
Why? Ruby is a great and expressive language, but it lacks support for well-defined low-level concurrent and parallel computation. It is hoped that this document will provide the ground steps for Ruby to become as good in this area as it is in others.
Without a memory model it's very hard to write concurrent abstractions for Ruby. Writing a proper concurrent abstraction often means reimplementing it more than once for different Ruby runtimes, which is very time-consuming and error-prone. This memory model, together with the extensions in the concurrent-ruby gem, allows any abstraction to be implemented only once.
The Ruby memory model is a framework for reasoning about programs in concurrent and parallel environments. It defines which variable writes can be observed by a particular variable read, which is essential for determining whether a program is correct. This is achieved by defining which subset of all possible program execution orders is allowed.
The memory model described in this document is intentionally weak so as not to hinder performance. It was extended in the concurrent-ruby gem (described in a different document) to make it possible to write concurrent abstractions effectively. Based on the experience collected with the concurrent-ruby extensions, another document was created which proposes similar extensions to the Ruby language itself.
Sources used for the memory model:
Java memory model, and its FAQ
Java Memory Model Pragmatics
atomic<> Weapons 1 and 2
C++11 interactive MM
Go memory model
Sources for the concurrent behavior of Ruby implementations:
Source code.
JRuby's wiki page
Rubinius's wiki page
A similar document for MRI was not found. A key fact about MRI is the GVL (Global VM Lock), which ensures that only one thread can interpret Ruby code at any given time. When the GVL is handed from one thread to another, a mutex is released by the first and acquired by the second thread, implying that everything done by the first thread is visible (published) to the second thread. See thread_pthread.c and thread_win32.c.
This memory model was created by comparing MRI, JRuby, JRuby+Truffle and Rubinius, by taking into account the limitations of the implementations and their platforms, and by drawing inspiration from other existing memory models (Java and C++11). This is not a formal model.
The happens-before relation
This text was adapted from the Go memory model.
Within a single Thread, reads and writes must behave as if they are executed in the order specified by the program. Compilers and processors may reorder the reads and writes executed within a single Thread only when the reordering does not change the behavior within the Thread. Because of this reordering, the execution order observed by one Thread may differ from the order perceived by another. For example, if one Thread executes a = 1; b = 2, another might observe the updated value of b before the updated value of a.
To specify the requirements of reads and writes, the happens-before relation is defined. It is a partial order on the execution of memory operations in a Ruby program. If event e1 happens-before event e2, then e2 happens-after e1. Also, if e1 does not happen-before e2 and does not happen-after e2, then e1 and e2 happen concurrently (in other words, the order between events e1 and e2 is not defined, and they may also be executed in parallel).
Within a single Thread, the happens-before order is the order expressed by the program.
A read r of a variable v is allowed to observe a write w to v if both of the following hold:
r does not happen-before w.
There is no other write w' to v that happens after w but before r.
To guarantee that a read r of a variable v observes a particular write w to v, we must ensure that w is the only write r is allowed to observe. That is, r is guaranteed to observe w if both of the following hold:
w happens before r.
Any other write to the shared variable v either happens before w or after r.
This pair of conditions is stronger than the first pair; it requires that there are no other writes happening concurrently with w or r.
Within a single Thread, for a variable v that is not shared, there is no concurrency, so the two definitions are equivalent: a read r observes the value written by the most recent write w to v. When multiple Threads access a shared variable v, they must use synchronization events to establish happens-before conditions that ensure reads observe the desired writes.
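The conditions above can be illustrated with a small Ruby sketch (all variable names here are illustrative, not part of the model). A Mutex is one synchronization event that establishes happens-before: a release of the lock happens-before a later acquire of the same lock, so writes made inside the critical section are guaranteed to be observable by reads in a later critical section.

```ruby
# Sketch: establishing happens-before with a Mutex (names illustrative).
mutex = Mutex.new
data  = nil
ready = false

writer = Thread.new do
  mutex.synchronize do
    data  = 42     # write w ...
    ready = true   # ... ordered before the mutex release
  end
end

value = nil
loop do
  mutex.synchronize do
    # read r: the mutex release in the writer happens-before this acquire,
    # so once ready is observed as true, data is guaranteed to be 42
    value = data if ready
  end
  break unless value.nil?
  Thread.pass
end
writer.join
value  # guaranteed to observe 42
```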
Later sections of this document add new operations which create happens-before order, such as the volatility property.
Core behavior
The following sections cover the various storages in the Ruby language (e.g. local variables, instance variables, etc.). We consider the following operations:
read - reading a value from an already defined storage
write - writing a value to an already defined storage
define - creating a new storage and storing the default value or a supplied value
undefine - removing an existing storage
Key properties are:
volatility (V) - A written value is immediately visible to any subsequent volatile read of the same variable on any Thread. It has the same meaning as in Java (the volatile keyword in C has a different meaning); it provides sequential consistency. A volatile write happens-before any subsequent volatile read of the same variable.
Some variable types do not make sense without volatility (e.g. global variables). Other types would have big performance penalty (e.g. local and instance variables). Therefore each type has to be considered separately.
atomicity (A) - An operation is either done as a whole or not at all.
serializability (S) - Operations are serialized in some order (they cannot disappear). This is a new property not mentioned in other memory models, since Java and C++ do not have dynamically defined fields. The scope of the serialized operations is their container, e.g. an object is a serializability scope for instance variables.
Each section will describe what properties a given storage has for the above-listed operations and will argue why.
All operations are atomic on all storages. This implies that Ruby implementations on the JVM which use long for Integers (when they fit) or double for Floats have to ensure that read and write operations are done atomically. Read and write operations were chosen to be atomic since everything in Ruby is an Object, and therefore it’s natural that reference assignment behaves atomically.
Define and undefine operations are considered atomic for pragmatic reasons, otherwise users would have to manually synchronize against the serializability scope whenever there is new storage defined.
Another problematic part is the implementation of a non-volatile storage with serializability (hereafter the non-volatile serializability storage problem). The write and read operations are non-volatile, therefore they should work without any synchronisation or memory barriers to achieve optimal performance. However, they also have the serializability property with regard to variable definition and undefinition. This presents an interesting problem: since the number of variables is dynamic, the implementation of the value storage usually contains an array of values which has to be grown when it's too small. The expansion of the array, combined with parallel updates of the values in the old array, can lead to lost variable updates.
It can happen as follows. Let’s have two threads, t1 and t2. t1 is updating a variable v1 which is only accessed from t1 (therefore it does not need to be synchronised). Simultaneously, t2 adds a new variable, triggering array expansion. If the ordering is: t1 reads the storage array, t2 creates a new storage array, t2 copies the old values to the new storage array, t1 updates the value in the old storage array, t2 stores the new storage array, t1 later reads the value of the variable, then t1 will get the old value copied by t2, and it will appear to t1 as if its variable update was lost for no reason. For that reason no write can be executed during storage array expansion.
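The interleaving above can be replayed deterministically in a single thread. In this sketch the “storage array” is modeled as a Hash and all names are illustrative; the point is only to show how the copy-then-publish expansion discards a concurrent write.

```ruby
# Deterministic single-threaded replay of the lost-update interleaving.
# The storage array is modeled as a Hash; all names are illustrative.
storage = { v1: 0 }            # shared storage, v1 is only used by t1

t1_view     = storage          # t1 reads the storage reference
new_storage = storage.dup      # t2 creates a new storage, copying old values
new_storage[:v2] = nil         # t2 defines the new variable
t1_view[:v1] = 42              # t1 writes v1 into the OLD storage
storage = new_storage          # t2 publishes the new storage

storage[:v1]                   # the copy was taken before t1's write: update lost
```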
Local variables
volatility - no
atomicity - yes
serializability - no
scope - frame
Local variables are determined during parsing. They cannot be dynamically added to a frame defined in source code, therefore these frames do not suffer from the non-volatile serializability storage problem: since there are no (un)define operations to serialize with, they behave as if they had the serializability property. Read and write operations have optimal performance.
Binding objects obtained from Kernel#binding or Proc#binding have another internal frame holding variables newly defined with local_variable_set (otherwise it would be possible to define local variables in a frame defined in the source code). Since local variables do not have serializability, local variable writes and definitions have to be protected by the user to avoid lost updates.
Local variables cannot be volatile for performance reasons. It would effectively prevent the compiler from doing optimizations, since nothing could be reordered.
Since local variables are not volatile an old value or a default value can be read. However, out-of-thin-air values are not allowed.
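The split between parsed locals and the Binding's extra frame can be seen with the real Binding API (the variable names are mine). A name already parsed as a local is updated in place, while a new name defined through the Binding lives only in the Binding's internal frame:

```ruby
# Sketch: parsed locals vs. locals defined dynamically through a Binding.
x = 1                          # a local variable determined during parsing
b = binding

b.local_variable_set(:x, 2)    # updates the parsed local in its frame
b.local_variable_set(:y, 3)    # :y was never parsed; it is stored in the
                               # Binding's extra internal frame only

vals = [x, b.local_variable_get(:y)]  # x is now 2; y is reachable only via b
```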
Instance variables
volatility - no
atomicity - yes
serializability - yes
scope - an object
Newly defined instance variables have to become visible eventually. As discussed above, instance variable writes cannot be lost.
Instance variables cannot be volatile for performance reasons. It would hurt performance of read and write operations on instance variables of objects which are not shared between threads, which is the case of most objects.
Since instance variables are not volatile an old value or a default value can be read. However, out-of-thin-air values are not allowed.
Class variables
volatility - yes
atomicity - yes
serializability - yes
scope - all children and all parents of a class where it’s defined
Class variable resolution leads to quite a big scope. Class variables are somewhat deprecated and used only rarely, for globally shared objects and flags. Therefore making class variables volatile is a natural choice, as the performance impact is small and the volatility guarantees are very desirable.
One implementation challenge is that defining a new class variable should be performed atomically and with sufficient synchronization, so that only one storage for the given name exists at any time in that class/module hierarchy.
Global variables
volatility - yes
atomicity - yes
serializability - yes
scope - ruby runtime
Global variables are volatile for user convenience, making always up-to-date globally shared flags possible; their performance is not critical.
Thread-local global variables (such as $!, $&, etc) are a special case. Volatility and serializability are not required since they cannot be read outside of their thread and cannot be defined by Ruby code.
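Under this model, a plain global can therefore serve as a shared flag without extra synchronization, because every read of it is volatile and eventually sees the write. A minimal sketch (the flag name is illustrative):

```ruby
# Sketch: a global variable used as an always up-to-date shared flag.
$done = false

worker = Thread.new do
  Thread.pass until $done   # volatile reads: guaranteed to see the write
  :finished
end

$done = true                # volatile write, visible to the worker
result = worker.value       # join the worker and take its result
```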
Constant variables
volatility - yes
atomicity - yes
serializability - yes
scope - a module
A Module or a Class definition is actually a constant definition. The definition is atomic: it assigns the Module or the Class to the constant, then its methods are defined atomically one by one.
It’s desirable that once a constant is defined, it and its value are immediately visible to all threads; therefore constants are volatile.
Ruby implementations may take advantage of the constancy of these variables to avoid doing volatile reads on each constant read. MRI can check a version number, JRuby can use SwitchPoint, and JRuby+Truffle can use Assumptions; the latter two allow the values to be treated as real constants during compilation.
Thread local variables / Fiber local variables
volatility - no
atomicity - yes
serializability - yes
scope - a thread / a fiber
Thread / Fiber local variables are mostly read and written by the same thread / fiber; therefore they are not volatile, so as not to reduce performance.
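The thread/fiber scope split is visible in the real core API: Thread#[] and Thread#[]= are fiber-local, while Thread#thread_variable_get/set are thread-local. A new Fiber on the same thread sees the thread-local value but not the fiber-local one:

```ruby
# Sketch: fiber-local (Thread#[]) vs. thread-local (thread_variable_*).
Thread.current[:a] = 1                       # fiber-local storage
Thread.current.thread_variable_set(:b, 2)    # thread-local storage

seen = Fiber.new do
  # A fresh Fiber on the same Thread: the fiber-local :a is not visible,
  # while the thread-local :b still is.
  [Thread.current[:a], Thread.current.thread_variable_get(:b)]
end.resume
```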
Method table
volatility - almost yes
atomicity - yes
serializability - yes
scope - a class
Methods are also a kind of storage, where the operations de facto are: read -> method lookup, write -> method redefinition, define -> method definition, undefine -> method removal. Operations over method tables have to be visible as soon as possible, otherwise Threads could execute different versions of methods, leading to unpredictable behaviour; therefore they are marked volatile. When a method is updated while it is being executed by a thread, the thread will finish the method body and will use the updated method obtained on the next method lookup.
Module operations like include, prepend and extend affect how methods are looked up. Their effect also has to be immediate and atomic so that any method lookup sees their changes.
The formulation “visible as soon as possible” was chosen to allow certain exceptions. For example, if an optimizing implementation of Ruby inlines a method i several times into a short loopless method m, and i is updated during execution of m, it’s much faster to let the method m finish and then ensure that the next time m is called it will use the updated method i. On the JVM, this means letting the compiled methods reach the nearest safepoint. It’s up to the Ruby implementation to make sure the delay is very short and to ensure that the method update will always be picked up.
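The "updated method obtained on the next lookup" behavior can be observed directly; in this sketch the class and method names are illustrative:

```ruby
# Sketch: a method redefinition is picked up on the next method lookup.
class Greeter
  def greet; :old; end
end

g = Greeter.new
before = g.greet               # first lookup finds the original definition

Greeter.class_eval do
  def greet; :new; end         # write to the method table (redefinition)
end

after = g.greet                # the next lookup sees the updated method
```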
Dangers
||=, +=, etc. are actually two operations, a read and a write, which implies that they are not atomic. See volatile variables with compare-and-set to get around the issue.
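Because += desugars to a read followed by a write, a concurrent increment can lose updates when another thread writes between the two halves. One standard remedy (a sketch, names mine) is to put the read-and-write pair inside one Mutex critical section:

```ruby
# Sketch: making a read-then-write pair effectively atomic with a Mutex.
counter = 0
lock = Mutex.new

threads = 4.times.map do
  Thread.new do
    1000.times do
      # counter += 1 alone is a read plus a write; under the lock the
      # pair executes as a unit, so no increment can be lost
      lock.synchronize { counter += 1 }
    end
  end
end
threads.each(&:join)

counter  # all 4 * 1000 increments are preserved
```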
Method invocation does not have any special properties. Neither does the initialize method, which implies that instance variables assigned in initialize are not guaranteed to be visible to other threads when the object is shared.
Current implementation differences from the model
MRI: currently has a stronger and undocumented model than the described one.
JRuby: Thread and Fiber local variables are volatile.
Class variables: MRI: AVS, RBX:???, JRuby:AV- (#3357), JRuby+Truffle: A--.
Local variables: MRI: AVS, RBX:A-?, JRuby:A-?, JRuby+Truffle: A-?.
TODO: update with specific versions of the implementations (MRI (1.9, 2.0, 2.1, 2.2), JRuby (1.7, 9.0), Rubinius (last version and ??), JRuby+Truffle).
Threads
Threads have the same guarantees as in Java. Thread.new happens-before the execution of the new thread’s block. All operations done by the thread happen-before the thread is joined. In other words, when a thread is started it sees all changes made by its creator, and when a thread is joined, the joining thread will see all changes made by the joined thread.
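Both guarantees can be sketched in a few lines (variable names are illustrative): the child thread observes a write made before Thread.new, and the parent observes the child's writes after join returns.

```ruby
# Sketch: Thread.new / Thread#join visibility guarantees.
before_start  = :from_creator
seen_by_child = nil
child_result  = nil

t = Thread.new do
  seen_by_child = before_start   # creator's writes happen-before the block
  child_result  = :from_child    # will be visible to the joiner
end
t.join                           # child's writes happen-before join returns

[seen_by_child, child_result]
```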
Source loading
Requiring
A given file can be loaded only once. If a file is being loaded by two threads at the same time, only one succeeds and the second require blocks until the first one is done (to ensure visibility of everything defined in that file before continuing after the require). However, modules, classes and methods are defined during the require operation and are visible as soon as they are executed (not all at once at the end of the require). Different files may be required concurrently. Requiring a file makes all operations done in the file visible to the requiring thread; this applies also if the file was already loaded. All events done by require ‘a_file’ happen-before the require method returns for that particular file.
Autoload
Only one autoload is executed at a time for a given constant; others are blocked until the first triggered autoload is completed. However, modules, classes and methods are defined during the autoload operation and are visible as soon as they are executed (not all at once at the end of the autoload). Different constants may be autoloaded concurrently. Autoloading a constant makes all operations done during its loading visible to the loading thread; this applies also if the constant was previously loaded. All events done by autoloading happen-before the autoloaded constant returns its value.
Danger
Beware of requiring and autoloading in concurrent programs; it's possible to see partially defined classes. Eager loading, or blocking until classes are fully loaded, should be used to mitigate this.
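The eager-loading mitigation amounts to requiring everything before any worker thread starts, so the require's happens-before edge covers all thread creations. A sketch (the library choices here are arbitrary examples):

```ruby
# Sketch: eager-load libraries before spawning threads, so no thread can
# observe a partially defined class mid-require.
%w[json set].each { |lib| require lib }   # all requires finish first

workers = 2.times.map do
  Thread.new do
    # Set is fully defined here: require returned before Thread.new,
    # and Thread.new happens-before the block
    Set.new([1]).to_a
  end
end
results = workers.map(&:value)
```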
Core classes
Mutex, Monitor, Queue have to work correctly on each implementation.
A thread leaving a critical section of Mutex#synchronize happens-before another thread entering a critical section on the same Mutex instance. The same applies to Monitors. A push of a message m to a Queue q happens-before the message m is popped from the Queue q.
Ruby implementation VMs should not crash when, for example, an Array or Hash is used in a parallel environment, but they may lose updates or raise Exceptions. (If Array or Hash were synchronized, there would be too much overhead when used in a single thread.)
concurrent-ruby contains synchronized versions of Array and Hash and other thread-safe data structures.
TODO: This section needs more work: e.g. Thread#raise and similar methods are dangerous and an open issue; better not to use them.
Extensions of Ruby memory model in concurrent-ruby
Extensions were created in the concurrent-ruby gem based on the Ruby memory model described here. They make it possible to build faster concurrent abstractions using volatile variables and other tools; you can learn more in this document.
Similar extensions are also proposed for Ruby in this document.
Ruby standard libraries
The standard libraries were written for MRI, so unless they are rewritten in a particular Ruby implementation they may contain hidden problems. Therefore it's better to assume that they are not safe on other implementations.