@abdellatif98-a
Forked from dwayne/0-meta.md
Created August 12, 2021 10:47
Ruby Under a Microscope: Notes (sr bookclub)

Ruby Under a Microscope

Introduction

To understand how Ruby works, read its C source code. After learning each part of Ruby's internal implementation, we perform an experiment and use Ruby to test itself.

Most of the book discusses how MRI works.

MRI (Matz's Ruby Interpreter) was created in 1993 by Yukihiro Matsumoto, a.k.a. Matz.

Alternative implementations:

  • RubyMotion - Write cross-platform apps for iOS, Android and OS X in Ruby
  • MacRuby - An implementation of Ruby 1.9 directly on top of Mac OS X core technologies such as the Objective-C runtime and garbage collector, the LLVM compiler infrastructure and the Foundation and ICU frameworks
  • IronRuby - An open-source implementation of the Ruby programming language which is tightly integrated with the .NET Framework
  • Topaz - A high performance implementation of the Ruby programming language, written in Python on top of RPython
  • JRuby - A high performance, stable, fully threaded Java implementation of the Ruby programming language
  • Rubinius - An implementation of Ruby designed for concurrency using native threads to run Ruby code on all the CPU cores
  • mruby - A lightweight implementation of the Ruby language complying with part of the ISO standard

The JRuby and Rubinius implementations are explored in detail in Chapters 10, 11 and 12.

Tokenization and Parsing

Ruby reads and transforms your code three times before running it: code -> tokenize -> tokens -> parse -> AST nodes -> compile -> YARV instructions.

Ruby's virtual machine is called "Yet Another Ruby Virtual Machine" (YARV).

Tokens

The tokenization process transforms the source code (a stream of characters) into the words that make up the language.

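Ripper, the standard-library tool for inspecting the output of Ruby's tokenizer and parser, has no idea whether the code you give it is valid Ruby when it only tokenizes (Ripper.lex). If you pass in code that contains a syntax error, Ripper will just tokenize it as usual and not complain. It's the parser's job to check syntax.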
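A minimal sketch of the point above, using Ripper.lex from the standard library. The helper name token_pairs is made up for this note; the two positions read from each token (event type and text) are the ones Ripper.lex returns after the line/column pair.

```ruby
require "ripper"

# Hypothetical helper: reduce each Ripper token to a [type, text] pair.
# Each element returned by Ripper.lex starts with [[line, column], type, text].
def token_pairs(source)
  Ripper.lex(source).map { |token| [token[1], token[2]] }
end

valid_code = "10.times do |n|\n  puts n\nend\n"
token_pairs(valid_code).each do |type, text|
  puts "#{type.to_s.ljust(14)} #{text.inspect}"
end

# The closing `end` is missing, so this is not valid Ruby.
# The tokenizer does not care and still returns tokens without raising.
broken_code = "10.times do |n|\n  puts n\n"
puts "tokens from broken code: #{token_pairs(broken_code).length}"
```

The second call returns a token list even though the block is never closed, which is the behaviour the note describes.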

Parsing

Words/tokens are grouped into sentences or phrases that make sense to Ruby.

Ruby uses a parser generator. Parser generators take a series of grammar rules as input that describe the expected order and patterns in which the tokens will appear. Ruby uses a newer version of Yacc (Yet Another Compiler Compiler) called Bison (a LALR parser generator).

LALR = Look-Ahead LR, where LR means the parser reads its input Left-to-right and produces a Rightmost derivation in reverse.

The grammar rule file is parse.y. It contains the language definition.

Ruby runs Bison at build time to create the actual parser code: grammar rules (parse.y) -> generate parser (Bison) -> parser code (parse.c).

Note: The tokenization and parsing processes are not separate passes. The parser calls the tokenizer whenever it needs a new token, so the two run interleaved.

Ruby's -y option displays internal debug information every time the parser jumps from one state to another. It's useful for getting a sense of the complexity of Ruby's state table.

Display debug information about your code's AST using the --dump parsetree option.

$ ruby --dump parsetree your_script.rb
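The parser's output can also be inspected from inside Ruby. The sketch below uses Ripper.sexp, which returns a nested-array view of the parsed code. This is Ripper's own representation and not the node format that --dump parsetree prints; the helper name syntax_tree is made up for this note.

```ruby
require "ripper"
require "pp"

# Hypothetical helper: parse a string and return Ripper's S-expression for it.
# Ripper.sexp returns nil when the parser rejects the code.
def syntax_tree(source)
  Ripper.sexp(source)
end

# Valid code: prints a nested array whose root node is :program.
pp syntax_tree("2 + 2")

# Incomplete expression: the parser, not the tokenizer, rejects it.
p syntax_tree("2 +")
```

Unlike tokenizing, parsing is where a syntax error is detected, so the second call yields no tree.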

Programming Language Tools for Ruby

Compilation

Ruby 1.9+ compiles your code. You don't use Ruby's compiler directly; it runs automatically.

Note: Ruby 1.8 has no compiler. It executes your code immediately after the tokenizing and parsing processes are finished.

With Ruby 1.9, Koichi Sasada and the Ruby core team introduced Yet Another Ruby Virtual Machine (YARV), which actually executes your Ruby code.

When using YARV you first compile your code into bytecode, a series of low-level instructions that the virtual machine understands.
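Although the compiler runs automatically, MRI exposes it through RubyVM::InstructionSequence, so the bytecode can be inspected. A small sketch follows; the helper name yarv_disassembly is made up for this note, and the exact text of the listing varies between Ruby versions.

```ruby
# RubyVM::InstructionSequence is specific to MRI (CRuby);
# other implementations such as JRuby do not provide it.

# Hypothetical helper: compile a string and return the YARV listing as text.
def yarv_disassembly(source)
  RubyVM::InstructionSequence.compile(source).disasm
end

# The listing shows the operands being pushed (putobject), the addition
# (opt_plus), and putself supplying the implicit receiver for `puts`.
puts yarv_disassembly("puts 2 + 2")
```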

Differences between YARV and the JVM:

  • Ruby doesn't expose the compiler to you as a separate tool.
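  • Ruby never compiles your Ruby code all the way to machine language. (This holds for the Ruby 1.9 and 2.0 versions the book covers; later MRI releases added optional JIT compilers.)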

YARV is a stack-oriented virtual machine.

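It is not just a stack machine; it's a double-stack machine! It has to track arguments and return values not only for its own internal instructions but also for your Ruby program.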
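To make "stack-oriented" concrete, here is a toy evaluator. It is not YARV and models only a single value stack; the method name run_toy_stack_machine is made up, and the instruction names are borrowed from YARV's disassembly output purely for readability.

```ruby
# Toy illustration of a stack-oriented machine (NOT the real YARV).
# Each instruction is an array: [name] or [name, operand].
def run_toy_stack_machine(instructions)
  stack = []
  instructions.each do |name, operand|
    case name
    when :putobject
      # Push a literal value onto the stack.
      stack.push(operand)
    when :opt_plus
      # Pop two values, push their sum.
      right = stack.pop
      left = stack.pop
      stack.push(left + right)
    when :opt_mult
      # Pop two values, push their product.
      right = stack.pop
      left = stack.pop
      stack.push(left * right)
    else
      raise ArgumentError, "unknown instruction: #{name}"
    end
  end
  # The result of the whole program is whatever is left on top.
  stack.pop
end

# (2 + 3) * 4 expressed as stack instructions.
program = [[:putobject, 2], [:putobject, 3], [:opt_plus], [:putobject, 4], [:opt_mult]]
puts run_toy_stack_machine(program) # prints 20
```

Operands are pushed first and each operation consumes them from the top of the stack, which is why no instruction needs to name its inputs.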

Note: In Ruby all functions are actually methods. That is, functions are always associated with a Ruby class; there is always a receiver. Inside of Ruby, however, Ruby's parser and compiler distinguish between functions and methods: Method calls have an explicit receiver, while function calls assume the receiver is the current value of self.
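The parser-level distinction can be observed with Ripper, which reports a call without a receiver as an fcall node and a call with an explicit receiver as a call node. A sketch follows; the helper contains_node? and the names bar and obj are made up for this note.

```ruby
require "ripper"

# Hypothetical helper: recursively check whether a Ripper S-expression
# contains a node of the given type (node types are the first array element).
def contains_node?(sexp, type)
  return false unless sexp.is_a?(Array)

  sexp.first == type || sexp.any? { |child| contains_node?(child, type) }
end

function_call = Ripper.sexp("bar()")     # no explicit receiver: self is assumed
method_call   = Ripper.sexp("obj.bar()") # explicit receiver `obj`

puts "bar()     has :fcall node? #{contains_node?(function_call, :fcall)}"
puts "obj.bar() has :call node?  #{contains_node?(method_call, :call)}"
```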
