@natw natw/tef.md
Created Oct 16, 2015

VikingofRock posted:

Could someone explain that tef post to someone who doesn't know any ruby whatsoever?

yes, but if you're asking me then i'll have to explain to you how ruby works.

well, explain my current understanding of how ruby works. i'm never really sure if i've reached the bottom of the rabbit hole.

let's open with one description of ruby http://blade.nagaokaut.ac.jp/cgi-bi...179642?matzlisp

Matz posted:

Ruby is a language designed in the following steps:

  • take a simple lisp language (like one prior to CL).
  • remove macros, s-expression.
  • add simple object system (much simpler than CLOS).
  • add blocks, inspired by higher order functions.
  • add methods found in Smalltalk.
  • add functionality found in Perl (in OO way).

So, Ruby was a Lisp originally, in theory. Let's call it MatzLisp from now on. ;-)

so ruby is a lisp, an object system, methods, blocks, and perl syntax. let's skip the lisp and start with the objects:

irb(main):012:0> Object
=> Object

Modules and Classes are two types of Object in ruby, both used to build other objects.

irb(main):001:0> module A
irb(main):002:1> end
=> nil
irb(main):008:0> class B
irb(main):009:1> end

Modules and Classes both can contain methods, but classes can be instantiated to produce instances.

irb(main):010:0> A.class
=> Module
irb(main):011:0> B.class
=> Class
irb(main):005:0> Module.class
=> Class
irb(main):006:0> Class.class
=> Class
irb(main):007:0> Object.class
=> Class

We see that A is an instance of Module, and B is an instance of Class. We can also see that Module, Class, and Object are all instances of the Class object.
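a small sketch of the same relationships outside irb. Alpha and Beta are placeholder names, not anything from the post:

```ruby
# Alpha and Beta are made-up names used only for illustration.
module Alpha; end
class Beta; end

p Alpha.class              # Module
p Beta.class               # Class
p Class.superclass         # Module (every class is also a module)
p Beta.new.class           # Beta (classes can be instantiated)
p Alpha.respond_to?(:new)  # false (modules cannot)
```
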

irb(main):003:0> A
=> A

What is A? A is a module. Well, A is actually a Constant. A special variable that is bound to a value, which in this instance is an object, which is an instance of Module. Got that? Constants are variables which are looked up in a special way:

When we ask what is A, we look at the current module/class scope and check for a constant named A, and proceed up.

irb(main):025:0> module A
irb(main):026:1> module B
irb(main):027:2> end
irb(main):028:1> end

irb(main):029:0> A::B
=> A::B

we can use :: to look up the constant B inside the object pointed to by the constant A. What happens when we do A is that we're doing nil::A implicitly.

irb(main):031:0> A
=> A
irb(main):032:0> nil::A
=> A

when you write A::B::C in ruby, it is parsed as (A::B)::C, so if A::B resolves to nil, the entire expression is just nil::C. nil just means "the current module scope".

irb(main):044:0> module A
irb(main):045:1> puts B
irb(main):046:1> end
A::B
=> nil
irb(main):048:0> module A
irb(main):049:1> puts nil::B
irb(main):050:1> end
A::B
=> nil

the current module scope is lexical

irb(main):004:0> module A
irb(main):005:1> Z = 2
irb(main):006:1> module B
irb(main):007:2> Z = 3
irb(main):008:2> puts Z
irb(main):009:2> end
irb(main):010:1> puts Z
irb(main):011:1> end
3
2
=> nil
irb(main):012:0> puts Z
1
=> nil

but remember, if you do module A::B, it is not the same as doing module A; module B. in ruby, the former adds "A::B" to the search scope, and the latter adds "A", "A::B" to the scope. it isn't syntactic sugar, it's a whole different mechanism.
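one way to see the difference is Module.nesting, which reports the lexical scopes in effect. a rough sketch, with Outer, Inner and LIMIT as made-up names:

```ruby
# Outer, Inner and LIMIT are hypothetical names for illustration.
module Outer
  LIMIT = 10

  module Inner
    NESTED_SCOPE = Module.nesting   # both Outer::Inner and Outer are searched
    NESTED_LIMIT = LIMIT            # found through the enclosing Outer scope
  end
end

module Outer::Inner
  COMPACT_SCOPE = Module.nesting    # only Outer::Inner is searched
  COMPACT_LIMIT = defined?(LIMIT) ? LIMIT : :not_found
end

p Outer::Inner::NESTED_SCOPE    # [Outer::Inner, Outer]
p Outer::Inner::COMPACT_SCOPE   # [Outer::Inner]
p Outer::Inner::NESTED_LIMIT    # 10
p Outer::Inner::COMPACT_LIMIT   # :not_found
```
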

here we have A::B::Z, A::Z and Z. but where do A::B::Z and A::Z and Z live? Object of course!

irb(main):038:0> Object::A
=> A
irb(main):039:0> Object::A::B
=> A::B

indeed, all of the built in classes in ruby are listed as constants under object

irb(main):040:0> Object::String
=> String
irb(main):041:0> Object::Kernel
=> Kernel
irb(main):042:0> Object::Class
=> Class
irb(main):043:0> Object::Module
=> Module

when you do X = 1 in ruby, you're actually setting Object::X to 1. when you define module Router, you're defining module Object::Router. every class and module, and method definition at the top level is in some way monkeypatching Object.
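a quick sketch of that claim, assuming the file is run as a plain script. TOP_LEVEL_ANSWER and top_level_helper are made-up names:

```ruby
# TOP_LEVEL_ANSWER and top_level_helper are hypothetical names.
TOP_LEVEL_ANSWER = 42

module Router; end

def top_level_helper
  :hi
end

p Object.const_get(:TOP_LEVEL_ANSWER)   # 42
p Object.constants.include?(:Router)    # true
p Object::Router.equal?(Router)         # true
# a top-level def lands on Object (as a private method when run as a script)
p Object.private_method_defined?(:top_level_helper)
```
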

but enough about objects, let's look at methods. then we'll get onto blocks, and the syntax.

to do anything in ruby, you send a message to an object, or call a method

irb(main):003:0> [1,2,3].send(:max)
=> 3
irb(main):004:0> [1,2,3].max
=> 3

these messages live in a different namespace to constants. you can have a class with both a method Y and a constant Y.

irb(main):006:0> class X
irb(main):007:1> Y = 1
irb(main):008:1> end

irb(main):010:0> def X.Y # define a class method Y
irb(main):011:1> 2
irb(main):012:1> end
=> :Y
irb(main):013:0> X::Y # look up the constant
=> 1
irb(main):014:0> X.Y # look up the method 'Y'
=> 2

but, confusingly, :: can also be used to call methods

irb(main):026:0> "123".size
=> 3
irb(main):027:0> "123"::size
=> 3

the :: operator takes the right hand argument as a literal token, and if the first letter is uppercase, it's a constant, if the first letter is lowercase, it's a method lookup.

constants must start with an uppercase letter. method names on the other hand don't care, because they are symbols.

irb(main):040:0> "123".send(:size)
=> 3

what's that :size? it's a symbol, another type of object.

irb(main):041:0> :size
=> :size
irb(main):042:0> :size.class
=> Symbol

if constants are special variables that start with a capital letter, symbols are special values. they're kinda like immutable strings, and used inside objects to define which methods to invoke. method names are even stricter than constants.

irb(main):054:0> :"foo!o"
=> :"foo!o"
irb(main):055:0> def foo!o 
irb(main):056:1> end
=> :foo!
irb(main):057:0> foo! 1
=> nil

here, the method name is parsed as foo! and the parameter o. method names cannot have spaces, can have underscores, and can end in ? or !.

well, almost. it turns out you can define a method with any name you like, even one containing a null byte. you just can't use anything other than send to invoke it:

irb(main):049:0> define_method(:"foo\0bar") do |x| x*2 end
=> :"foo\x00bar"
irb(main):050:0> send(:"foo\0bar", 1)
=> 2
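the same trick inside a class, as a sketch. Oddball and the method name are invented for the example:

```ruby
# Oddball is a hypothetical class; the method name contains a space,
# so it can only be reached through send/public_send.
class Oddball
  define_method(:"two words") { |x| x * 2 }
end

odd = Oddball.new
p Oddball.instance_methods(false)     # [:"two words"]
p odd.respond_to?(:"two words")       # true
p odd.public_send(:"two words", 21)   # 42
```
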

anyway, i'm getting ahead of myself. when you define a method on an object, you can define it on one of two namespaces, the class namespace, and the instance namespace.

irb(main):001:0> class Foo

irb(main):002:1> def self.cls_method
irb(main):003:2> "class"
irb(main):004:2> end

irb(main):005:1> def inst_method
irb(main):006:2> "instance"
irb(main):007:2> end

irb(main):008:1> end

irb(main):009:0> Foo.cls_method
=> "class"
irb(main):010:0> Foo.new.inst_method
=> "instance"

they are in totally different namespaces: you can't call cls_method on a Foo instance.

irb(main):011:0> Foo.new.cls_method
NoMethodError: undefined method `cls_method' for #<Foo:0x007f9022891958>

[if you're keeping count, an object has three namespaces so far: class methods, instance methods, constants]

and as you'd expect for class methods, they're inherited.

irb(main):012:0> class Bar < Foo
irb(main):013:1> end
=> nil
irb(main):014:0> Bar.cls_method
=> "class"
irb(main):015:0> Bar.new.inst_method
=> "instance"

methods are public by default. this means any other object can call that method. protected methods cannot be invoked outside of subclasses, and private methods cannot be invoked outside of the instance. so a private class method cannot be called by an instance method, and vice versa.
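a minimal sketch of the three visibility levels. Account and its methods are made up for the example:

```ruby
# Account is a hypothetical class used only to show visibility rules.
class Account
  def initialize(balance)
    @balance = balance
  end

  # public: may call a protected method on another Account
  def richer_than?(other)
    balance > other.balance
  end

  # public: may call a private method on itself
  def report
    "balance: #{formatted}"
  end

  protected

  def balance
    @balance
  end

  private

  def formatted
    format("%.2f", @balance)
  end
end

a = Account.new(5)
p a.richer_than?(Account.new(3))   # true
p a.report                         # "balance: 5.00"
p a.respond_to?(:balance)          # false (protected)
p a.respond_to?(:formatted)        # false (private)
p a.send(:formatted)               # "5.00" (send ignores visibility)
```
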

speaking of private methods: have you ever wondered how "puts" works in ruby?

irb(main):001:0> puts "butts"
butts
=> nil
irb(main):002:0> "butts".puts
NoMethodError: private method `puts' called for "butts":String

puts is a private method, so we can't call it normally. we can cheat and use send:

irb(main):003:0> "butts".send(:puts)

=> nil

ok, what's happening here. we sent puts to butts but nothing happened

irb(main):004:0> Object.send(:puts, "butts")
butts
=> nil

puts is a private method on object (strictly, it's defined in Kernel, which Object mixes in). string inherits from object and thus has puts.

thus we can override it:

irb(main):010:0> class Test
irb(main):011:1> def inst
irb(main):012:2> puts "inst"
irb(main):013:2> end
irb(main):014:1> def self.cls
irb(main):015:2> puts "class"
irb(main):016:2> end
irb(main):017:1> end

irb(main):018:0> Test.cls
class
=> nil

irb(main):019:0> Test.new.inst
inst
=> nil

we can define a class method puts:

irb(main):026:0> class Test
irb(main):027:1> def self.puts x
irb(main):028:2> "ha ha"
irb(main):029:2> end
irb(main):030:1> end
=> :puts
irb(main):031:0> Test.cls
=> "ha ha"
irb(main):032:0> Test.new.inst
inst
=> nil

and we can define an instance method puts.

irb(main):033:0> class Test
irb(main):034:1> def puts x
irb(main):035:2> "lol"
irb(main):036:2> end
irb(main):037:1> end
=> :puts
irb(main):038:0> Test.cls
=> "ha ha"
irb(main):039:0> Test.new.inst
=> "lol"

when you puts "Hello world" in ruby, you're invoking either a private method on Object's class methods or a private method on Object's instance methods (or any subclass in between that defined puts). this is why Object has so many built in methods:

irb(main):002:0> Object.methods.size
=> 100

(this value changes between versions of ruby. don't expect it to match.)
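if you want to check where puts actually comes from, something like this should do it. it only uses core reflection methods:

```ruby
# puts is a private instance method of Kernel, and Object includes Kernel,
# so every object (strings included) carries it around privately.
p Kernel.private_instance_methods.include?(:puts)   # true
p Object.include?(Kernel)                           # true
p String.ancestors.include?(Kernel)                 # true
p "butts".respond_to?(:puts)                        # false (private)
p "butts".respond_to?(:puts, true)                  # true (including private)
```
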

modules also have two namespaces for class and instance methods. unlike class inheritance, when you include a module you get its instance methods + constants, and when you extend with a module its instance methods become class methods on whatever you extended. this leads to funny ways to define class methods:

irb(main):028:0> module Dave
irb(main):029:1> extend self
irb(main):030:1> def hi
irb(main):031:2> "hello"
irb(main):032:2> end
irb(main):033:1> end
=> :hi
irb(main):034:0> Dave.hi
=> "hello"

there's magic in ruby to make sure this works, even when you haven't defined the methods yet. as well as other tricks to define class methods that would make bjarne stroustrup jealous:

irb(main):040:0> class Helper
irb(main):041:1> end
=> nil
irb(main):042:0> class << Helper
irb(main):043:1> def help
irb(main):044:2> "no"
irb(main):045:2> end
irb(main):046:1> end
=> :help
irb(main):047:0> Helper.help
=> "no"

irb(main):048:0> def Helper.wat
irb(main):049:1> "mate"
irb(main):050:1> end
=> :wat
irb(main):051:0> Helper.wat
=> "mate"

thing is, you can define methods on instances like this, to define per-instance methods, rather than per-class methods.

irb(main):054:0> d = Dog.new
=> #<Dog:0x007fc8638c74f8>
irb(main):055:0> def d.wag
irb(main):056:1> "bow wow"
irb(main):057:1> end

irb(main):060:0> class << d
irb(main):061:1> def sit
irb(main):062:2> "good dog"
irb(main):063:2> end
irb(main):064:1> end
=> :sit
irb(main):065:0> d.sit
=> "good dog"
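the transcript never shows Dog being defined, so here is a self-contained sketch that assumes it is just an empty class:

```ruby
# assumption: Dog is an empty class (the post does not show its definition)
class Dog; end

d = Dog.new

def d.wag
  "bow wow"
end

p d.wag                                       # "bow wow"
p d.singleton_methods                         # [:wag]
p d.singleton_class.instance_methods(false)   # [:wag]
p Dog.new.respond_to?(:wag)                   # false (other dogs are unaffected)
```
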

when i said there were two namespaces on a class, it would be more accurate to say that there's two classes behind the scenes: the normal class, which contains instance methods, and a hidden class, which contains class methods

when you're defining class methods in ruby on a module or class, you're actually defining methods on a hidden eigenclass for the class or module. for subclasses, they have a hidden eigenclass, which inherits from the parent's eigenclass. for modules, include/extend copy from the class or hidden class.
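you can poke at the hidden class with singleton_class. a rough sketch, where Base, Child and Greeter are invented names:

```ruby
# Base, Child and Greeter are hypothetical names for illustration.
class Base
  def self.build
    :built
  end
end

class Child < Base; end

module Greeter
  def hi
    "hello"
  end
end

Child.extend(Greeter)

# class methods are instance methods of the hidden class
p Base.singleton_class.instance_methods(false)                    # [:build]
# the subclass's hidden class inherits from the parent's hidden class
p Child.singleton_class.superclass.equal?(Base.singleton_class)   # true
p Child.build                                                     # :built
# extend mixes the module into the hidden class
p Child.singleton_class.include?(Greeter)                         # true
p Child.hi                                                        # "hello"
p Child.new.respond_to?(:hi)                                      # false
```
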

the hidden class is often called an eigenclass, and class methods are often called singleton methods. remembering that constant scope is lexical, you can sorta set one-off constants inside instances

class Timer
  TIMEOUT = 100

  def self.timeout
    TIMEOUT
  end

  def timeout
    TIMEOUT
  end
end

t = Timer.new
puts "Timer.timeout = #{Timer.timeout}" # 100
puts "t.timeout = #{t.timeout}" # 100

class << t
  TIMEOUT = 200 # won't change timeout because the lookup inside Timer is lexical.
end

puts "Timer.timeout = #{Timer.timeout}" # 100
puts "t.timeout = #{t.timeout}" # 100

class << t
  def timeout
    TIMEOUT # now it picks up the new value.
  end
end

puts "Timer.timeout = #{Timer.timeout}" # 100
puts "t.timeout = #{t.timeout}" # 200

puts "Timer::TIMEOUT = #{Timer::TIMEOUT}" # still 100: the 200 lives on t's hidden class

but i guess it's easier to think of objects having constants, instance methods, and class methods, over class or module objects having two classes behind the scenes both with modules and methods.

still, you should notice i'm not redefining TIMEOUT, but defining a new TIMEOUT inside a hidden class on an instance.

(fwiw you can reassign constants, and you can define constants in modules or classes, but not method definitions (but you can define constants in blocks if the blocks are defined within a module or class))

i guess it's time to complain about blocks.

blocks.

ugh

if we go back to objects and sending a message, we can think of the class or instance methods as being stored in a lookup table. if the key is a symbol, then the value in the table is a block.

a block in ruby is a piece of code captured inside do ... end or {|x| .... } blocks, and can take parameters. they're lexically scoped, and refer to the environment in which they live.

irb(main):013:0> y = 1
=> 1
irb(main):014:0> x = the_block do y = 2 end
=> #<Proc:0x007fa78c865660@(irb):14>
irb(main):015:0> x.call
=> 2
irb(main):016:0> y
=> 2
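the_block is never defined in the post. a plausible definition, and this is only an assumption, is a method that hands its block back as a Proc:

```ruby
# assumption: the_block simply returns the block it was given as a Proc
def the_block(&blk)
  blk
end

# closure_demo is a hypothetical helper: the block sees and mutates
# the local variable of the method it was written in
def closure_demo
  count = 0
  bump = the_block { count += 1 }
  bump.call
  bump.call
  count
end

p closure_demo                    # 2
p(the_block { 1 }.class)          # Proc
p(the_block { 1 }.lambda?)        # false
```
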

as i mentioned earlier, it's just a lump of code that refers to its enclosing scope, so what you do in the scope, you can do in the block

irb(main):017:0> x = the_block do CONST = 2 end
=> #<Proc:0x007fa78c84c458@(irb):17>
irb(main):018:0> x.call
=> 2

since I can assign Constants at the top level, I can do it inside the block too. this is often referred to as the "Tennent's Correspondence Principle", or as i call it, the "blocks are not functions" principle.

irb(main):019:0> def foo
irb(main):020:1> x = the_block do return "ha ha" end
irb(main):021:1> x.call 
irb(main):022:1> return "nice"
irb(main):023:1> end
=> :foo
irb(main):024:0> foo
=> "ha ha"

When you call return inside a block, it returns from the method the block was written in (here foo), not just from x.call.

in ruby, returns in blocks work like exceptions in that they can be non-local exits.

but when i say block, i really mean proc. a block is not a value in ruby, you cannot assign a block to a value, you have to lift the block into a Proc.

calling the_block is the same as calling Proc.new.

irb(main):034:0> def test
irb(main):035:1> p = Proc.new do return "butts" end
irb(main):036:1> call_a_proc p
irb(main):037:1> end

irb(main):038:0> test
=> "butts"

which is why when you call a proc with a return, from the top level scope, well, heh.

irb(main):031:0> p = Proc.new { return "ha" }
=> #<Proc:0x007fa78c038730@(irb):31>
irb(main):032:0> call_a_proc p
LocalJumpError: unexpected return

but you can make return work like you expect it to by wrapping blocks inside lambda.

irb(main):050:0> p = lambda do return 1 end
=> #<Proc:0x007fa78c884a38@(irb):50 (lambda)>
irb(main):051:0> call_a_proc p
=> "done"

you can use next in blocks to emulate return in lambdas. i think
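putting the three side by side. call_a_proc isn't shown in the post, so the version here (call the thing, then return "done") is a guess that matches the transcripts; the other method names are made up:

```ruby
# assumption: call_a_proc calls its argument and then returns "done"
def call_a_proc(callable)
  callable.call
  "done"
end

def proc_return
  pr = Proc.new { return "from proc" }
  call_a_proc(pr)   # the return exits proc_return itself
  "unreachable"
end

def lambda_return
  l = lambda { return "from lambda" }
  call_a_proc(l)    # the return only exits the lambda
end

def next_in_proc
  pr = Proc.new { next "from next" }
  call_a_proc(pr)   # next only exits the proc, like return in a lambda
end

p proc_return     # "from proc"
p lambda_return   # "done"
p next_in_proc    # "done"
```
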

on top of blocks, procs, and lambdas, you also have method objects

irb(main):054:0> "123".method(:size).call
=> 3

and unbound method objects, which you need to bind to an object to call. this works because blocks have no real notion of self:

irb(main):052:0> p = Proc.new do self end
=> #<Proc:0x007fa78c8649e0@(irb):52>
irb(main):053:0> p.call
=> main

irb(main):059:0> p = lambda do self end
=> #<Proc:0x007fa78c05b4d8@(irb):59 (lambda)>
irb(main):060:0> p.call
=> main

self is dynamically scoped and overridden at runtime for lambdas, and blocks, and procs

irb(main):014:0> Foo.new.foo
=> #<Foo:0x007fc49286d0b8>
irb(main):015:0> Foo.new.method(:foo).call
=> #<Foo:0x007fc492864008>

but when you capture a method, you bind self. you can unbind it, but you must rebind the method to an instance of the class it came from. var that = this has nothing on this mess.
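a sketch of unbinding and rebinding, plus overriding self for a proc with instance_exec. Speaker, WHO_AM_I and rebind_demo are invented names:

```ruby
# Speaker, WHO_AM_I and rebind_demo are hypothetical names.
class Speaker
  def initialize(name)
    @name = name
  end

  def greet
    "hi from #{@name}"
  end
end

# take a bound method from one Speaker and rebind it to another
def rebind_demo
  bound = Speaker.new("a").method(:greet)
  unbound = bound.unbind
  unbound.bind(Speaker.new("b")).call
end

WHO_AM_I = proc { self }

p rebind_demo   # "hi from b"

# a proc's self can be swapped out at call time
s = Speaker.new("c")
p s.instance_exec(&WHO_AM_I).equal?(s)   # true

# rebinding to an unrelated object is refused
begin
  Speaker.instance_method(:greet).bind("a string")
rescue TypeError => e
  p e.class   # TypeError
end
```
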

on the plus side, whenever you do foo.call it might return from your method, it might throw a non local return error, and maybe it will know what self is.

the whole point of having blocks is so you can do non local returns but in practice no-one uses them much.

i guess i'm on the final stretch of complaining now, so i guess it's time for the features from perl.

the syntax.

on the whole, perl gets a bad rap for many things but it does not take so many words to explain how the object system works. "You attach a package namespace to a value and when you call a method, it calls the function in the package scope". perl also had modules. ruby has require. even javascript has module loading. ruby has string concatenation.

in the year of our lord 2015 i am still writing a program and building it by pretending it's one gigantic source file.

the syntax of ruby is a whole different beast. whitespace matters deeply in ruby when using the special "method_name args,with,no,parentheses" syntax.

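puts ([1]).map {|x| x+1} # prints 2: the block goes to map, and puts gets the result
puts([1]).map {|x| x+1} # prints 1, then calls map on the nil that puts returned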
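the same thing with a method that returns a value instead of printing, so the difference is easier to see. show, with_space and without_space are made-up helpers:

```ruby
# show, with_space and without_space are hypothetical helpers.
def show(x)
  x.inspect   # returns a String
end

def with_space
  show ([1]).map { |x| x + 1 }   # parsed as show(([1]).map { ... })
end

def without_space
  show([1]).map { |x| x + 1 }    # parsed as (show([1])).map { ... }
end

p with_space   # "[2]"

begin
  without_space
rescue NoMethodError => e
  p e.class    # NoMethodError: the String from show has no map
end
```
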

the other great feature of no-parenthesis method calls is that you can make things that look like built in operators:

foo "fooo" do

end

unfortunately, it's a little easy to forget the do block, or accidentally put a do block atop a built in construct, and then get a missing 'end' error. or my favourite, that elif is not a syntax error but a runtime one. even perl wasn't this clumsy.
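to make the elif point concrete, here's a sketch (classify is a made-up method). it parses fine, because elif n < 0 is read as a call to a method named elif inside the if branch:

```ruby
# classify is a hypothetical method; note the typo elif instead of elsif.
def classify(n)
  if n > 0
    :positive
  elif n < 0    # parsed as the method call elif(n < 0), part of the if body
    :negative
  end
end

p classify(-1)   # nil: the if body never runs, so nothing complains

begin
  classify(1)
rescue NoMethodError => e
  p e.class      # NoMethodError: only blows up when the branch is taken
end
```
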

so now you've got the basics of how ruby works, you can understand the code i posted, hope this helps.

  1. ruby is bad
  2. bad bad bad
  3. the end

oh wait, i forgot to mention everything returns the most unhelpful value by default.

irb(main):020:0> [1,2,3].sample(4)
=> [1, 3, 2]
irb(main):021:0> [1][200]
=> nil
irb(main):022:0> {}[:butts]
=> nil
irb(main):023:0> "123 aa".to_i
=> 123
irb(main):024:0> "aa 123 aa".to_i
=> 0
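for contrast, the core library does have stricter variants that raise instead of quietly handing back nil or 0. outcome is a made-up helper that just reports what happened:

```ruby
# outcome is a hypothetical helper: it returns the block's value,
# or the class of the error the block raised.
def outcome
  yield
rescue IndexError, ArgumentError => e
  e.class
end

p(outcome { [1].fetch(200) })            # IndexError
p(outcome { Hash.new.fetch(:butts) })    # KeyError (a kind of IndexError)
p(outcome { Integer("123 aa") })         # ArgumentError
p(outcome { Integer("123") })            # 123
p(outcome { [1].fetch(200, :none) })     # :none (explicit default)
```
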

oh and on namespaces, i guess i lost count, but there are also instance variables, which exist on the class and the eigenclass, or if you will, the instance and the class. and class variables too, which unlike instance variables aren't nil by default, and are in some ways even harder to explain than eigenclasses.
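a small sketch of the difference, with Tally as an invented class. the same @count name refers to two different variables depending on whether self is the class or an instance:

```ruby
# Tally is a hypothetical class used only for illustration.
class Tally
  @count = :class_level_ivar   # instance variable of the Tally class object
  @@total = 0                  # class variable

  def initialize
    @count = :instance_ivar    # instance variable of this Tally instance
    @@total += 1
  end

  def self.count
    @count
  end

  def count
    @count
  end

  def missing_ivar
    @nope    # unset instance variables read as nil
  end

  def missing_cvar
    @@nope   # unset class variables raise NameError
  end
end

t = Tally.new
p Tally.count      # :class_level_ivar
p t.count          # :instance_ivar
p t.missing_ivar   # nil

begin
  t.missing_cvar
rescue NameError => e
  p e.class        # NameError
end
```
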

see also:

http://thoughts.codegram.com/unders...iables-in-ruby/
http://madebydna.com/all/code/2011/...emystified.html
