@Ovid
Last active September 12, 2021 08:02
Cor—A minimal object system for the Perl core

NAME

Cor — A minimal OO proposal for the Perl core

VERSION

This is version 0.10 of this document.

AUTHOR

Curtis "Ovid" Poe

CAVEAT!

Nothing in the following proposal is set in stone.

DESCRIPTION

It has been repeatedly proposed that we have OO in the Perl 5 core. I support this notion. However, there's been much disagreement over what that OO should look like. I propose a simple OO syntax that would nonetheless be modern, but still "feel like Perl 5." Here's a small taste (will be shown again later in the document):

class Cache::LRU {
    use Hash::Ordered;
    
    has cache    => ( default => method { Hash::Ordered->new } );
    has max_size => ( default => method { 20 } );

    method set ( $key, $value ) {
        if ( self->cache->exists($key) ) {
            self->cache->delete($key);
        }
        elsif ( self->cache->keys >= self->max_size ) {
            self->cache->shift;
        }
        self->cache->set( $key, $value );
    }

    method get ($key) { self->cache->get($key) }
}

To distinguish this OO system from the (too) many others, such as Moose, Moo, Dios, Class::InsideOut, Mu, Spiffy, Class::Simple, Rubyish::Class, Class::Easy, Class::Tiny, Class::Std, and so on, I'm going to call this one "Cor" (short for "Corinna", a possibly fictional woman that the poet Ovid would write poems to). Using the name "Cor" is only for disambiguation. I hope Cor would become core and thus not need a name.

This document should be considered a "rough draft". While I (Ovid) am the initial author, this document has been heavily updated via feedback from Sawyer X and Stevan Little (and a bit from Matt Trout and Peter Mottram). Also, many of the underlying ideas have been directly "liberated" from Stevan's work.

Also, note that the intent is that this will ultimately be implemented in perl, not Perl. Thus, it would be written in C and likely be much faster than current options.

ASSUMPTIONS

In creating this proposal, I assumed the following:

  • No Implementation

    This document describes a possible OO system. It does not contain information about implementation. Further, general OO "details" about how roles work, how inheritance works, and so on, are mostly omitted.

  • Feature Compatibility

    We should not take away anything core Perl 5 supports. Thus, multiple-inheritance must be supported. I considered dropping it and saying "no, you have to use single inheritance", but we have a host of popular modules, such as Catalyst and DBIx::Class, which use MI and could thus not be easily ported were the authors ever inclined to do so.

  • Simplicity

    I strove to make this proposed syntax as simple as possible. This makes implementation easier and should leave fewer grounds for objection.

  • Roles Must Be Included

    Most modern Perl 5 developers who use Moose/Moo use roles. Many of them use roles heavily. Thus, they will need to be supported.

  • Lexical Scope

    If the suggested changes can be limited to a given lexical scope, I suspect it will be easier to use the new classes with old code.

  • use v5.3X;

    This would be implemented as a feature and automatically be enabled if you write use v5.3X (or similar syntax). This would avoid having to jump through special hoops to use the new OO syntax. It would simply be there.

  • Safety

    Cor roles and classes assume strict and warnings by default. They also use subroutine signatures.

  • Hash References

    Assumes we use blessed hash references for the first pass. This may be revisited in the future.

  • Role Implementation

    Role implementation assumes Traits: The Formal Model (pdf) rather than the less formal Traits: Composable Units of Behavior (pdf) that is usually cited. The authors are the same, but the "Formal Model" is explicit about several assumptions made in the better-known paper.

TRIAL GRAMMAR

Below is a minimal and almost certainly incorrect grammar as a starting point for discussion.

(*
    cheating by allowing regexes and character classes
*)

Cor         ::= CLASS | ROLE
CLASS       ::=  DESCRIPTOR? 'class' NAMESPACE VERSION? DECLARATION BLOCK?
DESCRIPTOR  ::= 'abstract'
ROLE        ::= 'role' NAMESPACE VERSION? DECLARATION BLOCK?
NAMESPACE   ::= IDENTIFIER { '::' IDENTIFIER } VERSION?
DECLARATION ::= { PARENTS | ROLES } | { ROLES | PARENTS }
PARENTS     ::= 'isa' NAMESPACE  { ',' NAMESPACE }
ROLES       ::= 'does' NAMESPACE { ',' NAMESPACE }
IDENTIFIER  ::= [:alpha:] {[:alnum:]}
VERSION     ::= 'v' DIGIT '.' DIGIT {DIGIT}
DIGIT       ::= [0-9]
BLOCK       ::= # Work in progress. Described below

THE PROPOSAL

The bulk of this is to simply provide two things, classes and roles. The Cor syntax is deliberately simple and would be familiar to Perl 5/6 programmers, as well as programmers of other languages (single quotes imply exact text):

DESCRIPTOR? 'class' NAMESPACE VERSION? DECLARATION BLOCK?
'role' NAMESPACE VERSION? DECLARATION BLOCK?
  • DESCRIPTOR

    Optional. Currently, if present, must be the keyword abstract which indicates a class that cannot be instantiated and must be subclassed.

  • 'class' or 'role'.

    One of class or role, indicating the type of this code. Required.

  • NAMESPACE

    The name (package) of the class or role. Follows current naming rules. Required.

  • VERSION

    A v-string identifying the version of this class/role. our $VERSION = inside of the BLOCK is also still allowed. Optional.

  • DECLARATION

    This will be described later, but essentially allows us to declare what classes, if any, we inherit from, and what roles, if any, we consume.

  • BLOCK

    The block of code defining the body of the class or role.

Only the 'class'/'role' and NAMESPACE are required:

class Person;
    ...
role Comparable;
    ...

If the BLOCK is not supplied the changes are file-scoped. Otherwise, they are block-scoped:

class Person     { ... }
role  Comparable { ... }

If possible, any other syntax changes suggested by this proposal would only apply to the scope of the BLOCK or file and be an error outside of the block.

Classes

In Perl 5, classes and packages are the same thing. While this has some drawbacks, it's worked reasonably well and we'll stick with this.

Basic Syntax

Cor introduces a new, simplified syntax:

class Dog v0.1 {
    method speak () { return 'Woof!' }
}

my $dog = Dog->new;
say $dog->speak;  # prints 'Woof!'

Note there is no trailing semicolon required.

Alternatively, if no arguments are required, we can omit the parens with the method keyword:

class Dog v0.1 {
    method speak { return 'Woof!' }
}

Inheritance

Declaring inheritance is done via the isa keyword and takes a comma-separated list of class names (whitespace allowed). Some restrictions:

  • Cor

    You may only inherit from Cor classes as we cannot guarantee the behavior of non-Cor classes. This restriction may be removed in the future. However, for now we would prefer to maintain this restriction to avoid the possibility that Cor and non-Cor classes might need a different UNIVERSAL base class, thus altering their behavior.

  • C3

    C3 method resolution order is assumed.

      # cannot be instantiated
      abstract class Animal {
          # forward declarations are abstract methods
          # the method body must be defined by the time it's called
          method speak;
      }
    
      class Dog isa Animal {
          method speak () { return 'Woof!' }
      }
    

In the above, Dog inherits from Animal.

By using whitespace to separate the classname from the version, we can also specify versions we require:

abstract class Animal v1.9 {
    method speak;
}

class Dog isa Animal v2.0 {
    method speak { return 'Woof!' }
}

The above should work according to current Perl 5 semantics (principle of least surprise).

We can also do:

class Kill::Me::Now isa I, Despise, Multiple::Inheritance { ... }

In the above, the class Kill::Me::Now inherits from I, Despise, and Multiple::Inheritance, in that order.
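To make the C3 assumption concrete, here is a small sketch in today's Perl 5 (not Cor syntax) showing how C3 ordering differs from Perl's default depth-first search for a diamond-shaped hierarchy. The class names are invented for illustration; mro is the existing core module.

```perl
use strict;
use warnings;
use mro;    # core since Perl 5.10

# A diamond: Dog inherits from Pet and Canine, which both inherit from Animal.
package Animal { sub speak { return 'generic noise' } }
package Pet    { our @ISA = ('Animal') }
package Canine { our @ISA = ('Animal'); sub speak { return 'Woof!' } }
package Dog    { use mro 'c3'; our @ISA = ( 'Pet', 'Canine' ) }

# Linearize the same hierarchy under each algorithm.
my @c3  = @{ mro::get_linear_isa( 'Dog', 'c3' ) };
my @dfs = @{ mro::get_linear_isa( 'Dog', 'dfs' ) };

print "c3:  @c3\n";     # Dog Pet Canine Animal
print "dfs: @dfs\n";    # Dog Pet Animal Canine

# Dog uses C3, so Canine::speak is found before Animal::speak.
print Dog->speak, "\n";    # Woof!
```

Under depth-first search, Animal would be consulted before Canine, so the more specific speak would be shadowed by the base class.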

Role Consumption

Classes consume roles with the does keyword:

class My::Worker does Serializable, Runnable {
    ...
}

Of course, you can combine this with inheritance:

# obviously, if My::Worker consumes these roles, we do not need to repeat
# this here. This is only an example
class My::Worker::Fast isa My::Worker does Serializable, Runnable {
    ...
}

You may specify the does before the isa:

class My::Worker::Fast does Serializable, Runnable isa My::Worker {
    ...
}

We don't envision supporting excluding or renaming role methods at the start, but please see the "Future Work" section.

Methods

Methods are declared via the method keyword. Object slots (see below) are accessed via the self keyword. Methods use signatures, but a method with no arguments (aside from the invocant) may omit the signature:

method speak { say "Woof!" }

method allowed_to_vote (@people) {
    my @voters;
    foreach my $person (@people) {
        push @voters => $person 
          if self->is_on_voter_roll($person);
    }
    return @voters;
}

Further:

class Foo {
    use List::Util 'sum';
    ...
    method dimsum() { ... }
}
Foo->new->sum;

The above would issue a runtime error similar to Can't find method 'sum' because the dispatcher would recognize sum as a subroutine, not a method. Further, roles would provide methods, not subroutines. This approach should eliminate the need for namespace::autoclean and friends.
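For contrast, this is what happens in current Perl 5, where subroutines and methods share one namespace. This is a minimal sketch using a plain package rather than Cor syntax; the dimsum body is a placeholder.

```perl
use strict;
use warnings;

package Foo {
    use List::Util 'sum';
    sub new    { return bless {}, shift }
    sub dimsum { return 'dumplings' }
}

# The imported sum() sits in Foo's symbol table, so method resolution
# finds it exactly as it finds dimsum().
my $leaks = Foo->can('sum') ? 1 : 0;
print "Foo->can('sum') is ", ( $leaks ? 'true' : 'false' ), "\n";    # true
```

This is the leak that namespace::autoclean cleans up today, and that a separate method keyword would avoid by construction.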

Method dispatch would be resolved via the invocant class and the method name. The arguments to the method will not be considered.

Slots

Note: "slots" are internal data for the object. They provide no public API. By not defining standard is => 'ro', is => 'rw', etc., we avoid the trap of making it natural to expose everything. Instead, just a little extra work is needed by the developer to wrap slots with methods, thereby providing an affordance to keep the public interface smaller (which is generally accepted as good OO practice).

has SLOT    OPTIONS_KV;
has [SLOTS] OPTIONS_KV;

The basic slot declaration is simple:

has 'name';

By default, all slots are read-only and required to be passed to the constructor. Thus, to create an immutable point object:

class Point {
    has [ 'x', 'y' ];
    
    method to_string {
        # self->x and self->y are not available outside of this class
        return sprintf "[%d, %d]" => self->x, self->y;
    }
}

my $point = Point->new( x => 3, y => 7 );
say $point->x;            # fatal error
say $point->to_string;    # [3, 7]
Point->new( x => 4 ); # exception thrown because y is required
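As a rough approximation of those semantics in today's Perl 5, the sketch below checks required slots in the constructor and simply does not define public accessors. Sketch::Point and @REQUIRED are names I made up for illustration, and note that a blessed hash cannot really stop outside code from reaching into $point->{x}; Cor would enforce that, this sketch does not.

```perl
use strict;
use warnings;

package Sketch::Point {
    use Carp 'croak';

    my @REQUIRED = qw(x y);

    sub new {
        my ( $class, %args ) = @_;
        for my $slot (@REQUIRED) {
            croak "Slot '$slot' is required" unless exists $args{$slot};
        }
        my %self;
        @self{@REQUIRED} = @args{@REQUIRED};
        return bless \%self, $class;
    }

    # No public x() or y() methods exist; only to_string is exposed.
    sub to_string {
        my $self = shift;
        return sprintf '[%d, %d]', $self->{x}, $self->{y};
    }
}

my $point = Sketch::Point->new( x => 3, y => 7 );
print $point->to_string, "\n";    # [3, 7]

my $x_callable  = eval { $point->x; 1 };                      # no such method
my $y_optional  = eval { Sketch::Point->new( x => 4 ); 1 };   # y is missing
print 'x() callable:  ', ( $x_callable ? 'yes' : 'no' ), "\n";    # no
print 'new without y: ', ( $y_optional ? 'ok' : 'exception' ), "\n";    # exception
```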

To provide a default:

has days_ago => ( default => method {20} );

Also, per conversation with MST, it's possible that all default slots should be automatically lazy.

Alternatively, if the default is a string, it's a method name to call (makes it easier to subclass):

has _dbh => ( default => '_build_dbh' );
method _build_dbh () { ... }

We should separate default and builder, yes? Anything with a builder would not be passed to the constructor.

Lazy slots (requires default):

has days_ago => (
    default  => method { ... },
    lazy     => 1,
);
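To illustrate what "lazy" means here, this is a minimal sketch in plain Perl 5 in which the default is computed on first read rather than at construction time. Sketch::Report, _build_days_ago, and build_count are hypothetical names used only to make the behavior observable.

```perl
use strict;
use warnings;

package Sketch::Report {
    my $builds = 0;    # counts how often the default is computed

    sub new {
        my ( $class, %args ) = @_;
        my $self = {%args};
        return bless $self, $class;
    }

    # Lazy read: compute the default on first access only.
    sub _days_ago {
        my $self = shift;
        $self->{days_ago} = $self->_build_days_ago
          unless exists $self->{days_ago};
        return $self->{days_ago};
    }

    sub _build_days_ago { $builds++; return 20 }
    sub build_count     { return $builds }
}

my $report = Sketch::Report->new;
print Sketch::Report->build_count, "\n";    # 0: nothing built at construction
$report->_days_ago for 1 .. 3;
print Sketch::Report->build_count, "\n";    # 1: built once, then reused
```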

We may wish to make the has function extensible, so that people can experiment with isa to manage their own types.

Exposing slot data requires writing a method:

class Point {
    has [qw/x y/];

    method x {self->x}
    method y {self->y}
}

Summary of slot options:

  • default => CodeRef|Str (provide a default value if one is not supplied)
  • lazy => Bool (if default is provided, don't call it until it's asked for. Default is true?)
  • weaken => Bool (weaken the reference in the slot. Default is false)
  • optional => Bool (is this required by the constructor? Default is false) (unsure about this one)
  • rw => Bool (read-write, but only via the self keyword. Default is false)

Class Construction

new()

The new method would be in UNIVERSAL::Cor and should not be overridden in a subclass, though that will be allowed. It would take an even-sized list of key/value pairs, omitting the need for a hashref:

Object->new( this => 1, that => 2 );      # good
Object->new({ this => 1, that => 2 });    # bad
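A constructor with that contract could look something like the following in plain Perl 5. This is only a sketch under my own assumptions: Sketch::Object is a made-up name, and the proposal does not say what error a hashref should produce.

```perl
use strict;
use warnings;

package Sketch::Object {
    use Carp 'croak';

    sub new {
        my ( $class, @args ) = @_;
        croak 'new() takes key/value pairs, not a reference'
          if @args == 1 && ref $args[0];
        croak 'new() takes an even-sized list of key/value pairs'
          if @args % 2;
        my %slots = @args;
        return bless \%slots, $class;
    }
}

my $good = eval { Sketch::Object->new( this => 1, that => 2 ) };
my $bad  = eval { Sketch::Object->new( { this => 1, that => 2 } ) };
print 'list:    ', ( $good ? 'accepted' : 'rejected' ), "\n";    # accepted
print 'hashref: ', ( $bad  ? 'accepted' : 'rejected' ), "\n";    # rejected
```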

BUILD

A BUILD method, just like Moose/Moo's BUILD method, will allow for additional customization:

method BUILD (%args) {
    unless ( self->verbose xor self->silent ) {
        # Speculative. We don't address exceptions in this proposal
        self->throw("You must specify one and only one of 'verbose' or 'silent' ");
    }
}

BUILDARGS

The BUILDARGS method is how Moose/Moo messes around with arguments to allow us to do things like write this:

Point->new( $x, $y );

Instead of this:

Point->new( x => $x, y => $y );

However, the BUILDARGS method has always been a bit clumsy. We don't yet have a proposal for this.

UNIVERSAL::Cor

In order to not paint ourselves into a Corner (hey, I'm a papa. I can tell bad papa jokes), we should have a separate object base class which all Cor objects implicitly inherit from. At minimum:

abstract class UNIVERSAL::Cor v0.01 {
    method new(%args)   { ... }
    method can ()       { ... }
    method does ()      { ... }
    method isa ()       { ... }
}

That mirrors the UNIVERSAL class we currently inherit from, but there's room for more:

abstract class UNIVERSAL::Cor v0.01 {
    method new(%args)   { ... }
    method can ()       { ... }
    method does ()      { ... }
    method isa ()       { ... }

    # these new methods are merely being mentioned, not
    # suggested. All can be overridden
    method to_string ()    { ... }    # overloaded?
    method clone ()        { ... }    # (shallow?)
    method object_id ()    { ... }
    method meta ()         { ... }
    method equals ()       { ... }
    method dump ()         { ... }
    method throw($message) { ... }
}

This is still an open discussion. We don't want to pack too much into the API and cause developers pain, but there are so many "common" use cases for objects that we're tired of rewriting ad nauseam that, like many other programming languages, it might be reasonable to put them into the base class.

Opinions welcome. However, this is such a core (no pun intended) need that the question of whether we need a separate UNIVERSAL class for Cor should be decided before Cor is ready for prime time (see also Stevan Little's UNIVERSAL::Object).

This, incidentally, is why Cor classes cannot inherit from non-Cor classes and vice-versa.

EXAMPLE

Before

Here's a simple LRU cache in Moose:

package Cache::LRU {
    use Moose;
    use Hash::Ordered;
    use namespace::autoclean;

    has '_cache' => (
        is      => 'ro',    # without "is", Moose generates no accessor
        isa     => 'Hash::Ordered',
        default => sub { Hash::Ordered->new },
    );

    has 'max_size' => (
        is      => 'ro',
        default => 20,
    );

    sub set {
        my ( $self, $key, $value ) = @_;
        if ( $self->_cache->exists($key) ) {
            $self->_cache->delete($key);
        }
        elsif ( $self->_cache->keys >= $self->max_size ) {
            $self->_cache->shift;
        }
        $self->_cache->set( $key, $value );
    }

    sub get {
        my ( $self, $key ) = @_;
        $self->_cache->get($key)
    }

    __PACKAGE__->meta->make_immutable;
}

After

Here it is in Cor:

class Cache::LRU {
    use Hash::Ordered;
    has cache    => ( default => method { Hash::Ordered->new } );
    has max_size => ( default => method { 20 } );

    method set ( $key, $value ) {
        if ( self->cache->exists($key) ) {
            self->cache->delete($key);
        }
        elsif ( self->cache->keys >= self->max_size ) {
            self->cache->shift;
        }
        self->cache->set( $key, $value );
    }

    method get ($key) { self->cache->get($key) }
}

Note that in the Cor version, any attempt to directly access the cache or max_size slots from outside the class is an error, though you can override their defaults via the constructor:

my $cache        = Cache::LRU->new( max_size => 100 );
my $hash_ordered = $cache->cache;    # fatal error

Roles

Basic Syntax

Role syntax is also simple and clear:

role Whiny {
    method whine($message) { ... }
}

class My::Class does Whiny {
    ...
}
My::Class->new->whine('some message');

Roles both provide and require methods. Any methods fully defined in the role body will be composed into the consuming class.

Any methods defined via a forward declaration are "required" to be provided by the consuming class or another role consumed by the same class.

role MyRole {
    method this;                 # class must provide this
    method that;                 # class must provide this
    method foo ($bar) { ... }    # this is provided
}

Roles may also have slots, declared in the same manner as in classes, and those slots will be provided to the class. (What happens if the class defines that slot in a different way from the role? For example, if the role slot is read-write but the class slot is read-only, bugs are awaitin').

Of course, roles can consume other roles:

 role SomeRole does ThisRole, ThatRole { ... }

Strictly adhering to the concept that roles are guaranteed to be both commutative (the order of application doesn't matter) and associative (for a given set of roles, it doesn't matter which consumes what, so long as the final set is the same), the above is equivalent to:

 role ThatRole does SomeRole, ThisRole { ... }

And their aggregate behavior will be the same if one or more of them consume other roles and are in turn consumed by a class:

role ThatRole   does ThisRole { ... }
role SomeRole   does ThatRole { ... }
class SomeClass does SomeRole { ... }   # role-provided behaviors are identical

MOP

The MOP module is likely sufficient for our needs.

TODO

For the first pass, we need:

Clarify Syntax

Is the current proposed syntax acceptable? I argue that it is because it still feels "Perlish", while also being clean enough that developers from other languages will feel right at home.

Questions

Read-only slots with no values

This is problematic:

has 'x';

The above is a private, read-only slot with no value. Unless we require it to be passed to the constructor, it's useless and should possibly be an error. This is where our BUILDARGS work (or replacement) might come in handy. It would be good if behaviors are specified declaratively, rather than procedurally.

Class-Level Behaviors

How do we handle class data and methods?

Some argue that class data is a code smell. Fine: argue that all you want. Multiple inheritance is also a code smell, but that doesn't mean we can tell Perl developers "no." But the semantics of that can get tricky.

Class methods, however, are important. For example, you may very well want a factory class with an interface like this:

my $message = Message::Factory->create(@message_list);

Internally:

static method create (@list) {
    if ( 1 == @list ) {
        return Message::String->new( message => $list[0] );
    }
    else {
        return Message::Collection->new( messages => \@list );
    }
}

Inside that method, any attempt to call a method on the self keyword would be a syntax error. It could internally call other static (class) methods directly (?) without an invocant.

"Why only blessed hashes?"

For simplicity. We currently don't have a clear vision of how non-hashrefs can be done transparently with this. Falling back to core OO may be a solution for some. For those who know they need a blessed regex, they'll (hopefully) know enough about core OO to go ahead and run with scissors.

"Why don't we have method modifiers?"

Cor tries to be as small as possible to avoid overreach. That means "no modifiers" at this time. However, they cause an issue for roles.

Let's say a method returns the number 10. One role modifies that number by adding a 20% VAT, making the result 12. Another role modifies that to offer a discount of 3, making the result 9. However, if the discount is applied and then the VAT is added, the result is 8.4. Thus, a developer could sort the list of consumed roles and change the behavior.

In the original traits research, one of the issues they were trying to work around was the fact that inheritance order could change code behavior. Consumption of roles, however, was guaranteed to be both commutative (the order of application doesn't matter) and associative (for a given set of roles, it doesn't matter which consumes what, so long as the final set is the same). Method modifiers break this guarantee.
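The VAT and discount arithmetic can be reproduced with plain closures. This sketch only stands in for "around"-style modifiers; it uses no role or modifier library, and the variable names are mine.

```perl
use strict;
use warnings;

my $price = sub { return 10 };

# Each "modifier" wraps an existing code reference and returns a new one.
my $add_vat  = sub { my $orig = shift; return sub { $orig->() * 1.2 } };
my $discount = sub { my $orig = shift; return sub { $orig->() - 3 } };

# Same two modifiers, applied in opposite orders.
my $vat_then_discount = $discount->( $add_vat->($price) );
my $discount_then_vat = $add_vat->( $discount->($price) );

printf "%.1f\n", $vat_then_discount->();    # 9.0
printf "%.1f\n", $discount_then_vat->();    # 8.4
```

Nothing about the two wrappers changed between the two lines; only the order of application did, which is exactly the property roles are supposed to rule out.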

Future Work

I believe the initial core of Cor should be as simple as we can possibly make it to avoid too much up front work and possibly making mistakes that we cannot walk back. However, any good design of something that is both long-lasting and that we know will grow should at least be aware of future considerations. Otherwise, we might make it harder to address issues.

Types

This will wait until the core OO is there. Making has extensible might help.

Parameterized Roles

I have no suggested syntax for this, but they're extremely useful.

Role Exclusion and Renaming

Matt Trout argues, and I agree, that excluding role methods or renaming them is a code smell. However, if you don't have control over the role source code (downloading from the CPAN or being supplied by another software team), you don't always have the luxury of refactoring the code. Thus, we need to support excluding methods and renaming them.

The syntax for this is less clear at this time, but I envision something like this:

class My::Worker isa Some::Parent::Class
 does Serializable(
    excludes => ['some_method'],
    renames  => { old_method => 'new_method' } ) {

    method some_method { ... }

    method old_method (@args) {
        # do something with @args
        return self->new_method(@args);
    }
}

Excluding or renaming a method automatically makes it a "required" method. This is because, even if you don't use them in your class, the role might use them internally.

This raises an issue. In the original Smalltalk traits papers, they made it clear that a role is defined by its name and the methods it supplies (methods are defined by their signature, not just the name). It's possible that someone might do this:

if ( $object->DOES('Serializable') ) {
    ...
}

At this point, we don't know if the $object class excluded any methods from the Serializable role. Thus, we don't know if any methods we expect from Serializable will conform to expectations. Thus, the naïve DOES('Serializable') check may be wrong because merely having the role name isn't enough to know if the class exhibits the desired behavior.

In proper OO, the replacement methods should be semantically identical, even if they're doing different things. In reality, we know that these guarantees are often tossed out the window. I don't know that this is really a serious issue because I haven't been hit with this, but I also know that safety in building large scalable systems suggests avoiding pitfalls.

I do not have a recommendation for this, but I point it out so people can be aware of the background.

Runtime Role Application

I have no suggested syntax at this time, but this generally involves reblessing an object into an anonymous subclass which consumes the role or roles. Naturally, it's harder to guarantee object behavior, especially if several roles are applied at runtime in separate statements.
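For readers who haven't seen the reblessing trick, here is a bare-bones sketch in plain Perl 5. It is not a proposal: apply_role_at_runtime is a hypothetical helper, the "role" is reduced to a hash of method names to code references, and there is no conflict detection or required-method checking.

```perl
use strict;
use warnings;

package Sketch::Dog {
    sub new   { return bless {}, shift }
    sub speak { return 'Woof!' }
}

my $anon_count = 0;

# Hypothetical helper: build a one-off subclass, install the role's
# methods into it, and rebless the object into that subclass.
sub apply_role_at_runtime {
    my ( $object, %methods ) = @_;
    my $class = ref $object;
    my $anon  = sprintf '%s::__ANON__::%d', $class, ++$anon_count;

    no strict 'refs';
    @{"${anon}::ISA"} = ($class);
    *{"${anon}::$_"} = $methods{$_} for keys %methods;

    return bless $object, $anon;
}

my $dog   = Sketch::Dog->new;
my $other = Sketch::Dog->new;
apply_role_at_runtime( $dog, whine => sub { return 'Are we there yet?' } );

print $dog->speak, "\n";    # Woof!
print $dog->whine, "\n";    # Are we there yet?
print $other->can('whine') ? "other dog whines\n" : "other dog is unaffected\n";
```

Each call creates a new generated class, so applying several roles in separate statements stacks subclasses on top of each other, which is part of why the resulting behavior is harder to reason about.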

ACKNOWLEDGEMENTS

Stevan Little has been working on an object system for Perl for years. And given his background—including creating Moose and Moxie—and his constant research into a "better" way to write OO, he laid much of the groundwork for Cor.

I had been working on a pure-Perl implementation of Cor (because clearly we don't have enough object modules on the CPAN) and discussing it with the Pumpking, Sawyer, at the 2019 EU Perl Conference in Riga, when he said he wanted a spec, not an implementation.

And he's right: with P5P, there are plenty of implementors, but there's been no agreement about what should be implemented. So, working with Stevan and Sawyer, I've had to suffer the humiliation of them laughing at my amateurish mistakes, but it's made this document better as a result.

Any mistakes, of course, are mine.

UPDATE

Putting updates here so they can be easily spotted without consulting the history.

self as a keyword

What does this do?

method foo {
    some_external_function(self);
}

We can add a check on self to ensure that private slots cannot be accessed unless we're in a class or a subclass of ref self, but that seems clumsy. Or we can tell developers "don't do that", but we all know what that means.

If we remove self, we need an easy way for the class to access its internal state. Lexical variables have also been proposed:

class Foo {
    has $x => ( rw => 1 );
    
    method bar ($new_x) {
        $x = $new_x;
    }
}

Feels unperlish to me, but hey, what do I know? :)

BUILDARGS

We're trying to figure out a better syntax.

Ovid commented Sep 12, 2021

To all responding, note that this gist is almost two years old. The modern work on this project is at https://github.com/Ovid/Cor
