@Ovid
Last active March 1, 2020 12:44
Cor attribute/slot declaration?

This is a rough draft of some thoughts I've had regarding Cor attributes declaration. Please leave your thoughts.

Part of the problem with the Cor object proposal for the Perl core is that we tended to use the semantics of has as declared in the Moose OO extension for Perl. Unfortunately, this function handles:

  • Data
  • Attributes
  • Types
  • Coercion
  • Delegation
  • Clearers
  • Predicates
  • Documentation
  • Constructor args
  • Default values
  • Overriding
  • … and more!

It makes it very, very easy to declare attributes for Perl objects, but it's trying to do too much and has the wrong defaults. Instead, I've been rethinking this tremendously, trying to find a way to keep the "ease of use" while making it easier to do the right thing.

There have been suggestions that we separate slot (data) declaration from the slot's attribute declaration, but I think this is a mistake. We have literally hundreds of modules on the CPAN which try to join the two together. If we break them back apart, people will try to put them back together and Cor will again cause fragmentation of approaches on how to build objects.

My idea is to turn has into a variable declarator similar to my. It would exist in the context of a Cor class block. Here's a minimal 2D point object with x/y attributes, defaulting to 0 each, and with directly immutable attributes (yes, it's a silly example):

class Point2D {
    has [qw/x y/] :optional = (0,0);
    
    # objects can mutate their own state
    method move ($dX, $dY) {
        $self->x($self->x + $dX);
        $self->y($self->y + $dY);
    }
}

The syntax is loosely:

has          ::= 'has' TYPE SLOTS ATTRIBUTES DEFAULT ';'
TYPE         ::= # probably punting on this for an MVP
SLOTS        ::= '[' SLOT {SLOT} ']' | SLOT
SLOT         ::= SIGIL? IDENTIFIER
SIGIL        ::= '$' | '@' | '%' | '*'
IDENTIFIER   ::= [:alpha:] {[:alnum:]}
ATTRIBUTES   ::= { ':' IDENTIFIER }
DEFAULT      ::= '=' PERL_EXPRESSION

Slot behaviors:

  • Read-only by default
  • Defaults are lazy unless :immediate is provided
  • Have no accessor if a SIGIL is used

Current attributes:

  • :required: slot must have a value at object construction
  • :optional: slot may have a value at object construction
  • Neither :required nor :optional: slot must not have a value at object construction
  • :immediate: default value is required and will be calculated at object construction
  • :weak: value is a weak ref.
  • :builder, :builder(name): A builder method (default: _build_$slot_name) will provide the default
  • :clearer, :clearer(name): Reset slot to undef or default value.
  • :predicate, :predicate(name): Test if slot has been set (but it may have been set to undef)
  • :rw: attribute is read-write (all classes are allowed to write their own data)
  • :handles(@|%): delegation

Examples:

class Box {
    # all attributes are required to be passed to constructor
    has [qw/height width depth/] :required;

    # You can optionally name your box.
    has 'name' :optional :predicate(has_name);

    # cannot be set via constructor. Uses a lazy `_build_volume` method
    has 'volume' :builder;

    method _build_volume {
        return $self->height * $self->width * $self->depth;
    }
}

Another example:

class Cache::LRU {
    use Hash::Ordered;

    # types probably won't be in v1

    # sigil means that this attribute has no accessor. Hash::Ordered object is
    # default
    has Hash::Ordered $cache :handles(get) = Hash::Ordered->new;

    # you may optionally pass in a max_size value to the constructor
    # you can also call $cache->max_size to read this value or
    # $cache->max_size($new_size) to mutate this value.
    has PositiveInt max_size :optional :rw = 20;
    
    # immediately record creation time
    has created :immediate = time;

    method set ( $key, $value ) {
        if ( $cache->exists($key) ) {
            # delete first so the key is re-added as the most recent entry
            $cache->delete($key);
        }
        elsif ( $cache->keys >= $self->max_size ) {
            # need the while loop in case they reset max size to a lower value
            $cache->shift while $cache->keys >= $self->max_size;
        }
        $cache->set( $key, $value );  # new values go at the end; shift drops the oldest
    }
}
@Ovid
Author

Ovid commented Feb 16, 2020

Shout out to some (not all) people who've been very helpful with the current work:

  • Sawyer X
  • Stevan Little
  • Matt Trout (mst)
  • Paul Evens (leonerd)
  • Grinnz
  • Matthew Persico

Update: And Aristotle (his real name!), who reminded me of the importance of backwards compatibility.

@perigrin

It seems to me that making the default the case that errors when a value is passed at object construction is optimizing the wrong way (IME that is the least common choice for object construction, especially with “bare” accessor-less attributes by default). I would pick either of the other two choices as the default and make this one :noinit (or whatever).

@Grinnz

Grinnz commented Feb 16, 2020

This is great but unless I'm missing something, still doesn't address the common need of generating accessors?

@Ovid
Author

Ovid commented Feb 16, 2020

@Grinnz: Thank you! That was due to a typo. I had written this:

  • Have no attribute if a SIGIL is used

But I meant this:

  • Have no accessor if a SIGIL is used

Thus, if you write this:

has $count;

You get private data.

But if you write this:

has 'count';

You get an accessor with it.

If you need that to be mutable:

has 'count' :rw;

Leaving off the sigil is what makes the difference.
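For readers who want to compare this with what we can do today, here is a rough sketch in core Perl 5 of the three declarations above. It is only an approximation under my own assumptions: the Counter class, its label and limit slots, and the inside-out %count hash are all hypothetical stand-ins, not part of Cor.

```perl
use strict;
use warnings;

package Counter {
    use Scalar::Util qw(refaddr);

    # Stand-in for `has $count;`: a lexical hash keyed by object address.
    # It is scoped to this block, so nothing outside the class can see it.
    my %count;

    sub new {
        my ( $class, %args ) = @_;
        my $self = bless {
            label => $args{label},
            limit => $args{limit} // 10,
        }, $class;
        $count{ refaddr($self) } = 0;
        return $self;
    }

    # Stand-in for `has 'label';`: a public, read-only accessor.
    sub label {
        my $self = shift;
        die "label is read-only\n" if @_;
        return $self->{label};
    }

    # Stand-in for `has 'limit' :rw;`: a public, read-write accessor.
    sub limit {
        my $self = shift;
        $self->{limit} = shift if @_;
        return $self->{limit};
    }

    # Methods inside the class can still mutate the private slot.
    sub increment {
        my $self = shift;
        return ++$count{ refaddr($self) };
    }

    # Clean up the private slot when the object goes away.
    sub DESTROY { delete $count{ refaddr( $_[0] ) } }
}

my $counter = Counter->new( label => 'hits' );
$counter->increment for 1 .. 3;
print $counter->label, "\n";
print $counter->limit(5), "\n";
print Counter->can('count') ? "count accessor\n" : "no count accessor\n";
```

Note that the hash-based slots are still reachable as $counter->{label} in plain Perl 5; only %count is truly private here, which is part of what the proposal is trying to improve on.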

@Ovid
Author

Ovid commented Feb 16, 2020

@perigrin: I'm sorry, but I need an example of what you mean. I've read what you've written a couple of times and I don't understand.

@Ovid
Author

Ovid commented Feb 17, 2020

To give a silly example:

class FIFO {
    # this is private
    has @queue;

    # this has a public, read-only accessor that must be passed into the constructor
    has 'name' :required;

    method add(@values) {
        push @queue => @values;
    }

    method next {
        return shift @queue;
    }

    method peek ($num = 1) {
        return @queue[0 .. $num-1];
    }

    method length {
        return scalar @queue;
    }
}

my $fifo = FIFO->new( name => 'silly queue' );
$fifo->add(qw/foo bar baz/);
say $fifo->length;      # 3
say $fifo->name;        # 'silly queue'
say $fifo->next;        # foo
say $fifo->next;        # bar
say $fifo->next;        # baz
say $fifo->next;        # undef
say $fifo->length;      # 0

Note that it is not possible to access the @queue slot outside of the class.

@perigrin

So I see now better why you have the defaults the way you do, I'm still not sure I agree but I'll get back to that. I'm not sure how I feel about the transient nature of sigils in this proposal.

Moose culturally and syntactically tried to avoid exposing attribute slots (and all that comes with them) because they were trying to break people of the habit of treating the instance as "just a hash". One of the things I (eventually) liked about Stevan's recent proposals was that slot creation was "just creating a lexical". It made a lot of intuitive sense once you got your head around it "not being Moose", and injecting behaviors into the object were layered on top of that.

It seems to me that has $foo and has 'foo' both being valid but different is gonna be hard for different groups of people to wrap their heads around. For people with some experience in legacy code, has $foo being the new thing that suddenly doesn't generate an accessor will be (I'd bet hard credits) surprising. For example, is has $foo :rw an exception? If so why, if not why not? Conversely, people totally new to Perl suddenly seeing has 'foo' as a thing while my 'foo' gives an error is also going to be (take more of my hard credits) surprising. Especially since there is all this old code that has has foo => ( ...) all over it. Trying to keep consistency with Moose, in my opinion here, loses consistency with Perl[1].

This leads to my earlier issue, which I explained badly (I was on a phone in the middle of a grocery store): the common case for most read-only attributes is to have data that is read after being initialized at object creation time. The "struct" style object:

class Device { 
    has serial_number :required; 
    has disks :optional; # array of disks
    has cpus :optional;  # array of CPUs
    has memory :optional; # array of RAM sticks
    has mac_address :optional;
    has ip :optional;
    has location;

    method validate_device { … }
}

This makes the (IME) common case require the most typing. It also means that Device->new(serial_number => $sn, location => $location) will blow up because location is neither :required nor :optional[2], but with no warning to that effect in the class definition. I was suggesting that you make either :required or :optional the default and make no constructor args require an attribute.

class Device {  # making optional the "default"
    has serial_number :required; 
    has disks;
    has cpus;
    has memory;
    has mac_address;
    has ip;
    has location :no-constructor-arg;

    method validate_device { … }
}

Optional doesn't have to be the default[3], but given the way objects have thus far worked in Perl optional seems more ... useful. This would optimize (IMO) for the common case for how people work with objects (not just in Perl but in most of the languages I've used).

I like where you're going with the code-attributes on slot creation. I think that simplifying the entire system closer to where Stevan was going in Moxie makes more sense (at least to me):

has $foo;                       # read only + constructor arg
has $foo :rw;                   # read write + constructor arg
has $foo :no-constructor;       # read only + *no* constructor arg
has $foo :bare;                 # no accessor but has constructor arg
has $foo :bare :no-constructor; # true private instance data

Then attribute slots look like and are treated exactly like variables. With this system I'd probably re-arrange the defaults some:

has $foo;           # constructor + no accessor
has $foo :ro;       # constructor + read-only
has $foo :rw;       # constructor + read-write
has $foo :bare;     # no constructor + no accessor
has $foo :bare :ro; # no constructor + read-only
has $foo :bare :rw; # no constructor + read-write

IMO this is easier to explain to both new people and experienced people alike. There's no weird conditional values based on sigil presence, struct-style objects are still the default, and you have to explicitly state how you want to deviate from that default in the class definition so there's some nominal "self-documentation".

1: In that same vein, does has [qw/height width depth/] :required; mean that with sigils it's has [qw/$h $w $d/] :required? Wouldn't it be better to just make it work like every other kind of variable declaration? has $h, $w, $d :required; and has qw(height width depth) :required? The only reason Moose made things an array ref there was because has() is just a function. I'm like 99% positive that if they'd had access to the perl core like this, Stevan and Yuval would have made it work more like variable declaration instead.

2: In fact, does location have any way to set the value here at all? It has no sigil, so method update_location { location = shift() } doesn't look right … and method update_location { self->location = shift() } will make people think $obj->location = $location should work too.

3: I personally would prefer "required" to be the default, it keeps with the WORM/Immutable object principle we should be encouraging people to adopt.

@perigrin

In a comment totally unrelated to my last one: you seem to be missing a way to provide an anonymous method for the builder/clearer/predicates … not being able to take an anonymous sub where you'd take a method name seems at this point "un-Perlish". Might I suggest formally allowing them like so?

 class Device {
   has mac; 
   has ip :builder(method { lookup_ip_for(self->mac) });
}

@Ovid
Author

Ovid commented Feb 18, 2020

Thanks for a lot of the great comments, folks. Keep 'em coming! It's great.

Right now, there's been a bit of discussion about how to handle "private" instance data. Here are some possibilities:

Choice 1

  • Leading underscores (just like we do now)

has _queue = [];

This implies an accessor and it gets clumsy

method add(@args) {
    push $self->_queue->@* => @args;
}

Choice 2

  • Twigils (e.g., a second punctuation mark)

has @!queue;

Introduces more "line noise", but it's easier to see in a method that it's not a normal lexical variable.

method add(@args) {
    push @!queue => @args;
}

Choice 3

  • A normal lexical variable name:

has @queue;

Less line noise and not surprising. Might not always be clear if we're dealing with instance data or not:

method add(@args) {
    push @queue => @args;
}

@perigrin

I like choice 3. Currently the way we do this is to actually use a lexical.

package Fifo {
   my @queue = ();
   method add(@args) { push @queue => @args }
}

This eliminates the "I have to create a (semi-public) accessor just to access slot data" problem that is a serious complaint about Moose. The only downside I can see to it right now is we're taking a currently fairly obscure idiom, and making it much more common.

Choice 2 seems to be a solution in search of a problem. The context surrounding them in Raku simply doesn't (yet) exist in Perl. Adding them without that context puts an extra burden on explaining what's going on. For example, if has $!foo works, does has $.foo; also work? What about $?FILE;? What does my $!foo; do? The answers to each of these is a maintenance burden not only on the scripts that use has $!foo but also on core to handle all the exceptions correctly.

Choice 1 has an issue that I've run into a bunch in Moose. It presumes that the only kind of data you can store in a slot is scalar data. This leads to de-referencing everything. It gets ugly and tiresome, and with some less experienced programmers who don't understand why, you get subtle bugs. That said, if Choice 3 proves to be "too simplistic" then this might be an easy enough solution to fall back to (borrowing a bit from your syntax above):

package Fifo {
   my $queue :ro('_queue') = [];
   method add(@args) { push $self->_queue->@* => @args }
}

@Grinnz

Grinnz commented Feb 18, 2020

package Fifo {
   my @queue = ();
   method add(@args) { push @queue => @args }
}

An important difference from this is that this is a proposal for instance data, not package data.

@perigrin

@Grinnz I realized that after I hit "comment" but my point stands, we already scope things this way, and this isn't (IMO) an extra burden to explain: "has is just like my except for instance scoped data".
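The package-versus-instance distinction raised here is easy to see in today's Perl. A minimal sketch in core Perl 5, where SharedFifo and InstanceFifo are hypothetical names used only for illustration: a package-level my @queue is one array shared by every object, whereas instance data (what has @queue is meant to give you) is separate per object.

```perl
use strict;
use warnings;

package SharedFifo {
    my @queue;    # a single array shared by every SharedFifo object

    sub new  { my $class = shift; return bless {}, $class }
    sub add  { my $self = shift; push @queue, @_; return }
    sub size { return scalar @queue }
}

package InstanceFifo {
    # each object carries its own queue
    sub new  { my $class = shift; return bless { queue => [] }, $class }
    sub add  { my $self = shift; push @{ $self->{queue} }, @_; return }
    sub size { my $self = shift; return scalar @{ $self->{queue} } }
}

my ( $s1, $s2 ) = ( SharedFifo->new, SharedFifo->new );
$s1->add(qw/foo bar/);
print $s2->size, "\n";    # 2: $s2 sees what was added through $s1

my ( $i1, $i2 ) = ( InstanceFifo->new, InstanceFifo->new );
$i1->add(qw/foo bar/);
print $i2->size, "\n";    # 0: $i2 has its own, still empty, queue
```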

@mschout

mschout commented Feb 18, 2020

I like Choice 3 also.

@Ovid
Author

Ovid commented Feb 20, 2020

There's been a lot of discussion about whether or not attributes should be assignable as lvalues: $self->attr = 'foo'.

This is extremely unlikely to be supported. While it looks appealing to many developers, and it's supported in other languages, it inhibits maintainability. Why? An attribute (especially for an immutable object) is calculated once and then the value is cached. A method, however, has its code run every time it's called. Quite often in maintaining code we find times that there are methods which can benefit from being converted to attributes, or attributes which need to be converted to methods. That's where the problem lies.

By having different syntaxes for attributes and methods, we create a maintenance nightmare. We also make our contracts more brittle. That's because when we switch syntaxes, we'd have to find every place with a $self->attr = 'foo' and switch it to something like $self->attr('foo') (or the other way around).
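To make the two call styles concrete, here is a small sketch using Perl 5's existing :lvalue subroutine attribute. The Temperature class and both accessor names are hypothetical, and this is only an illustration of the trade-off, not how Cor would implement anything. The lvalue form hands the caller the storage directly, so there is no place to run code on assignment; the method form can grow validation later without callers changing.

```perl
use strict;
use warnings;

package Temperature {
    sub new {
        my ( $class, %args ) = @_;
        return bless { celsius => $args{celsius} // 0 }, $class;
    }

    # Style A: lvalue accessor, written as `$t->celsius_lv = 30`.
    # The caller assigns straight into the slot.
    sub celsius_lv : lvalue { $_[0]{celsius} }

    # Style B: ordinary method, written as `$t->celsius(30)`.
    # Being a method, it can validate or compute on every call.
    sub celsius {
        my $self = shift;
        if (@_) {
            my $value = shift;
            die "below absolute zero\n" if $value < -273.15;
            $self->{celsius} = $value;
        }
        return $self->{celsius};
    }
}

my $t = Temperature->new( celsius => 20 );
$t->celsius_lv = 30;      # lvalue syntax
$t->celsius(25);          # method syntax
$t->celsius_lv = -500;    # lvalue assignment bypasses the check entirely
print $t->celsius, "\n";
eval { $t->celsius(-500) };
print "rejected: $@";
```

Moving a slot from style A to style B (or back) means touching every call site, which is the maintenance cost described above.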

@Grinnz

Grinnz commented Feb 20, 2020

I'll also add, it's probably easy enough to leave it up to other modules to implement such things on top of this.

@shadowcat-mst

@Ovid wrt lvalue or not lvalue:

You're missing a trade-off here.

"Not lvalue" makes it easier to convert back and forth between objects and methods
"lvalue" makes it easier to convert back and forth between public and private

I don't immediately have a strong opinion as to which is more important, because, well, I've not really had 'private' before to play with, but I think it's worth acknowledging that you're picking one of two things to make easier, it's not a straight up "sugar versus maintainability" question

@cfedde

cfedde commented Feb 24, 2020

I'm not sure I fully understand private vs public here. It sounds like the idea is to use a sigil to enable or disable external access to internal state. I'd rather see no externally visible internal state and the accessor syntax always be mediated by the class runtime. If an instance variable has no external visibility then it could be marked as :hidden and no accessors would be generated for it. I suppose in perl tradition there would also need to be some syntax that describes the generation of accessors: :getter/setter, :functional, or :lvalue. If external access is really needed then maybe :exposed could be added to the list of accessors to generate.

BTW it's a little confusing to stack definitions of object attributes with function attributes. At least in the introductory documentation is there a less ambiguous noun for internal state vs the one used for behavior flags?

@matthewpersico

matthewpersico commented Feb 26, 2020

I've really got to stop reading this stuff on a remote device. Now that I can see it all, I vote Choice 3 - Sigils for privates. BTW, I just did a CTRL F on this presentation of the underlying md file and the subsequent discussion, and protected mode is not mentioned anywhere. Do we need to think about it?

@jhthorsen

I think this is a step in the right direction, but it is quite confusing that has $foo is different from has foo. I think they should all use sigils, to make it easier to use from inside the class. Adding :public could then add an accessor “foo()”.

And can you please create a repo and a PR, so we can use emojis to show support? And also comment inside the PR on different parts.

@Ovid
Author

Ovid commented Feb 29, 2020

@jhthorsen: we're moving house (so I'm packing) and I have to prep for going to Germany next week for a conference, so I'll be brief.

It turns out that having both has foo and has $foo was a mistake because, amongst other problems, it conflates slot declaration with accessor generation. However, here's a quick peek at some of what I'm currently working on. It's not perfect, but has $x now only declares an instance variable and I provide a "Moose" column to show the equivalent.

Note that this example doesn't show all combinations. :handles, :weak, and :predicate are not included because they're allowed with anything. And because we strive for immutable objects, :writer and :clearer should be code smells and thus aren't represented in the table below (though they will exist). Thus, here are the most common declarations we expect:

| New | Constructor | Attribute | Moose |
|-----|-------------|-----------|-------|
| has $x; | Yes | No | has x => ( is => 'bare', required => 1 ) |
| has $x :reader; | Yes | Yes | has x => ( is => 'ro', required => 1 ) |
| has $x :optional; | Optional | No | has x => ( is => 'bare' ) |
| has $x = $default; | Yes | No | has x => ( is => 'bare', default => $default, lazy => 1 ) |
| has $x :no-constructor; | No | No | has x => ( is => 'bare', init_arg => undef ) |
| has $x :reader :optional; | Optional | Yes | has x => ( is => 'ro' ) |
| has $x :reader = $default; | Yes | Yes | has x => ( is => 'ro', default => $default, lazy => 1 ) |
| has $x :builder :optional; | Optional | No | has x => ( is => 'bare', builder => '_build_x', lazy => 1 ) |
| has $x :optional = $default; | Optional | No | has x => ( is => 'bare', default => $default, lazy => 1 ) |
| has $x :immediate = $default; | Yes | No | has x => ( is => 'bare', default => $default ) |
| has $x :reader :no-constructor; | No | Yes | has x => ( is => 'ro', init_arg => undef ) |
| has $x :builder :no-constructor; | No | No | has x => ( is => 'bare', builder => '_build_x', init_arg => undef, lazy => 1 ) |
| has $x :reader :builder :optional; | Optional | Yes | has x => ( is => 'ro', builder => '_build_x', lazy => 1 ) |
| has $x :no-constructor = $default; | No | No | has x => ( is => 'bare', init_arg => undef, default => $default, lazy => 1 ) |
| has $x :reader :optional = $default; | Optional | Yes | has x => ( is => 'ro', default => $default, lazy => 1 ) |
| has $x :reader :immediate = $default; | Yes | Yes | has x => ( is => 'ro', default => $default ) |
| has $x :builder :optional :immediate; | Optional | No | has x => ( is => 'bare', builder => '_build_x' ) |
| has $x :optional :immediate = $default; | Optional | No | has x => ( is => 'bare', default => $default ) |
| has $x :reader :builder :no-constructor; | No | Yes | has x => ( is => 'ro', builder => '_build_x', init_arg => undef, lazy => 1 ) |
| has $x :reader :no-constructor = $default; | No | Yes | has x => ( is => 'ro', init_arg => undef, default => $default, lazy => 1 ) |
| has $x :builder :no-constructor :immediate; | No | No | has x => ( is => 'bare', builder => '_build_x', init_arg => undef ) |
| has $x :reader :builder :optional :immediate; | Optional | Yes | has x => ( is => 'ro', builder => '_build_x' ) |
| has $x :no-constructor :immediate = $default; | No | No | has x => ( is => 'bare', init_arg => undef, default => $default ) |
| has $x :reader :optional :immediate = $default; | Optional | Yes | has x => ( is => 'ro', default => $default ) |
| has $x :reader :builder :no-constructor :immediate; | No | Yes | has x => ( is => 'ro', builder => '_build_x', init_arg => undef ) |
| has $x :reader :no-constructor :immediate = $default; | No | Yes | has x => ( is => 'ro', init_arg => undef, default => $default ) |

And suggestions for a better name for :no-constructor are welcome. Since all slots are, by default, private (with attributes to open them up), :private doesn't quite seem right.

@jhthorsen

I would remove “:optional”, and add “:required” instead since I mostly build default values for slots. Also, if a slot has a builder, then I think that should imply “:optional”. Or even better: Just make the slots without a builder or default value required.

I prefer :private, instead of :no-constructor.

@Ovid
Author

Ovid commented Feb 29, 2020

Builder does imply optional. That's in my notes, but not reflected here. Oops!

And we have three states for constructor args: required, optional, and forbidden. Making "required" the default state helps with many "smart struct" type objects which are all about data. Even for those objects which are about "being experts", quite often you pass in several required args, so it seemed a sensible default. Not saying I'm stuck on this choice, however.
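A rough sketch of how those three states could be checked at construction time, written in plain Perl 5. Everything here is an assumption for illustration: the %policy table, the slot names (borrowed from the Device example earlier in the thread), and the check_constructor_args helper are hypothetical, not part of the proposal.

```perl
use strict;
use warnings;

# Hypothetical per-slot constructor policy: required, optional or forbidden.
my %policy = (
    serial_number => 'required',
    ip            => 'optional',
    location      => 'forbidden',
);

# Returns a list of problems with the given constructor arguments.
# An empty list means the arguments are acceptable.
sub check_constructor_args {
    my (%args) = @_;
    my @errors;
    for my $slot ( sort keys %policy ) {
        my $state = $policy{$slot};
        push @errors, "missing required argument '$slot'"
            if $state eq 'required' && !exists $args{$slot};
        push @errors, "argument '$slot' may not be passed to the constructor"
            if $state eq 'forbidden' && exists $args{$slot};
    }
    for my $name ( sort keys %args ) {
        push @errors, "unknown argument '$name'" unless exists $policy{$name};
    }
    return @errors;
}

print "$_\n" for check_constructor_args( serial_number => 'SN1', location => 'rack 4' );
print "ok\n" unless check_constructor_args( serial_number => 'SN1', ip => '10.0.0.1' );
```

Because each slot has exactly one state in the table, a slot cannot be both optional and forbidden, which is one argument for modelling this as a single setting rather than two independent flags.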

@Ovid
Author

Ovid commented Feb 29, 2020

One thing I don't like about this proposal. We have two constructor attributes, :optional and :no-constructor (maybe renamed to :private) to define whether or not we should pass in values for certain slots (instance data). Lack of either of these means the slot data must be passed to the constructor. But having both :optional and :no-constructor should be an error (duh). But this is frustrating because I'm trying as hard as possible to ensure that we can't define object data in an invalid way. Having a single attribute for object construction would help. For example:

has $x :new;             # required
has $x :new(optional);   # optional
has $x :new(no);         # not allowed in constructor

But the above just looks sloppy. Having a single attribute with a parameter means I can't have two conflicting attributes. But I can't figure out the right attribute/parameter combinations that look "clean". Here's another awful suggestion:

has $x :constructor(yes);
has $x :constructor(no);
has $x :constructor(maybe);

But even that's not quite working because I can do this:

# required in the constructor but still has a default?
has $x :constructor(yes) = 3;

In the above example, we have a single attribute to define whether or not something must be passed to the constructor, but we have a useless default being declared.

This is a problem that languages like Java don't suffer from because their signature-based method overloading means that you can define constructors that do the right thing:

class Box {
    double width, height, depth;

    Box(double w, double h, double d) {
        width = w; height = h; depth = d;
    }

    Box(double len) {
        width = height = depth = len;
    }

    double volume() {
        return width * height * depth;
    }
}

In the above example, we have two constructors. One takes three doubles and one takes one double. I don't have to mess around with declaring instance data as :optional. In fact, in the second constructor, I'm passing in data that doesn't even map to a slot.

Since we're not going to get method overloading in Perl, we have to resort to some nasty hacks to allow different constructors (or manually create a constructor with a name other than new).
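For comparison, the "constructor with a name other than new" workaround looks roughly like this in plain Perl 5. This is a sketch under my own assumptions: new_cube is a hypothetical name standing in for the Java Box(double len) constructor, and the class is an ordinary blessed hash rather than anything Cor would generate.

```perl
use strict;
use warnings;

package Box {
    my @dims = qw/width height depth/;

    # Primary constructor: all three dimensions are required.
    sub new {
        my ( $class, %args ) = @_;
        for my $dim (@dims) {
            die "missing required argument '$dim'\n" unless defined $args{$dim};
        }
        my %self;
        @self{@dims} = @args{@dims};
        return bless \%self, $class;
    }

    # Named alternative constructor: one length for every side.
    # The argument does not map to a slot; it is fanned out to three.
    sub new_cube {
        my ( $class, $len ) = @_;
        return $class->new( width => $len, height => $len, depth => $len );
    }

    sub volume {
        my $self = shift;
        return $self->{width} * $self->{height} * $self->{depth};
    }
}

print Box->new( width => 2, height => 3, depth => 4 )->volume, "\n";    # 24
print Box->new_cube(3)->volume, "\n";                                   # 27
```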

Thoughts?

@Ovid
Author

Ovid commented Mar 1, 2020

"Or even better: Just make the slots without a builder or default value required."

@jhthorsen Not sure if we can safely do that or else we have the problem where we might want a truly private value which doesn't have a builder or default value, but which is computed at some odd point in the code using data that's not available until that moment. And default values and builders don't accept arguments, so we can't pass that data along.

@jhthorsen

That’s why I liked “:private”.
