@beberlei
Last active April 14, 2020 10:29
Attributes v2 RFC
====== PHP RFC: Attributes v2 ======
* Version: 0.4
* Date: 2020-03-09
* Author: Benjamin Eberlei (beberlei@php.net), Martin Schröder
* Status: Under Discussion
* First Published at: http://wiki.php.net/rfc/attributes_v2
Large credit for this RFC goes to Dmitry Stogov whose previous work on
attributes is the foundation for this RFC and patch.
===== Introduction =====
This RFC proposes Attributes as a form of structured, syntactic metadata to
declarations of classes, properties, functions, methods, parameters and
constants. Attributes allow configuration directives to be embedded directly
alongside the declaration they apply to.
Similar concepts exist in other languages named **Annotations** in Java,
**Attributes** in C#, C++, Rust, Hack and **Decorators** in Python, Javascript.
So far PHP only offers an unstructured form of such metadata: doc-comments.
But doc-comments are just strings and to keep some structured information, the
@-based pseudo-language was invented inside them by various PHP
sub-communities.
On top of userland use-cases there are many use-cases for attributes in the
engine and extensions that could affect compilation, diagnostics,
code-generation, runtime behavior and more. Examples are given below.
The widespread use of userland doc-comment parsing shows that this feature is in
high demand in the community.
===== Proposal =====
==== Attribute Syntax ====
<nowiki>Attributes are a specially formatted text enclosed with "<<" and ">>".</nowiki>
Attributes may be applied to many things in the language:
* functions (including closures and short closures)
* classes (including anonymous classes), interfaces, traits
* class constants
* class properties
* class methods
* function/method parameters
Attributes are added before the declaration they belong to, similar to doc-block comments. They can be declared **before** or **after** a doc-block comment that documents a declaration.
<code php>
<<...>>
<<...>>
function foo() {}
</code>
Each declaration of function, class, method, property, parameter or class
constant may have one or more attributes.
Each attribute may have values associated with it, but doesn't have to, similar
to how a constructor of a class works.
<code php>
<<WithoutArgument>>
<<SingleArgument(0)>>
<<FewArguments('Hello', 'World')>>
function foo() {}
</code>
The same attribute name can be used more than once on the same declaration.
Semantically the attribute declaration should be read as instantiating a class
with the attribute name and passing arguments to the constructor.
<nowiki>Note: As the "<<" and ">>" characters are used in an expression prefix
position here, there is no potential conflict for them being used in a
potential generics proposal, where "<T>" is the syntax commonly used in other
languages.</nowiki>
Since syntax is by far the most discussed point about this RFC, we also thought
of an alternative by introducing a new token for attributes (T_ATTRIBUTE)
defined as //@:// that the parser could look for. Choice of syntax will be a secondary
vote on the RFC.
<code php>
@:WithoutArgument
@:SingleArgument(0)
@:FewArguments('Hello', 'World')
function foo() {}
</code>
See the discussion of alternative syntaxes below for why the most
requested syntaxes, "@" and "[]", are not possible.
==== Attribute Names Resolve to Classes ====
The name of an attribute is resolved against all currently imported symbols
during compilation. This is done to namespace attributes and avoid accidental re-use of the
same attribute name by different libraries and applications.
<code php>
use My\Attributes\SingleArgument;
use My\Attributes\Another;
<<SingleArgument("Hello")>>
<<Another\SingleArgument("World")>>
<<\My\Attributes\FewArguments("foo", "bar")>>
function foo() {}
</code>
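The resolution rules are the ones PHP already applies to class names. As a minimal sketch that runs on current PHP without attribute support, the //::class// constant shows what each name from the example above resolves to. None of the attribute classes need to exist for this to work.

<code php>
<?php
use My\Attributes\SingleArgument;
use My\Attributes\Another;

// ::class is resolved at compile time against the imports above,
// so the classes do not have to be declared or autoloadable.
echo SingleArgument::class, "\n";              // My\Attributes\SingleArgument
echo Another\SingleArgument::class, "\n";      // My\Attributes\Another\SingleArgument
echo \My\Attributes\FewArguments::class, "\n"; // My\Attributes\FewArguments
</code>

Two attributes with the same short name therefore stay distinct as long as they live in different namespaces.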
There are also benefits to declaring this attribute class in code:
* Reflection API can directly convert an attribute to an instance of this class (see "Reflection" section below)
* Static analysis tools can verify attributes are correctly used in your code
* IDEs can add support to autocomplete attributes and their arguments
Declaring an attribute class for the above example looks like this:
<code php>
namespace My\Attributes;

use PhpAttribute;

<<PhpAttribute>>
class SingleArgument
{
    public $value;

    public function __construct(string $value)
    {
        $this->value = $value;
    }
}
</code>
==== Compiler and Userland Attributes ====
This proposal differentiates between two different kinds of attributes:
* Compiler Attributes (validated at compile time)
* Userland Attributes (validated during Reflection API access)
A compiler attribute is an internal class that is attributed with the //PhpCompilerAttribute// attribute.
A userland attribute is a userland class that is attributed with the //PhpAttribute// attribute.
When a compiler attribute is found during compile time then the engine invokes
a validation callback that is registered for every compiler attribute.
For example the patch includes a validation callback for //PhpCompilerAttribute// that prevents
its use by userland classes:
<code c>
#include "zend_attributes.h"

void zend_attribute_validate_phpcompilerattribute(zval *attribute, int target)
{
    if (target != ZEND_ATTRIBUTE_TARGET_CLASS) {
        zend_error(E_COMPILE_ERROR, "The PhpCompilerAttribute can only be used on class declarations and only on internal classes");
    } else {
        zend_error(E_COMPILE_ERROR, "The PhpCompilerAttribute can only be used by internal classes, use PhpAttribute instead");
    }
}

INIT_CLASS_ENTRY(ce, "PhpCompilerAttribute", NULL);
zend_ce_php_compiler_attribute = zend_register_internal_class(&ce);
zend_attributes_internal_validator cb = zend_attribute_validate_phpcompilerattribute;
zend_compiler_attribute_register(zend_ce_php_compiler_attribute, &cb);
</code>
The attribute zval contains all arguments passed and target is a constant that allows
validating the attribute is on the right declaration.
Userland classes cannot use the //PhpCompilerAttribute//. An error is thrown if this happens.
<code php>
<?php
<<PhpCompilerAttribute>>
class MyAttribute
{
}
// Fatal error: The PhpCompilerAttribute can only be used by internal classes, use PhpAttribute instead
</code>
By mapping attributes to classes, tools, editors and IDEs can provide both
syntactic and contextual information about the use of attributes to
developers.
The downside of this approach is that mistyped compiler attributes get
classified as userland attributes.
==== Constant Expressions in Attribute Arguments ====
Attribute arguments are evaluated as constant AST expressions. This means that
a subset of PHP expressions is allowed as argument:
<code php>
<<SingleArgument(1+1)>>
<<FewArguments(PDO::class, PHP_VERSION_ID)>>
</code>
Constant AST is allowed primarily because it provides the ability to reference
(class) constants. Referencing constants is desired because it avoids
duplicating information into attributes that already exists as a constant.
Another benefit is the potential for static verification by tools and IDEs to
validate attributes.
The constant AST is resolved to a value when accessing attributes with the Reflection API.
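As an illustration of what qualifies, the same kinds of expressions are accepted today wherever PHP requires a constant expression, such as a //const// declaration. The sketch below runs on current PHP. The constant names are made up for the example, and that attribute arguments accept exactly this set is the RFC's claim rather than something this code verifies.

<code php>
<?php
const SUM = 1 + 1;                               // arithmetic on literals
const PDO_CLASS = PDO::class;                    // class name resolution, no autoloading needed
const VERSION = PHP_VERSION_ID;                  // reference to an existing constant
const LIMITS = ['min' => 0, 'max' => SUM * 10];  // arrays built from constant expressions

var_dump(SUM);            // int(2)
var_dump(PDO_CLASS);      // string(3) "PDO"
var_dump(LIMITS['max']);  // int(20)

// Function calls and variables are not constant expressions:
// const NOW = time();    // compile error if uncommented
</code>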
==== Reflection ====
The following Reflection classes are extended with a getAttributes() method
that returns an array of ReflectionAttribute instances.
<code php>
function ReflectionFunction::getAttributes(string $name = null, int $flags = 0): ReflectionAttribute[];
function ReflectionClass::getAttributes(string $name = null, int $flags = 0): ReflectionAttribute[];
function ReflectionProperty::getAttributes(string $name = null, int $flags = 0): ReflectionAttribute[];
function ReflectionClassConstant::getAttributes(string $name = null, int $flags = 0): ReflectionAttribute[];
</code>
The name argument can be used to retrieve only the attribute(s) with the given
attribute name. Whether subclasses of that name also match depends on the flags argument.
<code php>
$attributes = $reflectionFunction->getAttributes(\My\Attributes\SingleArgument::class);
</code>
When the flags parameter is not set, then getAttributes defaults to returning
only those attributes with the exact same name as given in the first argument.
When you pass the constant //ReflectionAttribute::IS_INSTANCEOF// then it returns
all attributes that pass an instanceof check with the passed class name.
<code php>
$attributes = $reflectionFunction->getAttributes(\My\Attributes\MyAbstractAttribute::class, \ReflectionAttribute::IS_INSTANCEOF);
</code>
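The two matching modes can be modelled in plain PHP. This is only a sketch of the behaviour described above, not the patch's implementation: //filterAttributeNames()// and the //IS_INSTANCEOF// constant defined here are hypothetical stand-ins, and the attribute classes are declared inline so the example is self-contained.

<code php>
<?php
const IS_INSTANCEOF = 2; // hypothetical stand-in for ReflectionAttribute::IS_INSTANCEOF

abstract class MyAbstractAttribute {}
class SingleArgument extends MyAbstractAttribute {}
class Unrelated {}

// $names: resolved attribute class names found on a declaration.
function filterAttributeNames(array $names, string $filter, int $flags = 0): array
{
    return array_values(array_filter($names, function (string $name) use ($filter, $flags) {
        if ($flags & IS_INSTANCEOF) {
            // matches the class itself and any subclass, like instanceof
            return is_a($name, $filter, true);
        }
        // default: the name must match exactly (class names are case-insensitive)
        return strcasecmp($name, $filter) === 0;
    }));
}

$names = [SingleArgument::class, Unrelated::class];

// Exact match on the abstract parent finds nothing.
var_dump(filterAttributeNames($names, MyAbstractAttribute::class) === []);

// With the flag, the subclass is returned.
var_dump(filterAttributeNames($names, MyAbstractAttribute::class, IS_INSTANCEOF) === ['SingleArgument']);
</code>

Both //var_dump()// calls print //bool(true)//. Filtering by a shared parent class is what lets a library collect all of its own attributes in one call.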
The API of the new ReflectionAttribute looks like this:
<code php>
class ReflectionAttribute
{
public function getName(): string
public function getArguments(): array
public function getAsObject(): object
}
</code>
Because validation of attributes is only performed during
//ReflectionAttribute::getAsObject()//, it is technically not required to
declare the attribute class. You can still access name and arguments directly
from //ReflectionAttribute//.
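Conceptually, name and arguments are plain data until an object is requested. Below is a rough userland model of that deferred step, assuming a hypothetical helper //instantiateAttribute()//. The real //getAsObject()// may perform further checks, such as verifying that the class is marked with //PhpAttribute//.

<code php>
<?php
class SingleArgument
{
    public $value;

    public function __construct(string $value)
    {
        $this->value = $value;
    }
}

// Hypothetical model: errors surface here, not when attributes are listed.
function instantiateAttribute(string $name, array $arguments): object
{
    if (!class_exists($name)) {
        throw new Error("Attribute class '$name' not found");
    }
    // The stored arguments are passed to the constructor in order.
    return new $name(...$arguments);
}

$attribute = instantiateAttribute(SingleArgument::class, ['Hello']);
var_dump($attribute->value); // string(5) "Hello"

try {
    instantiateAttribute('Missing\\Attribute', []);
} catch (Error $e) {
    echo $e->getMessage(), "\n"; // Attribute class 'Missing\Attribute' not found
}
</code>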
Full example:
<code php>
namespace My\Attributes {
    use PhpAttribute;

    <<PhpAttribute>>
    class SingleArgument {
        public $argumentValue;

        public function __construct($argumentValue) {
            $this->argumentValue = $argumentValue;
        }
    }
}

namespace {
    // import the attribute so the short name resolves to My\Attributes\SingleArgument
    use My\Attributes\SingleArgument;

    <<SingleArgument("Hello World")>>
    class Foo {
    }

    $reflectionClass = new \ReflectionClass(Foo::class);
    $attributes = $reflectionClass->getAttributes();
    var_dump($attributes[0]->getName());
    var_dump($attributes[0]->getArguments());
    var_dump($attributes[0]->getAsObject());
}
/**
string(28) "My\Attributes\SingleArgument"
array(1) {
[0]=>
string(11) "Hello World"
}
object(My\Attributes\SingleArgument)#1 (1) {
["argumentValue"]=>
string(11) "Hello World"
}
**/
</code>
With this approach a call to //getAttributes()// never throws errors. This will
avoid problems when different libraries with different semantics are parsing
attributes on the same declaration.
===== Use Cases =====
==== Use Cases for PHP Extensions ====
One major use case for attributes will be PHP core and extensions.
HashTables with declared Attributes are available on every //zend_class_entry//, //op_array//,
//zend_property_info// and //zend_class_constant//.
PHP Core or extensions will want to check if certain declarations have an attribute or not.
One such example is the existing check for "@jit" in Opcache JIT that instructs the JIT
to always optimize a function or method.
With attributes it can be changed to the following C code in the extension:
<code c>
static int zend_needs_manual_jit(const zend_op_array *op_array)
{
    /* attribute names are stored lowercased, so the lookup key is "opcache\jit" */
    return op_array->attributes &&
        zend_hash_str_exists(op_array->attributes, "opcache\\jit", sizeof("opcache\\jit")-1);
}
</code>
Developers could then use an attribute instead of a doc-comment:
<code php>
use Opcache\Jit;
<<Jit>>
function foo() {}
</code>
==== Other potential core and extensions use cases/ideas ====
Below is a list of ideas. Please note that these are not part of this RFC.
Structured Deprecation of functions/methods. Almost all languages with
attributes have this built-in as well. One benefit having this in PHP would be
that it could allow deprecating classes, properties or constants, where
trigger_error cannot be used by developers at the moment.
<code php>
// an idea, not part of the RFC
use Php\Attributes\Deprecated;
<<Deprecated("Use bar() instead")>>
function foo() {}
</code>
A //Deprecated// attribute would have the benefit of allowing properties and constants to be deprecated, which is currently not possible
using //trigger_error//.
<code php>
class Foo
{
    <<Deprecated()>>
    const BAR = 'BAR';
}
echo Foo::BAR;
// PHP Deprecated: Constant Foo::BAR is deprecated in test.php on line 7
</code>
Opt-in change of "legacy" behavior of PHP for example as proposed in
[[https://wiki.php.net/rfc/engine_warnings|Reclassify Engine Warnings RFC]] and
[[https://externals.io/message/108767#108767|Support Rewinding Generators]].
Rust has a
[[https://doc.rust-lang.org/reference/attributes/diagnostics.html#lint-check-attributes|similar
set of attributes]]. This could be used to augment the "Editions" proposal with a gradual path to more consistency.
<code php>
// an idea, not part of the RFC
use Php\Attributes\Deny;
use Php\Attributes\Allow;
<<Allow("rewind_generator")>>
function bar() {
    yield 1;
}

<<Deny("undeclared_variables")>>
function foo() {
    echo $foo;
    // PHP Fatal error: Uncaught TypeError: Access to undeclared variable $foo
}

<<Deny("dynamic_properties")>>
class Foo {
}

$foo->bar; // PHP Fatal error: Uncaught Error: Invalid access to dynamic property Foo::$bar
</code>
Some (limited) form of macros [[https://doc.rust-lang.org/reference/conditional-compilation.html#the-cfg-attribute|similar
to Rust]] could be useful to include polyfill functions only in lower versions
of PHP. This helps libraries to conditionally declare code compatible with Opcache and preloading:
<code php>
// an idea, not part of the RFC
use Php\Attributes\ConditionalDeclare;
use Php\Attributes\IgnoreRedeclaration;
<<ConditionalDeclare(PHP_VERSION_ID < 70000)>> // gets removed from AST when >= 7.0
<<IgnoreRedeclaration>> // throws no error when already declared, removes the redeclared thing
function intdiv(int $numerator, int $divisor) {
}
</code>
A ZEND_API to provide the arguments of a single attribute or a list of all
attributes will be part of the final patch so that extension authors can
utilize attributes with as little effort as possible.
This API is a draft for now:
<code c>
/* Retrieve attribute arguments by attribute name */
HashTable *zend_attribute_get(HashTable *attributes, char *name, size_t name_len);
/* Retrieve all attribute arguments indexed by attribute name */
zval *zend_attribute_all(HashTable *attributes, char *name, size_t name_len);
</code>
==== Userland Use-Case: Declaring Event Listener Hooks on Objects ====
In userland, attributes provide the benefit of putting declaration
and additional configuration directly close to each other.
This is an example of refactoring Symfony EventSubscribers to use
attributes instead. The //EventSubscriberInterface// requires
users to declare which event is handled by which method on the class
in the //getSubscribedEvents()// method.
This can be changed to just look for attributes on methods
to declare which event they listen to.
<code php>
// current code without attributes
class RequestSubscriber implements EventSubscriberInterface
{
    public static function getSubscribedEvents(): array
    {
        return [RequestEvent::class => 'onKernelRequest'];
    }

    public function onKernelRequest(RequestEvent $event)
    {
    }
}

// refactor to:
<<PhpAttribute>>
class Listener
{
    public $event;

    public function __construct(string $event)
    {
        $this->event = $event;
    }
}

class RequestSubscriber
{
    <<Listener(RequestEvent::class)>>
    public function onKernelRequest(RequestEvent $event)
    {
    }
}

// and the EventDispatcher to register listeners based on attributes:
class EventDispatcher
{
    private $listeners = [];

    public function addSubscriber(object $subscriber)
    {
        $reflection = new ReflectionObject($subscriber);
        foreach ($reflection->getMethods() as $method) {
            // Does this method have Listener attributes?
            $attributes = $method->getAttributes(Listener::class);
            foreach ($attributes as $listenerAttribute) {
                /** @var $listener Listener */
                $listener = $listenerAttribute->getAsObject();
                // with $listener instanceof Listener attribute,
                // register the method to the given Listener->event
                // as a callable
                $this->listeners[$listener->event][] = [$subscriber, $method->getName()];
            }
        }
    }

    // variadic parameters are declared as ...$args
    public function dispatch($event, ...$args)
    {
        // fall back to an empty list when nothing listens to this event
        foreach ($this->listeners[$event] ?? [] as $listener) {
            // invoke the listener callables registered to an event name
            $listener(...$args);
        }
    }
}

$dispatcher = new EventDispatcher();
$dispatcher->addSubscriber(new RequestSubscriber());
$dispatcher->dispatch(RequestEvent::class, $payload);
</code>
==== Userland Use-Case: Migrating Doctrine Annotations from Docblocks to Attributes ====
One of the major cases to consider for any attributes/annotations RFC is the
potential migration of the widespread Doctrine Annotations library towards a
possible attributes syntax.
PHP core's support for attributes should provide a foundation that lets userland
migrate from docblocks to attributes.
The primary behavior in this RFC that attempts this balancing act is the
requirement for namespaced attribute names.
Doctrine or any userland library can utilize the name filter with a parent class to fetch
only attributes they are interested in:
<code php>
namespace Doctrine\Annotations;

abstract class Annotation {
    const TARGET_CLASS = 1;
    const TARGET_PROPERTY = 2;

    public $target = self::TARGET_CLASS;

    final public function __construct(array $values = []) {
        foreach ($values as $key => $value) {
            $this->$key = $value;
        }
    }
}

class AnnotationReader
{
    function getClassAnnotations(\ReflectionClass $reflection): array {
        $doctrineAnnotations = [];
        foreach ($reflection->getAttributes() as $attribute) {
            // filter out any attribute whose class doesn't extend Doctrine's annotation base class
            if (!is_subclass_of($attribute->getName(), Annotation::class)) {
                continue;
            }
            $annotation = $attribute->getAsObject();
            // validate that doctrine annotation is on the right "target"
            // getClassAnnotations requires all annotations to be allowed on a class
            // (parentheses are required: === binds tighter than &)
            if (($annotation->target & Annotation::TARGET_CLASS) === 0) {
                throw new \RuntimeException(get_class($annotation) . " is not allowed on class.");
            }
            $doctrineAnnotations[] = $annotation;
        }
        return $doctrineAnnotations;
    }
}
</code>
With this flexibility in the Reflection API, Doctrine (or any other userland
annotation/attributes library) can also enforce stricter rules for use of the
attributes by adding their own logic on top without PHP attributes getting in
the way.
[[https://github.com/RectorPHP/Rector|Migration tools such as Rector]] can help with userland migrating to attributes.
===== Criticism and Alternative Approaches =====
=== Alternative Syntax: Why not use @ or [] like other languages? ===
<nowiki>The "<<" and ">>" syntax is used because it is one of the few syntaxes
that can still be used at this place in the code that looks fairly natural. We
could use other symbols that are not yet used as prefix operators, but
realistically only "%" is a contender here that doesn't look completely weird.
Others included "|", "=" or "/".</nowiki>
<nowiki>Specifically "[]" or "@" are not possible because they conflict with
the short array syntax and error suppression operators. Note that even
something involved like the following syntax is already valid PHP code right
now:</nowiki>
<code php>
[[@SingleArgument("Hello")]]
</code>
It would require looking ahead past potentially unlimited tokens to find out if
it's an array declaration or an attribute. We would end up with a context
sensitive parser, which would be an unacceptable outcome.
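To make the conflict concrete, the following sketch runs on current PHP. It assumes a user-defined function named //SingleArgument//, declared here only for the demonstration, and shows that the snippet above already parses as a nested array literal containing an error-suppressed function call:

<code php>
<?php
function SingleArgument(string $value): string
{
    return $value;
}

// Parsed as: array( array( @SingleArgument("Hello") ) )
$result = [[@SingleArgument("Hello")]];
var_dump($result[0][0]); // string(5) "Hello"
</code>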
=== Why not extend Doc Comments? ===
Attributes are significantly better than docblock comments so that they warrant
being introduced as a new language construct for several reasons:
* Namespacing prevents conflicts between different libraries using the same doc comment tag
* Checking for attribute existence is an O(1) hash key test compared to unpredictable strstr performance or even parsing the docblock.
* Mapping attributes to classes ensures the attributes are correctly typed, reducing a major source of bugs in reliance on docblocks at runtime.
* <nowiki>There is visible demand for something like annotations based on its common use in so many different tools and communities. However this will always be a confusing thing for newcomers to see in comments. In addition the difference between /* and /** is still a very subtle source of bugs.</nowiki>
It might be possible to make PHP parse existing doc-comments and keep the
information as structured attributes, but we would need to invoke an additional
parser for each doc-comment; doc-comments may not conform to the context grammar,
so we would have to decide what to do with grammar errors; and finally this would
be another language inside PHP. This solution would be much more complex
than introducing attributes and is not desired.
With attributes as proposed by this RFC, we re-use the existing syntax for
expressions and constant expressions. The patch to the core for this
functionality is small.
=== Why not always map attributes to simple arrays instead for simplicity? ===
The previous section already laid out why resolving attribute names to
classes is important. Validating that attributes are
correct is one of the primary benefits over the previous approach
with doc-comments, where such validation is not possible.
=== Why not a stricter solution like Doctrine Annotations? ===
This RFC proposes only base PHP attribute functionality. A general solution for
PHP and the wider community must take different use-cases into account and the
full Doctrine like system is not necessary for a lot of use-cases, especially
the PHP internal use-cases.
=== Naming (attributes or annotations) ===
The name "Attributes" for this feature makes sense to avoid confusion with
annotations that are already used. With this distinction Doctrine Annotations
is implemented with either docblock (PHP 7) or attributes (PHP 8+).
===== Backward Incompatible Changes =====
None
===== Proposed PHP Version(s) =====
8.0
===== RFC Impact =====
==== To Core ====
Requirement to store attributes on every parsing token, ast nodes,
zend_class_entry, zend_class_constant, zend_op_array and zend_property_info
adds one additional pointer to each structure, even those that don't use attributes.
==== To SAPIs ====
None
==== To Existing Extensions ====
None
Opcache JIT will move to use Opcache\Jit instead of @jit and Opcache\Nojit
instead of @nojit attributes, but this is currently an unreleased feature.
==== To Opcache ====
Opcache modifications are part of the proposed patch, but might not be fully
working after the internal changes between the original 7.1 patch and 8.0.
==== New Constants ====
None
==== php.ini Defaults ====
None
===== Open Issues =====
* How to best differentiate between userland and compiler/engine attributes without requiring autoload during compile time?
===== Future Scope =====
* Integration with a potential named arguments proposal for function calls
* Opportunity to augment existing functionality with new behavior without
breaking backwards compatibility. One example is introduction of a
"//<<Rewindable>>//" attribute that could be used to signal that a
generator function creates a rewindable iterator.
* Add <<Deprecated>> attribute that emits deprecation when function/method
called, property or const accessed
* Other languages such as Go have simple but powerful serialization from
XML/JSON to objects and back. The combination of typed properties and
attributes puts this in reach for core or a PHP extension to implement.
* An alternative "short" syntax to declare attributes in one enclosing
//<<SingleArgument("foo"), MultiArgument("bar", "baz")>>// This could be
revisited in the future similar to grouped use statements being added after
use statements already existed.
* Extending userland attributes to allow declaring which target they are
allowed to be declared on including validation of those targets in
//ReflectionAttribute::getAsObject()//.
===== Proposed Voting Choices =====
* Accept PHP Attributes v2 into core? 2/3 majority
* Which syntax to use for attributes? "<<>>" or "@:"
===== Patches and Tests =====
Two patches that are based on each other, the second one implementing future scope and alternative syntax:
* https://github.com/beberlei/php-src/pull/2 (with //<<>>// syntax)
* https://github.com/kooldev/php-src/pull/2 (with //@:// syntax, including userland target validation)
===== References =====
* [[https://doc.rust-lang.org/reference/attributes.html|Rust Attributes]]
* [[https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/attributes/|C# Attributes]]
* [[https://en.wikipedia.org/wiki/Java_annotation|Java Annotation]]
* [[https://www.typescriptlang.org/docs/handbook/decorators.html|TypeScript/ECMAScript Decorators]]
* [[https://docs.microsoft.com/en-us/cpp/cpp/attributes?view=vs-2019|C++ Attributes]]
* [[https://golang.org/pkg/reflect/#StructTag|Go Tags]]
* [[https://docs.hhvm.com/hack/attributes/introduction|Attributes in Hack]]
Previously failed or abandoned RFCs
* [[https://wiki.php.net/rfc/attributes|Attributes v1]]
* [[https://wiki.php.net/rfc/annotations_v2|Annotations v2]]
* [[https://wiki.php.net/rfc/reflection_doccomment_annotations|Reflection Annotations using the Doc-Comment]]
* [[https://wiki.php.net/rfc/simple-annotations|Simple Annotations]]
* [[https://wiki.php.net/rfc/annotations-in-docblock|Annotations in DocBlock RFC]]
* [[https://wiki.php.net/rfc/annotations|Class Metadata RFC]]
===== Changelog =====
0.2:
* Added new, hopefully more simple userland example
* Changed //Php\Attribute// to //PhpAttribute// because the PHP namespace has not been reserved for PHP and is currently "empty"
* Clarify necessary order of docblocks, attributes and function declarations
* Clarify no conflict with potential generics syntax
0.3:
* Changed to support the same attribute multiple times on the same declaration
* Added support for attributes on method and function parameters
* Replaced //PhpAttribute// interface with an attribute instead
* Distinction between userland and compiler attributes and description when each of them gets evaluated/validated
* Reduce number of examples to shorten RFC a bit and expand the other examples instead
0.4:
* Changed validation of compiler attributes to use a C callback instead of instantiating object
* Offer alternative syntax "@:" using new token T_ATTRIBUTE
@tolry
tolry commented Mar 4, 2020

🎉 awesome, sounds quite ambitious to get into php8? but I am no expert on that - would love to have this in 8.0 😀

It is possible to use the same attribute name more than once on the same declaration. This
is one change to the previous Attributes RFC where this was not possible.

why is this change needed?

The name argument can be used to retrieve only attributes with the given attribute name.

it might be practical to also allow filtering to be more flexible, filtering all Attributes from a namespace, e.g. doctrine will have a bunch of different attributes

<code>
<<Jit>
function foo() {}
</code>

missing second '>' in Jit attribute

to selective disable the JIT for functions

I think this should be an adverb, so 'selectively'

doesnt look completly weird

typo: completely

Note that even somthing involved

typo: something

Checking for attribute existance

typo: existence

Autoloading attribute classes ensures the

typo: ensure

In addition the difference between /* and /** is still a very subtle source of bugs.

👍 only happened to me once or twice, but this can be really painful

without conflicts to simler approaches

typo: simpler

Should attribute classes be required to implement an interface
Php\Attribute?

👍 the IDE autocomplete alone would be enough reason for me
