- Name: union_class_types
- Date Revised: 2019-08-05
- First Published: 2019-08-04
- Proposed Version: PHP 8.0
- Author: Mike Schinkel mike@newclarity.net
- Mailing List Thread: https://news-web.php.net/php.internals/106866
This is a strawman proposal as an addition to the Union Types v2 proposal from Nikita Popov.
This proposal introduces the concept of a special type of class called a union class, which is created by adding a types keyword followed by a vertical bar separated list of types, just like found in Nikita's proposal.
Note: I know that mixed is not a valid type hint, but I use it below anyway for clarity.
The benefits of this proposal over and above the union_types_v2 proposal include:
- Addresses the scenarios where type aliases in Union Types v2 would be desired, but provides a more robust and type-safe alternative to the two general ways envisioned by that proposal.
- Leverages existing syntax and semantics of classes and objects, requiring only a small amount of language change.
- Full type safety when accessing typed values passed to functions, and anywhere else union instances are used.
- No ambiguity of types except where specifically wanted, e.g. value() and setValue().
- Ability to capture into one variable the value passed from the caller and then pass it to called functions without having to first dereference the typed value.
- Ability to create, manipulate and pass around unioned values and use them in contexts not yet envisioned by the PHP language's implementors.
A very simple Number union class would look like this:
class Number {
    types int|float;
}
Any class defined with a types keyword would automatically get at least five (5) methods:
- Three (3) specifically-named methods, e.g. value(), setValue() and type(), and
- At least two (2) methods, each with the name to{Type}(), one for each unioned type.
So for the Number union the built-in methods would be:
public function type():string
public function value():mixed
public function setValue(mixed $value)
public function toInt():?int
public function toFloat():?float
If we added a string type to the union there would be a sixth method:
public function toString():?string
If we unioned a class Foo then there would be another method:
public function toFoo():Foo
If we unioned a namespaced class \Foo\Bar then there would be yet another method:
public function toFoo_Bar():\Foo\Bar
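The naming rule implied by the examples above can be sketched in userland PHP. This is only an illustration: the helper name unionAccessorName() is hypothetical, and the rule it applies (drop the leading backslash, turn the remaining namespace separators into underscores, prefix with "to") is inferred from the toFoo() and toFoo_Bar() examples rather than specified by this proposal.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not part of the proposal): derives the accessor
// method name that would be generated for one unioned type.
function unionAccessorName(string $type): string
{
    // Drop the leading backslash of a fully qualified name, then
    // replace the remaining namespace separators with underscores.
    $name = str_replace('\\', '_', ltrim($type, '\\'));

    // Scalar type names are lower case, so capitalize the first letter.
    return 'to' . ucfirst($name);
}

foreach (['int', 'float', 'string', 'Foo', '\\Foo\\Bar'] as $type) {
    echo $type, ' => ', unionAccessorName($type), '()', PHP_EOL;
}
```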
There would also be a static method that would return an array of the types defined in the class:
public static function types():?string[]
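To make the built-in methods more concrete, here is a rough userland approximation that runs on PHP 8.0 or later. It is a sketch under assumptions, not the proposed implementation: the class name NumberSketch and the TYPES constant are hypothetical stand-ins for a class declared with types int|float, and under this proposal the engine would generate these methods automatically.

```php
<?php
declare(strict_types=1);

// Userland approximation only: under the proposal these methods would be
// generated by the engine for any class that uses the `types` keyword.
final class NumberSketch
{
    // Stands in for the proposed `types int|float;` declaration.
    private const TYPES = ['int', 'float'];

    private int|float $value;

    public function __construct(int|float $value)
    {
        $this->value = $value;
    }

    // Returns the names of the unioned types.
    public static function types(): array
    {
        return self::TYPES;
    }

    // Name of the type currently held: 'int' or 'float'.
    public function type(): string
    {
        return get_debug_type($this->value);
    }

    public function value(): mixed
    {
        return $this->value;
    }

    // Rejects any value whose type is not part of the union.
    public function setValue(mixed $value): void
    {
        if (!in_array(get_debug_type($value), self::TYPES, true)) {
            throw new TypeError('Expected one of: ' . implode('|', self::TYPES));
        }
        $this->value = $value;
    }

    // Typed accessors return null when a different type is held.
    public function toInt(): ?int
    {
        return is_int($this->value) ? $this->value : null;
    }

    public function toFloat(): ?float
    {
        return is_float($this->value) ? $this->value : null;
    }
}

$n = new NumberSketch(123);
echo $n->type(), PHP_EOL;   // int
var_dump($n->toFloat());    // NULL
$n->setValue(123.45);
echo $n->type(), PHP_EOL;   // float
```

The sketch uses the null-returning variant of the typed accessors; the throwing variant would replace the null branch with a thrown TypeError.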
A more complex union class, which shows how a union would be used, would look like this:
class Number {
    types int|float;
    public function __construct(int|float $number) {
        $this->setValue($number);
    }
    public function getInt(): int {
        switch ( $this->type() ) {
            case 'int':
                return $this->toInt();
            case 'float':
                return intval($this->toFloat());
        }
        return 0;
    }
    public function getFloat(): float {
        switch ( $this->type() ) {
            case 'int':
                return 1.0 * $this->toInt();
            case 'float':
                return $this->toFloat();
        }
        return 0.0;
    }
}
Here is an example using an anonymous union class:
function showNumber(new class{types int|float} $number) {
    echo $number->value();
}
And then a shorthand I propose which would be the equivalent of the prior example:
function showNumber2(int|float $number) {
    echo $number->value();
}
These functions would be called like so:
showNumber(123); // Prints 123
showNumber(1.23); // Prints 1.23
showNumber("123"); // Throws a type error
These functions would also accept a matching union type instance instead of automatically creating one when called:
$number = new Number(123);
showNumber($number); // Prints 123
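The call-site behavior described above can be approximated today by widening the parameter type and wrapping bare values by hand. This is a sketch under assumptions: NumberBox and showNumberSketch() are hypothetical names, and under this proposal the wrapping would happen implicitly instead of in the function body.

```php
<?php
declare(strict_types=1);

// Minimal stand-in for a union class; only what this example needs.
final class NumberBox
{
    public function __construct(private int|float $value) {}

    public function value(): int|float
    {
        return $this->value;
    }
}

// Emulates the proposed implicit conversion: a bare int or float is wrapped,
// while an existing NumberBox instance is passed through unchanged.
function showNumberSketch(int|float|NumberBox $number): string
{
    $number = $number instanceof NumberBox ? $number : new NumberBox($number);
    return (string) $number->value();
}

echo showNumberSketch(123), PHP_EOL;                 // 123
echo showNumberSketch(1.23), PHP_EOL;                // 1.23
echo showNumberSketch(new NumberBox(123)), PHP_EOL;  // 123

try {
    // With strict_types enabled a string is rejected, as in the proposal.
    showNumberSketch("123");
} catch (TypeError $e) {
    echo 'TypeError', PHP_EOL;
}
```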
When you call the typed built-in methods you either get the expected type, or null.
echo $number->toInt(); // Prints 123
showNumber($number); // Prints 123
echo $number->toFloat(); // Prints (null) or alternately would throw a type error
echo is_null($number->toFloat()); // Prints 1 (meaning true)
Alternately these would only return the expected type and throw an error if the wrong type is used.
To change the internal value of a union you would pass a mixed value to the instance method setValue():
echo $number->toInt(); // Prints 123
echo $number->type(); // Prints int
$number->setValue(123.45); // Assigns 123.45 to the union's internal value
echo $number->toFloat(); // Prints 123.45
echo $number->type(); // Prints float
This proposal would not need any changes in the handling of return values beyond those already envisioned by Nikita's v2 proposal. These would work as expected:
function Foo(): int|string {
    return 1;
}
$i = Foo();
echo gettype($i); // Prints int
function Foo(): int|string {
    return "abc";
}
$s = Foo();
echo gettype($s); // Prints string
And this would return the union class instance, as expected:
function Foo(Number $number): Number {
    return $number;
}
$n = Foo(new Number(123));
echo gettype($n); // Prints Number
Properties would be definable just like in Union Types v2, or by using the union class name, such as the following:
class Building {
    public Number $squareMeters;
    public function __construct(Number $squareMeters) {
        $this->squareMeters = $squareMeters;
    }
}
class Building2 {
    public int|float $squareMeters;
    public function __construct(int|float $squareMeters) {
        $this->squareMeters = $squareMeters;
    }
}
When a property is assigned a value of one of the unioned types, the value would automatically be wrapped in an instance of the union class:
$building = new Building(2500);
echo gettype($building->squareMeters); // Prints Number
echo $building->squareMeters->type(); // Prints int
$building->squareMeters = 5000.0;
echo gettype($building->squareMeters); // Prints Number
echo $building->squareMeters->type(); // Prints float
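The property behavior shown above can be roughly emulated today with magic accessors. This is a sketch under assumptions: NumberValue and BuildingSketch are hypothetical names, and the __get()/__set() pair stands in for the automatic wrapping that this proposal would perform natively on a property declared as public Number $squareMeters.

```php
<?php
declare(strict_types=1);

// Minimal stand-in for a union class; only what this example needs.
final class NumberValue
{
    public function __construct(private int|float $value) {}

    public function type(): string
    {
        return get_debug_type($this->value);
    }
}

// Hypothetical stand-in for a class with `public Number $squareMeters;`.
final class BuildingSketch
{
    // Private, so reads and writes from outside go through __get()/__set().
    private NumberValue $squareMeters;

    public function __construct(int|float|NumberValue $squareMeters)
    {
        $this->__set('squareMeters', $squareMeters);
    }

    // Wraps bare values in the union class; passes instances through.
    public function __set(string $name, mixed $value): void
    {
        if ($name !== 'squareMeters') {
            throw new Error("Unknown property: $name");
        }
        $this->squareMeters = $value instanceof NumberValue
            ? $value
            : new NumberValue($value);
    }

    public function __get(string $name): mixed
    {
        if ($name !== 'squareMeters') {
            throw new Error("Unknown property: $name");
        }
        return $this->squareMeters;
    }
}

$building = new BuildingSketch(2500);
echo $building->squareMeters->type(), PHP_EOL;  // int
$building->squareMeters = 5000.0;
echo $building->squareMeters->type(), PHP_EOL;  // float
```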
However when instantiating an anonymously declared union class, it would behave just like an anonymous class behaves:
$b2 = new Building2(2500);
echo gettype($b2->squareMeters); // Prints class@anonymous
echo $b2->squareMeters->type(); // Prints int
$b2->squareMeters = 5000.0;
echo gettype($b2->squareMeters); // Prints class@anonymous
echo $b2->squareMeters->type(); // Prints float
An instance of a named union class should be accepted by a function declared to accept an anonymous union when the lists of unioned types are equivalent, e.g.:
$building = new Building(2500);
$b2 = new Building2($building->squareMeters); // Accepts and creates $b2
However, the opposite should not be possible, for type safety:
$b2 = new Building2(2500);
$building = new Building($b2->squareMeters); // Throws a type error.
On the other hand, both of these would be valid:
$b2 = new Building2(2500);
$buildingA = new Building($b2->squareMeters->value()); // Accepts and creates $buildingA
$buildingB = new Building($b2->squareMeters->toInt()); // Accepts and creates $buildingB
When a child class is extended from a union class it is also a union class.
namespace MyApp;
class Number extends \Number {
    public string $decimal_point = '.';
    public string $thousands_sep = ',';
    private mixed $original_type;
    public int $decimal_places;
    public function __construct(int|float|string $value) {
        $this->original_type = $value->type();
        if ($value->type()!=='string') {
            parent::__construct($value);
        } else if (false!==($pos=strpos($value,$this->decimal_point))) {
            $this->decimal_places = strlen($value)-$pos-1;
            parent::__construct(floatval($value));
        } else {
            parent::__construct(intval($value));
        }
    }
    // Nullable, like the other built-in to{Type}() methods.
    public function toString():?string {
        return $this->original_type==='string'
            ? (string)parent::value()
            : null;
    }
    // Calls parent::type() rather than $this->type() to avoid infinite recursion.
    public function type():string {
        return $this->original_type!=='string'
            ? parent::type()
            : 'string';
    }
    // Calls parent::value() rather than $this->value() to avoid infinite recursion.
    public function value():mixed {
        return $this->original_type!=='string'
            ? parent::value()
            : number_format($this->toString(),
                $this->decimal_places,
                $this->decimal_point,
                $this->thousands_sep
            );
    }
}
To be fleshed out assuming the rest of this proposal gains traction.
- Accepting parameters of one of the unioned types from the caller and transforming them to an instance of the union class within the function.
- Providing the type(), value() and setValue() methods as well as the ->to*() methods for the union class without requiring them to be implemented by the class designer.
- Automatically creating a new instance of a union class when:
  a. A value is passed to a function where type1|type2|...|typeN is declared as a parameter type but the full anonymous class was not defined; see showNumber2() above as compared to showNumber().
  b. A value is assigned to a property that has been declared to accept a union.
- Providing an implied parent class so that parent:: can be used to extend the methods that are built in by including the types keyword.
I am sure there are edge cases, but I wanted to get this proposal into the discussion before the train left the station.
If you find any such edge cases please comment below — possibly providing any suggestions you may have — and I will do my best to address them.
This proposal does not contain any backwards incompatible changes as far as I am aware.
Some will (and should) see some similarities between this proposal and interface{} types in GoLang. However, although this proposal was influenced by its author's use of Go interfaces, it is not proposing to copy Go interfaces, as Go and PHP are two significantly different languages.
@mindplay-dk — Thank you for taking the time to explain generics in detail.
However, understanding the concept of generics has never been my problem. I understand them perfectly. I even understood that I needed them back in the late 80s when I was programming in C before I knew they existed in C++.
My concern with generics is reasoning about them in practice. I can probably explain best with an analogy to high school algebra. I can solve an equation with one or two unknowns in my head. But to solve for 3 or more unknowns requires pencil and paper, at least for me.
So when I try to reason about code that uses Java-style generics I feel like I am trying to solve an equation with 4 unknowns in my head, and so I end up having to use pencil and paper to get my head around code, and for me that makes coding a lot more tedious and much less enjoyable.
For some reason the Go-style contracts make so much more sense to me. And although I have yet to program with them I feel like they will be much easier to reason about than code where I have to mentally translate type abstraction in every expression that uses them.
Maybe it is because I find languages that use keywords (like type) rather than symbols easier to reason about. More likely it is because I tend to find structural typing easier to work with than nominal typing, and Go's contracts are more like structural typing than nominal typing: although Go's contracts are in fact named, they represent a set of capabilities, which is what structural typing is about.
The fact that I can think about a function accepting parameters each with a single (pseudo-)type makes it so much easier for me to reason about than having to reason about a function's logic that hinges on an abstraction.
I will readily admit that I have met many other programmers who can maintain more complexity in their head than I can. It is quite possible that they have higher IQs than I do, I don't know. But what I do know is that whenever I am in control of the code I ruthlessly simplify it so that it can be understood without maintaining a lot of details in one's head. And when I have to work with languages that use Java-style generics that infect many of the open-source libraries available, that control of being able to simplify code is taken away from me.
#fwiw