@nikic
Created May 5, 2016 17:44
Variable types
--------------
One of the most important points to understand when dealing with the PHP virtual machine is the three distinct
variable types it uses. In PHP 5, TMPVAR, VAR and CV had very different representations on the VM stack, along with
different ways of accessing them. In PHP 7 they have become very similar in that they share the same storage mechanism.
However, there are important differences in the values they can contain and in their semantics.
CV is short for "compiled variable" and refers to a "real" PHP variable. If a function uses variable `$a`, there will be
a corresponding CV for `$a`.
CVs can have `UNDEF` type, to denote undefined variables. If an UNDEF CV is used in an instruction, it will (in most
cases) throw the well-known "undefined variable" notice. On function entry all non-argument CVs are initialized to be
UNDEF.
CVs are not consumed by instructions, e.g. an instruction `ADD $a, $b` will *not* destroy the values stored in CVs `$a`
and `$b`. Instead, all CVs are destroyed together on scope exit. This also implies that all CVs are "live" for the
entire duration of the function, where "live" here refers to containing a valid value (not live in the data flow sense).
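
The CV rules above can be illustrated with a toy model. This is a Python sketch rather than the engine's C code: `Frame`, `UNDEF`, `read_cv` and `leave` are hypothetical names, the notice is merely collected into a list, and the fallback to a null value on an undefined read is an assumption about the common case.

```python
# Toy model of CV slots; all names here are illustrative, not engine APIs.
UNDEF = object()  # sentinel standing in for the UNDEF type


class Frame:
    def __init__(self, cv_names, args):
        # Argument CVs get their passed values; all other CVs start as UNDEF.
        self.cvs = {name: args.get(name, UNDEF) for name in cv_names}
        self.notices = []

    def read_cv(self, name):
        value = self.cvs[name]
        if value is UNDEF:
            # Assumed common case: emit a notice and continue with null.
            self.notices.append(f"Undefined variable: {name}")
            return None
        return value

    def add(self, left, right):
        # Reading CV operands does not consume them: the slots stay intact.
        return self.read_cv(left) + self.read_cv(right)

    def leave(self):
        # All CVs are destroyed together on scope exit.
        self.cvs.clear()


frame = Frame(cv_names=["a", "b", "c"], args={"a": 1, "b": 2})
total = frame.add("a", "b")       # $a and $b remain usable afterwards
unset_value = frame.read_cv("c")  # $c was never assigned
```

After `add`, both `frame.cvs["a"]` and `frame.cvs["b"]` still hold their values; only `leave()` clears the slots.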
TMPVARs and VARs on the other hand are virtual machine temporaries. They are typically introduced as the result operand
of some operation. For example the code `$a = $b + $c + $d` will result in an opcode sequence similar to the following:
    T0 = ADD $b, $c
    T1 = ADD T0, $d
    ASSIGN $a, T1
TMP/VARs are always defined before they are used and as such cannot hold an UNDEF value. Unlike CVs, these variable types *are*
consumed by the instructions they're used in. In the above example, the second ADD will destroy the value of the T0
operand and T0 must not be used after this point (unless it is written to beforehand). Similarly, the ASSIGN will
consume the value of T1, invalidating T1.
It follows that TMP/VARs are usually very short-lived. In a large number of cases a temporary only lives for the space
of a single instruction. Outside this short liveness interval, the value in the temporary is garbage.
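
The consumption rule can be sketched as a small interpreter for the three-opcode sequence above. This is a Python toy under stated assumptions: the tuple encoding of instructions, the `GARBAGE` sentinel and the `take`/`run` helpers are all hypothetical, and the real VM does not check for stale temporaries at run time.

```python
# Toy interpreter for the three-opcode sequence; names are illustrative only.
GARBAGE = object()  # marks a temporary slot that holds no valid value


def take(op, cvs, tmps):
    """Fetch an operand value, consuming it if it is a temporary."""
    kind, name = op
    if kind == "CV":
        return cvs[name]  # CVs are read in place and stay live
    value = tmps[name]
    if value is GARBAGE:
        raise RuntimeError(f"T{name} used outside its live range")
    tmps[name] = GARBAGE  # TMP/VAR operands are consumed by the read
    return value


def run(ops, cvs, num_tmps):
    """Execute (opcode, result, op1, op2) tuples; return the temporary slots."""
    tmps = [GARBAGE] * num_tmps
    for opcode, result, op1, op2 in ops:
        if opcode == "ADD":
            tmps[result[1]] = take(op1, cvs, tmps) + take(op2, cvs, tmps)
        elif opcode == "ASSIGN":
            cvs[op1[1]] = take(op2, cvs, tmps)
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return tmps


# $a = $b + $c + $d
ops = [
    ("ADD", ("TMP", 0), ("CV", "b"), ("CV", "c")),
    ("ADD", ("TMP", 1), ("TMP", 0), ("CV", "d")),
    ("ASSIGN", None, ("CV", "a"), ("TMP", 1)),
]
cvs = {"a": None, "b": 1, "c": 2, "d": 3}
tmps = run(ops, cvs, num_tmps=2)
```

Once the sequence has run, `$a` holds the sum, the other CVs are untouched, and both temporary slots are back to `GARBAGE`. A sequence that reads T0 twice without rewriting it would raise in this model.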
So what's the difference between TMP and VAR? Not much. The distinction was inherited from PHP 5, where TMPs were VM
stack allocated, while VARs were heap allocated. In PHP 7 all variables are stack allocated. As such, nowadays the main
difference between TMPs and VARs is that only the latter are allowed to contain REFERENCEs (this allows us to elide
DEREFs on TMPs). Furthermore, VARs may hold two types of special values, namely class entries and INDIRECT values. The
latter are used to handle non-trivial assignments.
The following table attempts to summarize the main differences:
       | UNDEF | REF | INDIRECT | Consumed? | Named? |
-------|-------|-----|----------|-----------|--------|
CV     | yes   | yes | no       | no        | yes    |
TMPVAR | no    | no  | no       | yes       | no     |
VAR    | no    | yes | yes      | yes       | no     |
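
For tooling that reasons about operands, the table can be written down as data. The following Python sketch is only one possible encoding: the `PROPERTIES` dictionary and the `may_hold` and `is_consumed` helpers are hypothetical names, not something the engine provides.

```python
# The summary table as data; the layout and helper names are illustrative.
PROPERTIES = {
    "CV":     {"UNDEF": True,  "REF": True,  "INDIRECT": False,
               "consumed": False, "named": True},
    "TMPVAR": {"UNDEF": False, "REF": False, "INDIRECT": False,
               "consumed": True,  "named": False},
    "VAR":    {"UNDEF": False, "REF": True,  "INDIRECT": True,
               "consumed": True,  "named": False},
}


def may_hold(kind, value_type):
    """Whether an operand of `kind` may contain UNDEF, REF or INDIRECT."""
    if value_type not in ("UNDEF", "REF", "INDIRECT"):
        raise KeyError(f"not a column of the table: {value_type}")
    return PROPERTIES[kind][value_type]


def is_consumed(kind):
    """Whether instructions destroy an operand of `kind` when they use it."""
    return PROPERTIES[kind]["consumed"]
```

For example, `may_hold("TMPVAR", "REF")` is false, which is the property that allows DEREFs on TMPs to be elided.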