Skip to content

Instantly share code, notes, and snippets.

@fredemmott
Last active October 10, 2019 21:49
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 1 You must be signed in to fork a gist
  • Save fredemmott/9db26f3d14652a510d56327c956f4a57 to your computer and use it in GitHub Desktop.
Save fredemmott/9db26f3d14652a510d56327c956f4a57 to your computer and use it in GitHub Desktop.
XHP ns token

Migration

These strategies are based on XHP being practically unusable in combination with namespace; assumptions are:

  • all code that uses XHP is in the root namespace
  • all code that defines XHP is in the root namespace

Required

  • temporarily change XHPClassNameToken to mangle differently, so that :foo:bar and \foo\bar (or class \foo\xhp_bar, depending on proposal) are considered equivalent
  • temporarily change allow class :foo:bar to define class \foo\bar {} (or class \foo\xhp_bar)
    • this does not affect resolution of names within the definition (example below)
  • once that's been supported for $n (>=1) releases, migrate with hackast/hhast, and remove old behavior

Resolution example

// TEMPORARY: this defines class \foo\bar or class \foo\xhp_bar
class :foo:bar {
  attribute MyAttr foo; // this references \MyAttr - *NOT* \foo\MyAttr
}

This is likely straightforward to implement, as explicitly declaring class Foo\Bar {} was accidentally supported in the initial release of HackC until explicitly banned.

Optional

  • Provide a flag to entirely preserve the existing "only works in root namespace" behavior and mangling. Deployments with this flag enabled would be unable to combine XHP with namespaces.
  • Never delete the flag to allow class :foo:bar -> class foo\bar {} (or class foo\xhp_bar {} mangling
    • ban this syntax outside of the root namespace
    • removes need for mass migration of existing code, but allows use of namespaced XHP for new use cases
    • not completely transparent: error messages would report class foo\bar instead of foo:bar
    • alternatively, when rendering class names in error messages, render foo:bar if the class is an XHP class; this trades consistency for existing code against consistency for new namespace-using XHP code (e.g. namespace MyFeature { class :foo {} } would be reported as :MyFeature:foo

Conflicting class names

In particular, the HTML classes such as class :p {} vs class P {} (FB: 36)

Potential approaches (likely using a mix):

  • make class names case-sensitive (probably not great from usability to have two classes with the same name except for case though)
  • rename conflicting classes
  • change conflicting classes to namespaces (e.g. P::foo() -> P\foo())
  • namespace codegen'ed classes (e.g. add Thrift\ prefix)
  • add use xhp type and- so <p> instantiates \XHP\HTML\p, but p is a reference to \P (potentially confusing: type name does not match used name; advantage: lets us make things like <a> undefined by default, to more strongly encourage <ui:link> instead at the typechecker level)
  • add concept of default XHP namespace: similar to the above; simpler & shorter autoimport list, does not have ability to discourage usage of specific elements

XHP Namespaces

I am not addressing migration/compatibility in this post. This is addressed in the documents linked in 'previous context'

Previous context

TL;DR:

It is currently impossible to use XHP with namespaces:

  • class :foo extends \:x:element{} is a parse error
  • <\foo:bar> is a parse error (and confusing with </foo:bar>

<\foo\bar>baz</\foo\bar> is the obvious way, but </\foo\bar> and <\foo\bar /> are unreadable and annoying to type. Aim for : to be used as a replacement for \ in XHP contexts.

Remaining problem

How should use type and similar work for XHP? I see this as two strongly coupled questions:

  1. how should XHP names be mangled?
  2. what namespace separator should be used in what contexts? Contexts include:
    • type specifier: function foo(:bar:baz $xhp) (current)
    • xhp constructor calls: <bar:baz> (current)
    • imports: use type :bar:baz; use namespace :foo; use namespace :foo:bar; (proposal 1)

With an example:

namespace MyNS {
  class MyClass {}
  class :myxhpclass {}
}

namespace {
  use type MyNS\MyClass;
  // and what for :myxhpclass?
  use type MyNS\{MyClass, /* ? */ };
}

Current situation

  1. :foo:bar-baz::class is xhp_foo__bar_baz
  2. no namespace separator works correctly with XHP

Preferred proposal: \ always, except for : in XHP constructor calls

  1. :foo:bar-baz::class is invalid as :foo:bar-baz is not a type. an XHP class bar-baz in the foo namespace is foo\bar_baz.
  2. XHP constructor calls use :; all other contexts use \.
    • type specifier: function foo(\bar\baz $xhp)
    • XHP constructor calls: <bar:baz> if use namespace bar is present or bar is relative to the current namespace, <:bar:baz> for relative to the root namespace.
    • imports: use type \bar\baz;, use namespace bar\baz;

Example

namespace foo {
  class MyClass {}
  // This defines `Foo\MyXHPClass` instead of `Foo\xhp_MyXHPClass`
  class :MyXHPClass extends \x\element {}
  // or, making the rule more absolute as proposed in final section:
  xhp class MyXHPClass extends \x\element {}
}

namespace foo\bar {
  class MyOtherClass;
  // Defines `foo\bar\MyOtherXHPClass`
  class :MyOtherXHPClass extends \x\element {}
}

namespace {
  use type foo\MyXHPClass;
  $_ = <MyXHPClass />; // this is fine

  use type :foo:MyXHPClass; // parse error

  use type foo\{MyClass, MyXHPClass};
  $_ = new MyClass(); // fine
  $_ = new <MyXHPClass />; // fine   

  use namespace foo\bar;
  $_ = new bar\MyOtherClass(); // works
  $_ = <bar:MyOtherXHPClass />; // works

  use namespace :foo:bar; // parse error

  // fully-qualified
  $_ = <:foo:MyXHPClass />;
  $_ = <:foo:bar:MyOtherXHPClass />;

  // no problem:
  use type Foo\MyXHPClass;
  function takes_foo_MyXHPClass(MyXHPClass $_);
  function takes_root_MyXCHPClass(\MyXHPClass $_);
  // parse error:
  function takes_root_MyXCHPClass(:MyXHPClass $_);

  // this defines class foo
  // This will conflict with any non-XHP class called foo
  class :foo {}

  // Long term: parse error
  // Short-term: defines class \foo\bar to provide a migration path
  //   This syntax will only be allowed in the root namespace, even
  //   short-term.
  class :foo:bar {}
}

Notes

  • completely unifies XHP and non-XHP Hack namespaces
  • class :foo {} and class foo {} define a class with the same name and collide
  • requires - to be permitted in type names (though may be banned in non-XHP decls)

Alternative roposal 1: : always for XHP, \ always for non-XHP

  1. :foo:bar-baz::class is foo\xhp_bar_baz
  2. anything relating to XHP uses :; this keeps XHP class names seeming like they have another symbol table (i.e. class :foo doesn't conflict with class foo)
    • type specifier: function foo(:bar:baz $xhp)
    • XHP constructor calls: <bar:baz> if use namespace bar is present or bar is relative to the current namespace, <:bar:baz> for relative to the root namespace. </:bar:baz> to close.
    • imports: use type :bar:baz;, use namespace :bar:baz;
namespace foo {
  class MyClass {}
  class :MyXHPClass {}
}

namespace foo\bar {
  class MyOtherClass;
  class :MyOtherXHPClass {}
}

namespace {
  use type foo\MyXHPClass;
  $_ = <MyXHPClass />; // tries to instantiate \xhp_bar_baz; not affected by non-XHP `use type`

  use type :foo:MyXHPClass;
  $_ = <MyXHPClass />; // \foo\xhp_MyXHPClass

  use type foo\{MyClass, MyXHPClass}; // second class is not defined.

  use namespace foo\bar;
  new bar\MyOtherClass(); // works
  $_ = <bar:MyOtherXHPClass />; // unaffected by use ns; this is \bar\xhp_MyOtherXHPClass;

  use namespace :foo:bar;
  <bar:MyOtherXHPClass>; // \foo\bar\xhp_MyOtherXHPClass

  // fully-qualified
  $_ = <:foo:MyXHPClass />;
  $_ = <:foo:bar:MyOtherXHPClass />;


  // Problem?
  use type :foo:MyXHPClass;
  function takes_foo_MyXHPClass(:MyXHPClass $_); // ?
  function takes_root_MyXCHPClass(::MyXHPClass $_); // ?
}

Notes

  • through mangling, effectively keeps XHP and non-XHP classes in non-colliding name tables (unless you call non-XHP classes xhp_foo...)
  • using the same symbol to specify 'this is XHP class' and to qualify names leads to confusing cases like the last few lines of the example

Alternative proposal 2: interchangeable always

Allow : in all contexts where \ is a namespace separator.

  1. :foo:bar-baz::class is \foo\bar_baz
  2. anything relating to XHP uses :; this keeps XHP class names seeming like they have another symbol table (i.e. class :foo doesn't conflict with class foo)
    • type specifier: function foo(:bar:baz $xhp) or function foo(\bar\baz $xhp) or function foo(:bar\baz $xhp)
    • XHP constructor calls: <bar:baz> or <bar\baz
    • imports: use type :bar:baz;, use namespace :bar:baz;, use type \bar\baz etc.

Example

use type :foo:bar;
new bar(); // creates a new \foo\bar
$_ = <bar />; // also creates a new \foo\bar

 // no problem:
use type :foo:MyXHPClass;
function takes_foo_MyXHPClass(MyXHPClass $_);
function takes_root_MyXCHPClass(\MyXHPClass $_);
function takes_root_MyXCHPClass(:MyXHPClass $_); // same thing

Notes

  • completely unified XHP and non-XHP namespaces
  • class :foo {} and class foo {} define a class with the same name and collide
  • requires - to be permitted in type names (though may be banned in non-XHP decls)
  • syntax ambiguities:
    • return types: function foo(): :bar:baz; this already exists, but could become more prevalent if not limited to returning XHP, which is rare
    • ternary operators
    • goto labels (would likely need removing from the runtime)
    • <<file::SomeAttr> attributes
  • unknowns: other uses of : token
  • ambiguities/other uses would mean hack would need to be whitespace-sensitive in more places
  • allows users to decide between preferred proposal or alt 3… or whatever else they want and enforce with linters

Alternative proposal 3: \ in XHP

  1. :foo:bar-baz::class is invalid as :foo:bar-baz is not a type. an XHP class bar-baz in the foo namespace is \foo\bar_baz.
  2. \ everyehere:
    • type specifier: function foo(\bar\baz $xhp)
    • XHP constructor calls: <bar\baz> if use namespace bar is present or bar is relative to the current namespace, <\bar\baz> for relative to the root namespace.
    • imports: use type \bar\baz;, use namespace bar\baz;

Notes

  • largely equivalent to the preferred proposal:
    • only difference is the token used in XHP constructor calls is not dicated
    • this proposal is more consistent, but has more 'eww’:
  • the main argument against this is "eww </\foo\bar> and <\foo\bar /> are icky". That said, everyone strongly agrees with that :p
  • completely unifies XHP and non-XHP Hack namespaces
  • class :foo {} and class foo {} define a class with the same name and collide
  • requires - to be permitted in type names (though may be banned in non-XHP decls)

Related question: replace class :foo with xhp class foo

Every proposal except for (1) require that a normal class called foo and an xhp class called foo in the same namespace collide - i.e. they require that use type foo\bar; works regardless of if bar if an XHP or non-XHP class.

It is likely to be confusing that class :foo and class foo collide; this is much clearer for xhp class foo {} and class foo {}.

Some marker is needed at the declaration, for example:

  • to autogenerate the constructor
  • to permit the use of attribute and children declarations
  • to permit the use of - in type names (or mangle it to _ like we currently do)

Suggested path

  • xhp class foo {}
  • Preferred proposal: : for XHP constructor calls, \ everywhere else.
  • these can/should be independently landed changes, but should aim for the period of time when we have one without the other to be as small as possible

Rationale

  • xhp class rationale covered in previous section
  • alternative proposal 3 (\ everywhere) is largely equivalent to the preferred proposal (: at XHP constructor calls, \ everywhere else), but less readable and with more churn needed. Drop.
  • alternative proposal 1 has syntax ambiguities in type names. Drop.
  • This leaves alternative proposal 2 (interchangeable tokens) and and the preferred proposal. Go for preferred proposal as:
    • we generally prefer having one way to do things; if there are multiple, we'd still need to decide on one approach to use consistently in FB codebases and docs.hhvm.com. Likely that whatever we use there will be the de-facto standard.
    • smaller chance of side-effects
@jjergus
Copy link

jjergus commented Oct 8, 2019

This is also a valid option I think:

Proposal 5: Kill \ and use : for both XHP and non-XHP classes and namespaces everywhere

We almost certainly wouldn't want to go from the current state to 5 immediately, but if we agreed 5 is the best possible final state (IMHO yes but there are certainly valid arguments against it), then any option that gets us closer to that would be preferred. My suggested path:

  • transition from current state --> 3 right now (now : and \ express the same thing but we have rules about which one to use where)
  • 3 --> 2 later (now : and \ can be used interchangeably, which is a good migration step)
  • finally 2 --> 5 (now \ is gone and : is used everywhere, no more ambiguity, no more arbitrary rules, victory!)

we generally prefer having one way to do things

I agree in general, but my counter-argument here would be that IMHO it's better to say : and \ are equivalent and it's a matter of convention which one you decide to use in which context (proposal 2), than to say : and \ express the same thing but you have to use : in these contexts and \ in these other contexts (proposals 1 and 3).

@fredemmott
Copy link
Author

I'd be happy to keep that on the future pile, but given that:

  • (3) and (2) are both intermediate steps
  • we'd be extremely unlikely to go down that path until we're generally looking at non-semantic syntax changes in Hack
  • it affects namespaces in hack as a whole, so is likely to be affected by the current work on packages/environments

... it seems unlikely that it would have any concrete difference in what is implemented for 1-2 years, so seems better to leave as a separate proposal for later.

@fredemmott
Copy link
Author

Edited to add:

  • (4) vs (3) is small inconsistency vs 'eww'
  • (2) allows users to choose (3) or (4) or whatever they want and enforce via linters

@fredemmott
Copy link
Author

if we do (2), we'll effectively choose (3) vs (4) for FB codebases and docs.hhvm.com and xhp-lib; whatever we choose there will likely be the de-facto standard. I don't see the choice really existing once we've done that, and it carries a higher risk of migration pain

@fredemmott
Copy link
Author

fredemmott commented Oct 8, 2019

There is another:

Fix all the parse errors, change nothing else.

i.e.

namespace html {
  class :element extends \:x:element {}
  // defines \html\xhp_body extending \html\xhp_element
  class :body extends :element {
  }
}

namespace herpderp{
  $_ = <\html:body />;
  use namespace html;
  $x = <html:body />; // error: references herpderp\xhp_html__body;
  $x = <html\:body />; // works
  use type \html\:body;
  $x = <body />; // works;
}

@fredemmott
Copy link
Author

Just putting that for completeness; unless someone wants to advocate for it, I don't plan on fleshing it out.

@jjergus
Copy link

jjergus commented Oct 8, 2019

if we do (2), we'll effectively choose (3) vs (4) for FB codebases and docs.hhvm.com and xhp-lib; whatever we choose there will likely be the de-facto standard. I don't see the choice really existing once we've done that, and it carries a higher risk of migration pain

I think many people could reasonably choose (5) use : everywhere, or at least in more places than we choose for FB/xhp-lib (e.g. typehints). IMHO it would be really nice to have this option.

It might actually be pretty reasonable for many people to adopt some variation of (1) as their convention (i.e. use : as much as possible when dealing with XHP, not really having to worry about the ugly edge cases of (1) because any syntax works and does the same thing).

I doubt (4) (use \ everywhere) would be particularly popular so really the question boils down to "do we want to restrict where exactly : can be used or allow it everywhere?"

@fredemmott
Copy link
Author

fredemmott commented Oct 8, 2019

For reference, HHAST has found these places using token:: (the : token - all tokens are prefixed with token:):

  • anonymous functions (probably return type)
  • case and default statements
  • closures (probably return type)
  • ternary operator
  • enums (enum foo : int as int)
  • file attributes (<<file:foo>>)
  • function decl headers (return type)
  • goto labels: I don't think these can reasonably be disambiguated. Might be a blocker
  • lambdas (return type)

To disambiguate with ::, this will mean we'll be whitespace-sensitive in a lot more places, and we'll have harder-to-see bugs, even if the usage of : as a namespace separator is banned in a codebase, just due to it accidentally being used where :: is required.

PHP also has a relatively common pattern of class Foo {} and namespace Foo { /* implementaiton details of \Foo here */ }, but I don't think that's really carried over to the hack ecosystem

@fredemmott
Copy link
Author

<<file:foo>> currently can be solved by whitespace, but we would also need a solution for future @file:foo syntax

@fredemmott
Copy link
Author

Edited to:

  • move preferred proposal to top
  • renumbered others to 'alternative proposal '
  • add more examples covering the general problem, rather than the specific problem

@fredemmott
Copy link
Author

namespace {
  class :foo:bar {} // \foo\bar
}
namespace herp {
  class :herp:derp {} // \herp\herp\derp
}
namespace {
  class :herp:herp:derp {} // conflicts with previous class name
}

@fredemmott
Copy link
Author

fredemmott commented Oct 10, 2019

namespace {
   class :foo:bar extends :x:element {} // \x\element
}
namespace foo {
  class :bar extends \x\element {} // ?
  class :bar extends ::x:element {} // ?
}

@fredemmott
Copy link
Author

fredemmott commented Oct 10, 2019

namespace x {
  class :element extends base {}
  class :frag extends base {}
}
namespace foo {
  class :bar-baz {}
  class :baz extends bar-baz {}
  function foo(bar-baz $y): void {}
}

@dlreeves
Copy link

$cond ? $x as bar:foo : baz

@fredemmott
Copy link
Author

- class :foo:bar extends :x:element {}
+ xhp class foo:bar extends \x\element {}

@fredemmott
Copy link
Author

- class :foo:bar-baz extends :x:element {}
+ xhp class foo:bar_baz extends \x\element {}

@ssandler
Copy link

Discussion:

$cond ? $x as bar:foo : baz

Would be a parse error, would need to be:

$cond ? $x as bar\foo : baz

or, if currently in another namespace

$cond ? $x as \bar\foo : baz

Also confirming, we want this to work:

namespace foo;
use type \x\{element, frag};

xhp class bar extends element {
  public function render(): XHPRoot {
    return <frag />;
  }
}

@fredemmott
Copy link
Author

codemod practically: XHP_class_name token is seen at:

  • classish_declaration.classish_name
  • member_selection_expression.member_name (?)
  • postfix_unary_expression.postfix_unary_operand (?)
  • xhp_children_*
  • places that also accept a qualified name

(based on hhast data)

@fredemmott
Copy link
Author

member_selection_expression.member_name:

class :foo {
  attribute string myattr;
  public function bar(): void {
    var_dump($this->:myattr);
  }
}

$this->:myattr is considered an XHP_class_name in a member selection expression, this would need changing

@fredemmott
Copy link
Author

postfix_unary_expression.postfix_unary_operand:

class :foo {
  children (:bar*);
}

^ :bar* is a postfix unary expression with :bar as the operand and '*' as the operator

These feel like parser quirks that can be fixed rather than blockers

@fredemmott
Copy link
Author

need to investigate if children -> \ is a reasonable codemod or not, otherwise we have the same :foo:bar namespace\foo\bar vs \foo\bar problem

@fredemmott
Copy link
Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment