@ncoghlan
Last active February 16, 2018 02:10
PEP 505 alternative: the "?it" symbolic reference

Using ?it as a statement local symbolic reference

Core concept

  • (?it=expr) is a new atomic expression for an "it reference binding"
  • subsequent subexpressions (in execution order) can reference the bound subexpression using ?it (an "it reference")
  • ?it is reset between statements, including before entering the suite within a compound statement (if you want a persistent binding, use a named variable)
  • for conditionals, put the reference binding in the conditional, as that gets executed first
  • to avoid ambiguity, especially in function calls (where it could be confused with keyword argument syntax), the parentheses around reference bindings are always required

Examples

None-aware attribute access:

value = ?it.strip()[4:].upper() if (?it=var1) is not None else None

None-aware subscript access:

value = ?it[4:].upper() if (?it=var1) is not None else None
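
For comparison, the two None-aware examples above can be approximated in runnable Python 3.8+ using a PEP 572 assignment expression. This is only a sketch under stated assumptions: `it` here is an ordinary local name rather than the statement-local ?it proposed in this gist, and `load_text` and `calls` are hypothetical helpers added to count evaluations.

```python
calls = []  # hypothetical: records every evaluation of the bound expression


def load_text(value):
    """Hypothetical stand-in for an expensive expression; logs each call."""
    calls.append(value)
    return value


def none_aware_method_chain(value):
    # Approximates: ?it.strip()[4:].upper() if (?it=...) is not None else None
    return it.strip()[4:].upper() if (it := load_text(value)) is not None else None


def none_aware_subscript(value):
    # Approximates: ?it[4:].upper() if (?it=...) is not None else None
    return it[4:].upper() if (it := load_text(value)) is not None else None


method_result = none_aware_method_chain("  abcdefgh  ")
subscript_result = none_aware_subscript("abcdefgh")
missing_result = none_aware_subscript(None)
```

The condition of a conditional expression runs first, so the name is bound before the "then" branch reads it, and the bound expression is evaluated once per call.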

None-coalescence:

value = ?it if (?it=var1) is not None else ?it if (?it=var2) is not None else var3

NaN-coalescence:

value = ?it if not math.isnan((?it=var1)) else ?it if not math.isnan((?it=var2)) else var3
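
A rough runnable approximation of both coalescing chains, again assuming Python 3.8+ assignment expressions stand in for the proposed syntax (the function names are hypothetical, and `it` is a plain local name here):

```python
import math


def first_not_none(var1, var2, var3):
    # Approximates the None-coalescence example: each condition rebinds `it`
    # before the matching "then" branch reads it.
    return it if (it := var1) is not None else it if (it := var2) is not None else var3


def first_not_nan(var1, var2, var3):
    # Approximates the NaN-coalescence example with the same rebinding pattern.
    return it if not math.isnan((it := var1)) else it if not math.isnan((it := var2)) else var3
```

Note that `first_not_none(None, 0, 5)` yields `0`, which a plain `or` chain would skip because `0` is falsy.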

Conditional function call:

value = ?it() if (?it=calculate) is not None else default

Avoiding repeated evaluation of a comprehension filter condition:

filtered_values = [?it for x in keys if (?it=get_value(x)) is not None]
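
The same filter can be sketched in runnable Python 3.8+ with an assignment expression. The `table`, `keys`, and call-counting `get_value` below are hypothetical sample data, and unlike ?it the name `it` stays bound in the enclosing scope after the comprehension finishes.

```python
table = {"a": 1, "b": None, "c": 3}  # hypothetical sample data
keys = ["a", "b", "c", "d"]
calls = []


def get_value(key):
    """Hypothetical lookup that returns None for missing or empty entries."""
    calls.append(key)
    return table.get(key)


# get_value runs once per key; the filter and the result share the bound value.
filtered_values = [it for x in keys if (it := get_value(x)) is not None]
```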

Avoiding repeated evaluation for range and slice bounds:

range((?it=calculate_start()), ?it+10)
data[(?it=calculate_start()):?it+10]

Avoiding repeated evaluation in chained comparisons:

value if (?it=lower_bound()) <= value < ?it+tolerance else 0
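
The range, slice, and chained comparison cases all rely on left-to-right evaluation, so the binding runs before the later reference. A minimal sketch using Python 3.8+ assignment expressions as a stand-in for the proposed syntax (all helper names and values are hypothetical):

```python
calls = []


def calculate_start():
    """Hypothetical expensive calculation; logs each call."""
    calls.append("start")
    return 3


data = list(range(20))

# The first argument / lower bound is evaluated before the second one.
window = list(range((it := calculate_start()), it + 10))
chunk = data[(it := calculate_start()):it + 10]

tolerance = 5


def lower_bound():
    return 10


def within_tolerance(value):
    # The left operand of the chain is bound before the right operand reads it.
    return value if (it := lower_bound()) <= value < it + tolerance else 0
```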

Avoiding repeated evaluation in an f-string:

print(f"{(?it=get_value())!r} is printed in pure ASCII as {?it!a} and in Unicode as {?it}")
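
Replacement fields in an f-string are evaluated left to right, so a binding in the first field is visible to the later ones. A runnable approximation with a Python 3.8+ assignment expression (the `get_value` helper and its return value are hypothetical):

```python
calls = []


def get_value():
    """Hypothetical value source; logs each call."""
    calls.append("get_value")
    return "caf\u00e9"


# The parenthesised binding is evaluated once; later fields reuse the name.
message = f"{(it := get_value())!r} is printed in pure ASCII as {it!a} and in Unicode as {it}"
```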

A possible future extension would then be to pursue PEP 3150, treating the nested namespace as an it reference binding, giving:

sorted_data = sorted(data, key=?it.sort_key) given ?it=:
    def sort_key(item):
        return item.attr1, item.attr2

(A potential added bonus of that spelling is that it may be possible to make "given ?it=:" the syntactic keyword introducing the suite, allowing "given" itself to continue to be used as a variable name)

Alternative spellings considered

  • (?=expr) (binding) and ? (reference): cryptic to read, no obvious pronunciation
  • (?expr) (binding) and ?? (reference): cryptic to read, no obvious pronunciation
  • (that=expr) (binding) and that (reference): looks too much like function call keyword arguments and ordinary variable references
  • (?that=expr) (binding) and ?that (reference): "that" was the first pronoun considered, but the proposal switched to "it" to make the boilerplate lighter
@ncoghlan

ncoghlan commented Dec 13, 2017

Could potentially be extended to if & while statements through use of an "as" clause:

if (?it=pattern.match(data)) is not None as matched:
   ...

while (?it=pattern.match(data)) is not None as matched:
   ...

@ncoghlan

ncoghlan commented Dec 13, 2017

Also provides a lower level primitive to explain "and", "or", and comparison chaining:

# Explaining  "result = expr1 or expr2 or expr3"
result = ?it if (?it=expr1) else ?it if (?it=expr2) else expr3

# Explaining  "result = expr1 and expr2 and expr3"
result = ?it if not (?it=expr1) else ?it if not (?it=expr2) else expr3  

# Explaining  "result = a < calculation() < c"
result = (a < (?it=calculation())) and (?it < c)
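
These expansions can be checked against the built-in operators in runnable Python 3.8+, using assignment expressions in place of the proposed syntax. This is a sketch with hypothetical function names and sample values; the operands are passed in already evaluated, so it checks which operand is returned rather than short-circuit laziness.

```python
import itertools


def or_expanded(expr1, expr2, expr3):
    return it if (it := expr1) else it if (it := expr2) else expr3


def and_expanded(expr1, expr2, expr3):
    return it if not (it := expr1) else it if not (it := expr2) else expr3


def chain_expanded(a, calculation, c):
    # calculation is a callable so that repeated evaluation can be counted.
    return (a < (it := calculation())) and (it < c)


samples = [0, 1, "", "x", None, [], [0]]

# Collect every combination where an expansion returns a different object
# than the corresponding built-in operator chain.
mismatches = [
    combo
    for combo in itertools.product(samples, repeat=3)
    if or_expanded(*combo) is not (combo[0] or combo[1] or combo[2])
    or and_expanded(*combo) is not (combo[0] and combo[1] and combo[2])
]

calls = []


def calculation():
    calls.append("calculation")
    return 5


chained = chain_expanded(1, calculation, 10)
```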

@ncoghlan

ncoghlan commented Jan 8, 2018

The original idea used ?? as the symbolic reference and ? as the reference binding operator. This made the references a bit too heavy visually, and the latter syntax didn't strongly suggest a binding operation.

So I modified it to spell the symbolic reference as a single ?, and the binding operation as ?=.

I also updated the binding operation to be a new atomic element with required parentheses, as otherwise there were too many cases where the meaning would be ambiguous to the reader.

@ncoghlan

New revision that changes the proposed spelling to include "that" as part of the syntax, rather than having it be symbolic only.

@freakboy3742

I like this a lot more than (?=)/?. I get why the pure form of "that" is problematic, but including an actual word in the grammar aids parsing by humans. The ? character still strikes me as slightly odd, because it has no analog with other Python grammar.

If I may throw one more line of spelling suggestions into the pile: {}= and {}. This allows for some consistency with f-strings - the {} spelling is familiar, although the use in non-string grammar is novel.

Following the f-string analog, it could be expanded to include an explicit 'that' name: {that}= and {that}

The grammar symbol could be literally {that}; or, the name provided could be used to allow multiple "that"s in a compound statement:

value = {that} if ({that}=var1) is not None else {tother} if ({tother}=var2) is not None else var3

@ncoghlan

{that} and similar variants run into the problem that they're already valid syntax for a single-element set.
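
A quick check of that clash in current Python (the variable names are only illustrative):

```python
that = "value"

single_element_set = {that}  # already means "a set containing that"
empty_braces = {}            # and bare braces already mean "an empty dict"
```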

While the proposed use of ? mostly stems from the use of ??, ?., and ?[] in PEP 505, my other reasons for considering it are:

  • it reminds me of the use of "???" as a general placeholder for "figure this bit out later" in docs and pseudo code, which is relevant for the conditional use cases (where the "then" branch appears before the condition in the code, even though it's executed after).
  • allowing binding within an expression is genuinely new, so I'm OK with the idea of spending a symbol on it

@freakboy3742

/me facepalms - of course it will clash with set notation. LCA has clearly taken its toll on my brain :-)

I also see your point about this being a reasonable analog of "???". My objection is that Python has historically avoided the appearance of "line noise" in its syntax. While I'm not sure I have any other suggestions, there is a part inside me that would rather see a new keyword than a symbol.

@ncoghlan

I realised that if we're going to use a pronoun to improve readability, "it" works just as well as "that", and is not only half the number of characters, but also just generally much lighter as boilerplate (since "h" and "a" are pretty visually heavy characters).

@ncoghlan

I also added some new examples:

  • None-aware attribute access and subscripting from PEP 505
  • allowing f-strings to re-use a field definition

@ncoghlan

ncoghlan commented Feb 15, 2018

(Using this as a convenient place to record a few key concerns with PEP 505)

During the pre-3.7-beta discussions, Serhiy noted that all of the languages that include a "??" operator use "c ? a : b" as the syntax for their conditional expressions, making "c ?? b" a more natural shorthand for "c ? c : b".

After coming up with this alternative proposal, I also realised those languages share another characteristic: their "or" operator equivalents (typically ||) are defined as returning a boolean result, rather than as propagating the type of the operands.

Python is different: our "or" already propagates the type of the operands, and you have to wrap it in bool (either implicitly or explicitly) to coerce it to a boolean value.

The other big difference is that the languages with ?? by and large don't treat empty containers as being false: they either don't allow containers to be implicitly coerced to boolean values at all, or else they treat them as opaque references (where a nil or NULL is false, but even an empty container is true). (Note: I need to double check this and make sure it's actually true for all the languages mentioned in the PEP)

This meant that PEP 505 ended up using the "or" vs "??" distinction to try to define a second form of implicit coercion to bool (one based on "obj is not None" rather than "bool(obj)"), leading to the extra complexity of PEP 532's "circuit breaking protocol" as a way to make that less arbitrary.
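
The difference between the two checks shows up in current Python as follows. This is a small illustrative sketch with hypothetical helper names: "or" returns one of its operands based on truthiness, while a None-based check only replaces None.

```python
def coalesce_with_or(value, default):
    # Truthiness-based: any falsy value (0, "", [], None) is replaced.
    return value or default


def coalesce_on_none(value, default):
    # Identity-based: only None is replaced, mirroring "obj is not None".
    return value if value is not None else default


zero_via_or = coalesce_with_or(0, 10)
zero_via_none_check = coalesce_on_none(0, 10)
empty_via_or = coalesce_with_or([], ["fallback"])
empty_via_none_check = coalesce_on_none([], ["fallback"])

# "or" hands back an operand unchanged rather than a bool.
propagated = "" or "fallback"
```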
