July TC39 meeting notes, day 2

Brendan Eich
Thu Aug 4 14:29:51 PDT 2011
https://mail.mozilla.org/pipermail/es-discuss/2011-August/016188.html
A week late, I missed some of the morning due to a conflict. Thanks to Alex Russell for whiteboard photos. Others in attendance, please fill in and correct as needed. Thanks.

== Overview of initial working draft for 6th edition and discuss work flow for developing 6th edition draft ==

Allen presented the draft 6th edition and how best to develop it:

http://wiki.ecmascript.org/doku.php?id=harmony:specification_drafts

The phrase "extended code" as a way of specifying semantics for Harmony above and beyond "strict mode code" received some discussion (I missed most of it). In the end no one had a better term than "extended".

Allen also presented the use of cover grammars ("Supplemental Syntax") to specify destructuring assignment and possibly other syntactic extensions. The problem here is that an LR(1) grammar cannot distinguish an object or array literal from a destructuring pattern of the same form, until parsing reaches the '=' after the pattern. GLR or ordered choice top-down parsing techniques can cope, but LR(1) and therefore LL(1) cannot -- such an LR(1) grammar is ambiguous.

Note that ES1-5 cope with the existing ambiguity, where x.y.z and x.y.z = w both start with a member expression, by pushing the ambiguity off to the semantics. They create a spec-internal Reference type, which remembers the base object (what x.y evaluated to) and property name ('z'). The value is fetched (GetValue) only if the x.y.z expression is an rvalue; otherwise the Reference is used as an lvalue to assign to the named property (PutValue).
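The Reference mechanism can be sketched in JavaScript itself. This is only a rough illustration: makeReference, getValue, and putValue are hypothetical names standing in for the spec-internal Reference type and the GetValue/PutValue operations, and they ignore everything but the base object and property name.

```javascript
// Hypothetical stand-in for the spec-internal Reference type:
// it remembers the base object and the property name, nothing more.
function makeReference(base, name) {
  return { base: base, name: name };
}

// Stand-in for GetValue: used when the expression is an rvalue.
function getValue(ref) {
  return ref.base[ref.name];
}

// Stand-in for PutValue: used when the expression is an assignment target.
function putValue(ref, value) {
  ref.base[ref.name] = value;
}

var x = { y: { z: 1 } };

// Both x.y.z and x.y.z = w begin by evaluating x.y and forming a Reference.
var ref = makeReference(x.y, "z");

var before = getValue(ref); // rvalue use, as in plain x.y.z
putValue(ref, 2);           // lvalue use, as in x.y.z = 2
var after = x.y.z;
```

The decision between reading and writing is made only after the Reference exists, which is why the grammar itself does not have to distinguish the two cases.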

It is not feasible to compute a "Reference tree" or "minimal AST" for whole, arbitrarily large literal patterns in order to defer evaluation until an assignment operator is parsed and we know the pattern is a destructuring lvalue rather than an object/array literal rvalue (after which GetValue or PutValue would process the tree according to its rvalue or lvalue nature).

This is due to the generality of the PropertyAssignment and ElementList productions' AssignmentExpressions, which may embed function expressions and thus most of the grammar. We do not want to add an explicit and nearly-complete parse tree or AST to the spec.

The committee seemed to agree that the cover grammar approach seems like the best technique.
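As a loose illustration of the cover grammar idea, here is a sketch of how an implementation (not the spec) might parse the covering form as a literal and reinterpret it as a pattern only once '=' is seen. The node shapes and the toPattern helper are hypothetical and are not taken from the draft.

```javascript
// Hypothetical node shapes; these are not the spec's grammar productions.
function toPattern(node) {
  switch (node.type) {
    case "Identifier":
      return node; // a plain name is a valid assignment target
    case "ArrayLiteral":
      return { type: "ArrayPattern", elements: node.elements.map(toPattern) };
    case "ObjectLiteral":
      return {
        type: "ObjectPattern",
        properties: node.properties.map(function (p) {
          return { key: p.key, value: toPattern(p.value) };
        })
      };
    default:
      // e.g. a number literal: fine inside a literal, not inside a pattern
      throw new SyntaxError("Invalid destructuring target: " + node.type);
  }
}

// Parsed form of [a, {b: c}] before the parser has seen what follows it.
var literal = {
  type: "ArrayLiteral",
  elements: [
    { type: "Identifier", name: "a" },
    {
      type: "ObjectLiteral",
      properties: [{ key: "b", value: { type: "Identifier", name: "c" } }]
    }
  ]
};

// On reaching '=', the covered text is reinterpreted as a pattern.
var pattern = toPattern(literal);

// [a, 42] is acceptable as a literal but rejected once it must be a pattern.
var rejected = false;
try {
  toPattern({
    type: "ArrayLiteral",
    elements: [literal.elements[0], { type: "NumberLiteral", value: 42 }]
  });
} catch (e) {
  rejected = e instanceof SyntaxError;
}
```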

== Review/resolve open issues and change requests for 6th edition ==

https://bugs.ecmascript.org/buglist.cgi?order=Importance&list_id=384&field0-0-0=flagtypes.name&resolution=---&query_format=advanced&type0-0-0=substring&value0-0-0=TC39&product=Draft%20for%206th%20Edition

(Bug rows from the above query follow, starting with bug numbers.)

145 nor Normal All allen at wirfs-brock.com CONF --- eliminate uint32 length restriction the the length of array objects.
146 nor Normal All allen at wirfs-brock.com CONF --- Array generic array methods should not ToUint32 covert the length of non-generic arrays

We deferred these, agreeing that they seem worth trying to make as relaxations of existing index/length semantics for arrays, to align with String and avoid bogus uint32-domain work that cannot handle the "overflow" case of length == 2^32. They will change edge-case behavior. The changes may break only testsuites, but you never know.
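The "overflow" case can be observed in current engines. In this small sketch, the unsigned right shift is used as a convenient way to apply ToUint32; the variable names are just for illustration.

```javascript
// ToUint32 can be observed through the unsigned right shift operator.
var TWO_32 = Math.pow(2, 32);

var wrapped = TWO_32 >>> 0;         // 2^32 wraps around to 0
var maxLength = (TWO_32 - 1) >>> 0; // 2^32 - 1 is the largest uint32

// Array lengths are capped at 2^32 - 1, so a length of 2^32 is rejected.
var threw = false;
try {
  new Array(TWO_32);
} catch (e) {
  threw = e instanceof RangeError;
}
```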

178 nor Normal All allen at wirfs-brock.com CONF --- Must settle scoping details for block-scoped bindings

Much discussion here. The issue is whether let and const bindings hoist to block top, or start a new implicit scope (the let* or, let's call it, C++ rule). The prior work was nicely diagrammed by Waldemar in:

https://mail.mozilla.org/pipermail/es-discuss/2008-October/007807.html

Quoting from Waldemar's message (note the future-proofing for guards):

--- begin quote ---

There are four ways to do this:
A1. Lexical dead zone.  References textually prior to a definition in the same block are an error.
A2. Lexical window.  References textually prior to a definition in the same block go to outer scope.
B1. Temporal dead zone.  References temporally prior to a definition in the same block are an error.
B2. Temporal window.  References temporally prior to a definition in the same block go to outer scope.

Let's take a look at an example:

let x = "outer";
function g() {return "outer"}

{
  g();
  function f() { ... x ... g ... g() ... }
  f();
  var t = some_runtime_type;
  const x:t = "inner";
  function g() { ... x ... }
  g();
  f();
}

B2 is bad because then the x inside g would sometimes refer to "outer" and sometimes to "inner".

A1 and A2 introduce extra complexity but doesn't solve the problem.  You'd need to come up with a value for x to use in the very first call to g().  Furthermore, for A2 whether the window occurred or not would also depend on whether something was a function or not; users would be surprised that x shows through the window inside f but g doesn't.

That leaves B1, which matches the semantic model (we need to avoid referencing variables before we know their types and before we know the values of constants).

--- end quote ---

In the September 2010 meeting, however, we took a wrong turn (my fault for suggesting it, but in my defense, just about everyone did prefer it -- we all dislike hoisting!) away from hoisted let and const bindings, seemingly achieving consensus for the C++ rule.

Allen, it turned out, did not agree, and he was right. Mixing non-hoisting (the C++ rule) with hoisting (function in block must hoist, for mutual recursion "letrec" use-cases and to match how function declarations at body/program level hoist) does not work. In the example above, either g's use of x refers to an outer x for the first call to g() in the block but not for the second (and variously for the indirect calls via f()) -- dynamic scope! -- or else the uses before |const x|'s C++-style implicit scope has opened must be errors (early or not), which is indistinguishable from hoisting.
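The "letrec" use-case for hoisting functions in blocks looks roughly like the following. This sketch assumes an engine with block-scoped function declarations in strict mode, as later standardized in ES2015; demo, isEven, and isOdd are illustrative names.

```javascript
function demo() {
  "use strict";
  var result;
  {
    // Both functions are usable here, before either declaration is reached,
    // and each can call the other.
    result = isEven(10);

    function isEven(n) { return n === 0 ? true : isOdd(n - 1); }
    function isOdd(n) { return n === 0 ? false : isEven(n - 1); }
  }
  return result;
}

var hoisted = demo();
```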

So at last week's meeting, we finally agreed to the earlier rules: all block-scoped bindings hoist to top of block, with a temporal dead zone for use of let and const before *initialization*.

The initialization point is also important. Some folks wondered if we could not preserve var's relative simplicity: var x = 42; is really var x; x = 42, and then the var hoists (this makes for insanity within 'with', which recurs with 'let' in block vs. 'var' of same name in inner block -- IIRC we agreed to make such vars that hoist past same-named let bindings be early errors).

With var, the initialization is just an assignment expression. A name use before that assignment expression has been evaluated results in the default undefined value of the var, assuming it was fresh. There is no read and write barrier requirement, as there is (in general, due to closures) for the temporal dead zone semantics.
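A minimal illustration of the var behavior described here (readBeforeAssign is just an illustrative name):

```javascript
function readBeforeAssign() {
  var seen = x; // no error: the hoisted var already exists and holds undefined
  var x = 42;   // behaves like a hoisted "var x;" plus the assignment "x = 42" here
  return [seen, x];
}

var pair = readBeforeAssign();
```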

But if we try to treat let like var, then let and const diverge. We cannot treat const like var and allow any assignment as "initialization", and we must forbid assignments to const bindings -- only the mandatory initializer in the declaration can initialize. Trying to allow the "first assignment to a hoisted const" to win quickly leads to two or more values for a single const binding:

{
  x = 12;
  if (y) return x;
  const x = 3;
  ...
}

The situation with let is constrained even ignoring const. Suppose we treat let like var, but hoisted to block top instead of body/program top, with use before set reading undefined, or in an alternative model that differs from var per temporal dead zone, throwing. So:

{
  print(x);
  x = 12;
  let x;
}

would result in either print being called with undefined or an error on the use of x before it was set by the assignment expression-statement -- those are the two choices given hoisting.

But then:

{
  x = 12;
  print(x);
  let x;
}

would result in either 12 being printed or an error being thrown assigning to x before its declaration was evaluated.

Any mixture of error with non-error (printing undefined or 12) is inconsistent. One could defend throwing in the use-before-assignment case, but it's odd. And throwing in both cases is the earlier consensus semantics of temporal dead zone with a distinct state for lack of initialization (even if the initialization is implicit, e.g., in a declaration such as let x; being evaluated). Here "initialization" is distinguished from assignment expressions targeting the binding.
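The "throw in both cases" semantics can be tried out in engines that implement the temporal dead zone as later standardized in ES2015. The errorName helper below is hypothetical and exists only to capture which error, if any, is thrown.

```javascript
// Returns the name of the error thrown by fn, or null if none is thrown.
function errorName(fn) {
  try { fn(); return null; } catch (e) { return e.name; }
}

var readBeforeInit = errorName(function () {
  {
    var seen = x; // read of x before "let x" has been evaluated
    let x;
  }
});

var writeBeforeInit = errorName(function () {
  {
    x = 12;       // write to x before "let x" has been evaluated
    let x;
  }
});

var afterInit = errorName(function () {
  {
    let x;        // implicit initialization to undefined
    x = 12;       // ordinary assignment, no error
  }
});
```

Both the early read and the early write fail, while the assignment after the declaration succeeds, which is the distinction between initialization and assignment drawn above.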

Trying to be like var, printing undefined or 12, is possible but future-hostile to guards and gratuitously different from const:

{
  x = 12;
  const G = ...;
  let x ::G = "hi";
}

We want to be future-proof for guards, and even more important: we want to support *refactoring from let to const*. Ergo, only temporal dead zone with its barriers is tenable.

There remains an open issue: without closures obscuring analysis, it is easy to declare use before initialization within the direct expression-statement children of a given block to be early errors, rather than runtime errors:

{
  x = 12;          // can be early error
  print(x);        // can be early error
  function f() {
    return x;      // may or may not be error
  }
  escape(f);       // did this call f?
  let x = 42;
  escape2(f);      // did this call f?
}

Some on TC39 favor normative specification of early errors for the easily-decided cases. Others want runtime-only error checking all around and point out that even in the easy cases (straight-line code in the block's direct expression-statement children), any testing that reaches the block will fail fast. The question remains: what if the block is not covered by tests?
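The closure case, where only a runtime check can decide, can be sketched as follows. This assumes temporal dead zone semantics as later standardized in ES2015; run and its callEarly parameter are hypothetical, standing in for the question of whether escape(f) calls f.

```javascript
function run(callEarly) {
  "use strict";
  var log = [];
  {
    function f() { return x; } // closes over the block's x
    if (callEarly) {
      // x is still uninitialized here, so calling f throws at runtime
      try { f(); } catch (e) { log.push(e.name); }
    }
    let x = 42;
    log.push(f());             // x is initialized now
  }
  return log;
}

var early = run(true);
var late = run(false);
```

Whether f fails depends on when it is called, not on where it is written, so no static rule over the block's direct children can settle it.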

Dave Herman brought up the let/var at top level equivalence implemented in SpiderMonkey, specifically in connection with <script> tags. Sketching in pseudo-HTML:

<script type=harmony>
  alert = 12;      // reassign built-in alert
</script>

<script type=harmony>
  let alert = 13;  // shadow built-in alert
  var quux = 14;   // this.quux = 14
  let quux = 15;   // alternative: in scope for later scripts?
</script>

<script>
  alert(quux);
</script>

Dave's point was not to commend the SpiderMonkey equating of let and var at top level, but to observe that if "let is the new var", then depending on how multiple successive script elements' contents are scoped, you may still need to use var in Harmony -- let won't be enough, if it binds only within the containing <script> element's scope.

Recall that Harmony removes the global (window in browsers) object from the scope chain, replacing it with a lexical environment with (generally) writable bindings. Each script starts with a fresh lexical environment, although it might be nested (see next paragraph).

For scripts that do not opt into Harmony, there's no issue. The global object is on the scope chain and it is used serially by successive script elements.

The question for Harmony scripts boils down to: should successive Harmony scripts nest lexical scopes in prior scripts' scopes, like matryoshka dolls? Or should each script opted into Harmony be its own module-like scope, in which case to propagate bindings to later scripts, one would have to

<script type=harmony>
  export let quux = 14; // available here and in later scripts
</script>

This remains an open question in TC39. Some liked the explicit 'export' requirement and the implicit module scope. Others objected that migrating code would expect the nested semantics, which was not inherently evil or unsafe.

--- end of block scope discussion ---

173 enh Normal All allen at wirfs-brock.com CONF --- FutureReservedWords should not be allowed as a function name or argument name of a strict func.

Deferred but this was considered straightforward, per the comment 0.

157 min --- All allen at wirfs-brock.com CONF --- "do{;}while(false)false" prohibited in spec but allowed in consensus reality        

Approved -- this is the de-facto standard whereby do;while(0)x will have a semicolon inserted before x.
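A quick way to check the de-facto behavior in a given engine is to hand the source text to the Function constructor, which throws a SyntaxError if the body does not parse. The parses helper is hypothetical.

```javascript
// Returns true if the source parses as a function body, false otherwise.
function parses(source) {
  try { new Function(source); return true; } catch (e) { return false; }
}

var bugExample = parses("do{;}while(false)false"); // the text from bug 157
var shortForm = parses("var x; do;while(0)x");     // the short form quoted above

// With a semicolon inserted after the do-while, the return statement runs.
var value = new Function("do;while(0) return 7")();
```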

== Minimal Classes ==

Dave presented his pitch for minimal classes, posted to es-discuss previously here:

https://mail.mozilla.org/pipermail/es-discuss/2011-June/015559.html

This subset includes class C {...}, class D extends C {...}, super calls in constructors, and method syntax for defining non-enumerable function-valued data properties on the class's prototype object.
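For orientation, the subset looks roughly like this when written in the class syntax that later shipped in ES2015, which may differ in detail from the proposal as discussed at the meeting. The class and member names are illustrative.

```javascript
class C {
  constructor(name) { this.name = name; }
  describe() { return "C:" + this.name; }
}

class D extends C {
  constructor(name, extra) {
    super(name);        // super call in the constructor
    this.extra = extra;
  }
  describe() { return "D:" + this.name + "/" + this.extra; }
}

var d = new D("n", "e");

// Methods are function-valued data properties on the prototype,
// and they are non-enumerable.
var descriptor = Object.getOwnPropertyDescriptor(C.prototype, "describe");
var enumerableKeys = Object.keys(C.prototype);
```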

The premise is that classes have significant open issues that will take time to resolve, at high opportunity cost, without clear consensus in sight for some of the issues; whereas the minimal "profile" has consensus already and will do good in ES.next without question.

Dave quickly acknowledged Mark M.'s long-running and indefatigable effort to get classes as sugar into Harmony, and how this was not in any way lost forever via minimal classes. Some of the early work used the closure pattern, which is not the main pattern supported by the current proposal (but it is supported if you write public methods in the constructor). So, credit to Mark for his work.

Waldemar thought the approach too minimal, and suggested zero-inheritance as a different axis on which to minimize. Others disagreed with that, noting how the existing pattern (the cowpath to pave) has at least subclassing and super-call boilerplate in library code and generated JS to absorb.

General agreement to work through open issues (recorded in the wiki or not) with the http://wiki.ecmascript.org/doku.php?id=harmony:classes proposal.

Open issues and discussion/resolution summaries:

1. return allowed from constructor?

Dave H.: This is a lesser cowpath possible with functions-as-constructors-with-prototypes in JS today, we should pave it.

Mark M., others: we want minimal object layout or "shape" declarative guarantees with classes, return {unshaped: "haha"}; in the middle of a constructor for a class with public x, y, z; properties defeats this goal.

Dave: no shape guarantees.

Others: must have shape guarantees, at least "the declared properties exist, at the moment the constructor returns" (ignoring const classes, which have frozen prototypes, instances, constructors, and at least sealed instance properties).

Mark M.: argument by analogy to module objects.

Dave, Sam, Brendan: module system is second class, module object reflections do not correspond to class-as-factory instances. This is not to say "no shape guarantees", however (Brendan at least).

Alex R.: minority-use-case users can fall back to declaring constructor functions if they need to return from constructor.

Consensus is: RESOLVED, return disallowed from class constructor body.
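The cowpath in question, and the way it defeats shape expectations, can be seen with an ordinary constructor function today (Point is an illustrative name):

```javascript
function Point(x, y) {
  this.x = x;
  this.y = y;
  // Returning an object from a constructor function replaces the new instance.
  return { unshaped: "haha" };
}

var p = new Point(1, 2);
var hasX = "x" in p;              // the assigned properties are not on the result
var isPoint = p instanceof Point; // nor is the prototype link
```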

2. private: section vs. private prefix keyword

Some favor sections, others do not. Waldemar cites original-JS2/ES4 private {...} braced forms to distribute private, static (class), etc. across a group of declarations, with the braced body being a declaration list, not an object literal. Bob Nystrom proposed C++-style section syntax here:

http://www.mail-archive.com/es-discuss@mozilla.org/msg09070.html

No resolution.

3. private variable use-syntax

The private(this).x, private(other).x syntax in the wiki'ed proposal is an intentional non-starter: too verbose, wrongly suggests that private variables are properties of some "private data record" object.

But what to use instead? @ as prefix (this-based) and infix (restricted, no LineTerminator to left, for other-based) private-keyed property refs has been mooted but is not proposed in the classes proposal, and my sense is the committee is not ready to annex @.

Meanwhile, Allen proposed (see PDF link http://wiki.ecmascript.org/lib/exe/fetch.php?id=harmony%3Aprivate_name_objects&cache=cache&media=harmony:private-name-alternatives.pdf at bottom of http://wiki.ecmascript.org/doku.php?id=harmony:private_name_objects) that we allow private name objects to be used in object literals like so:

  return {
    [MyPrivateKey]: ...,
    publicNameHere: ...
  };

and this was agreed to at last week's meeting.

Given this extension, whose [] syntax mirrors the computed property name "indexing" used with private name objects as property keys, we agreed to defer private(this)/private(other) replacement syntax from the classes proposal and revisit later, based on usability experience with private name objects including this extension to object literal syntax.
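The bracketed form can be tried in engines that support computed property names. In this sketch, Symbol() is only a stand-in for a private name object, which is an assumption: symbols give a unique non-string key but not the privacy the proposal is after, since they remain discoverable by reflection. The makeObject function is hypothetical.

```javascript
// Symbol() is used here only as a stand-in for a private name object.
var MyPrivateKey = Symbol("MyPrivateKey");

function makeObject() {
  return {
    [MyPrivateKey]: "hidden",
    publicNameHere: "visible"
  };
}

var obj = makeObject();
var viaKey = obj[MyPrivateKey];    // same [] form used to read the property
var stringKeys = Object.keys(obj); // only the string-named property shows up
```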

4. class-side inheritance (method, this, super)

Some on TC39 (Smalltalk and therefore Ruby matter to these folks) want it; others are indifferent. No one was hostile, but we did not resolve to add class-side inheritance yet.

People agree that "static" is the wrong keyword, though it may carry weight for some coming from C++ and Java. General desire to use "class" but not as prefix keyword.

5. no magic syntax for constructor body

This matters more if we attempt to unify class body syntax with object literal extended syntax, or somehow make a desugaring from one to the other. The general form of this open issue is item (6), but to focus on instance properties/variables, we split this one in two:

5a. public x = x inside a constructor taking parameter x. We do not have a better alternative at this time. The C++-style public: section idea discussed on the list (proposed by Bob Nystrom, see (2) above) separates declaration from likely initialization based on constructor parameters or other constructor/instance-dependent computation.

5b. private w = ... inside a constructor. Per (3), we removed this for now, deferring to private name objects and agreeing to revisit later.

6. do not abuse lexical binding declarative forms to define properties (prototype, class)

This was contentious at first, because modules use 'export function f(...){...}' and 'export const K = ...' declarative syntax extended by prefixing with 'export', so Mark at least found the use in class syntax of declarative forms to bind prototype properties plausible:

  class C {
    function f() {} // C.prototype.f?
    let x;          // C.prototype.x?
    const k;        // C.prototype.k?
    class Inner{}   // C.prototype.Inner, (c = new C, c.Inner)
    m(){}           // method m, no controversy, no comma after
    get gs(){}      // getter gs, no controversy, no semicolon after
    set gs(x){...}  // setter gs, optional, no controversy
    pp = 42;        // prototype property pp, controversial syntax -- assignment?
    constructor(){} // just a method, unless the body is special syntax (1)
  }

After some discussion it became clear that there is no symmetry with module instances without 'const class': module exports are sealed properties, module instance objects are not extensible. Class instances by default are extensible, and class prototypes in particular are as mutable as today. In general Harmony moves binding forms away from defining properties on objects (except via reflection, as in the module instance case) and to lexical scope.

This item was not resolved, but it left all of the variations sketched above with comments ending in ? as open issues. Methods (including constructor, even if its body is a special form), getters, and setters are ok. All others are not yet resolved as consensus features of classes in ES.next.

I observed that at the rate of progress resolving open issues in the classes proposal (counting generously), we needed 2.5 more meetings to resolve the rest. But the remaining issues are actually bigger, and lack the kind of live alternative proposals that helped resolve (1) and (5b).

To make progress, we need to avoid "hovering" at a bogus "consensus" where people agree with the idea or general goal of classes, without getting concrete final agreement on all the details.

/be