@l-Luna
Last active January 24, 2022 16:14
Collection Literals

How far this goes

This proposal is not intended to add support for general-purpose custom initializer syntax for any kind of object, nor is it intended to allow for custom DSLs in Java. This is for simple shorthand:

Map<String, Object> values = { "a": "alpha", "b": "beta", null: this };

var list = ArrayList<String> { "a", "b", "c" };

Set<Direction> checked = HashSet<> { NORTH, SOUTH, NORTHWEST, };

checkForWarnings({ "unchecked", "deprecation" }); // is a Set

These are more convenient and easier to read than the factory methods currently available.
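For comparison, here is a rough sketch of how the same values can be built in Java today. The class name `CurrentFactories` and the `Direction` enum are made up for illustration, and a plain `Object` parameter stands in for `this`. Note that `Map.of` rejects null keys, so the map with a null key has to fall back to `HashMap` and `put`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical holder class; only the method bodies matter here.
class CurrentFactories {
    enum Direction { NORTH, SOUTH, NORTHWEST }

    // Map.of does not accept null keys or values, so a HashMap is filled by hand.
    static Map<String, Object> values(Object self) {
        Map<String, Object> values = new HashMap<>();
        values.put("a", "alpha");
        values.put("b", "beta");
        values.put(null, self);
        return values;
    }

    // A specific mutable implementation needs a copy of an immutable list.
    static List<String> list() {
        return new ArrayList<>(List.of("a", "b", "c"));
    }

    static Set<Direction> checked() {
        return new HashSet<>(Set.of(Direction.NORTH, Direction.SOUTH, Direction.NORTHWEST));
    }

    public static void main(String[] args) {
        System.out.println(values("this"));
        System.out.println(list());
        System.out.println(checked());
    }
}
```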

It should be possible to specify the type of the collection, as well as have a reasonable default inferred from context. It should also be possible to specify both mutable and immutable collection literals, with a reasonable default that is clear from the context.

Main points

  • Collection literals are a concise way of specifying a value of any type that can be described as an arbitrary-length array of values or key-value pairs, intended for types that act as a (mostly) transparent carrier of this data (a collection).
  • They are not intended for use by things that are not collections as just an "alternative constructor", nor are they intended as a way for libraries to approximate a different language (e.g. JSON or HTML) in Java.
  • The type of a collection literal can be specified or inferred. The exact runtime type of the value produced is not guaranteed to be the specified type, but it must be assignable to that type.
  • Types opt in to this syntax by providing static factory methods.
  • The mutability of the resultant value can be specified using the final or !final modifiers, or is inferred from the use of the expression (assigned to a final/non-final variable), if the target type supports both mutable and immutable versions (e.g. List rather than ArrayList).
  • Immutable collections of constant values could then be checked by the compiler and potentially translated into a constant value via constant dynamic, reducing redundant object allocation.

Usage

List literals and map literals, collectively known as collection literals, are a new kind of expression that describes a collection concisely. Any type can opt in to being described as either kind of literal by defining appropriate factory methods.

List literals look like: ArrayList<String> { "a", "b", "c" }. A type is specified, followed by a comma-separated list of expressions in braces. A trailing comma is allowed.

Map literals look like: HashMap<String, Object> { "a": "alpha", "b": "beta", null: this }. A type is specified, followed by a comma-separated list of map entries in braces, where each entry is a colon-separated key-value pair. There is no special syntax for String keys. A trailing comma is allowed.

Note the lack of the new keyword, despite its use in array creation expressions. Collection literals call a factory method, not a constructor; they are not guaranteed to return an instance of exactly the specified type; and you cannot use a collection literal with an anonymous class body. These differences set them apart from constructor calls, and I want to avoid confusion between the two.

Where the type of a collection literal can be inferred, the type can be omitted entirely:

List<String> list = { "a", "b", "c" };

If you want to name a specific implementation type and its generic type arguments can be inferred from context, the diamond operator may be used:

List<String> list = ArrayList<> { "a", "b", "c" };

Mutability

A type that opts in may specify both a "final" (immutable) and a "non-final" (mutable) factory. By default, a collection literal will use the final version if it exists, unless the literal is the right-hand side of an assignment to a non-final variable:

var list = List<String> {}; // mutable
var list2 = (List<String> {}); // also mutable
final var list3 = List<String> {}; // immutable
final var list4 = ArrayList<String> {}; // mutable, because ArrayList can only be mutable
doThing({ "a", "b" }); // collection is immutable

You can manually specify whether a collection is final or non-final by using final or !final before the expression:

List<String> list = final { "left", "right", "left" }; // immutable
final var list2 = !final List<String> {}; // mutable
doThing(!final { "a", "b" }); // collection is mutable

It is a compile-time error to specify final for a type with no final factory, or !final for a type that only has a final factory. For example, final ArrayList<> {} would cause a compile-time error, as ArrayLists are always mutable and should not specify a final factory.
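To make the distinction concrete, the following sketch shows the closest equivalents in current Java: `List.of` for an immutable list and an `ArrayList` copy for a mutable one. The class `MutabilityToday` and the helper `acceptsAdd` are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class MutabilityToday {
    // Returns true if the list accepts a new element, false if it rejects modification.
    static boolean acceptsAdd(List<String> list) {
        try {
            list.add("x");
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Roughly what `final { "left", "right", "left" }` would produce.
        List<String> immutable = List.of("left", "right", "left");
        // Roughly what `!final { "a", "b" }` would produce.
        List<String> mutable = new ArrayList<>(List.of("a", "b"));

        System.out.println(acceptsAdd(immutable)); // prints false
        System.out.println(acceptsAdd(mutable));   // prints true
    }
}
```

Today the difference only shows up at run time, as an exception on the first attempted modification; the final and !final modifiers would make the choice visible at the point where the collection is created.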

Opting in

A type can opt in to being described with collection literals by providing at least one of four factory methods:

  • Type _newCollection(_T...)
  • Type _newFinalCollection(_T...)
  • Type _newMap(Map.Entry<_K, _V>...)
  • Type _newFinalMap(Map.Entry<_K, _V>...)

where Type is the type declaring the factory, and placeholder types _T, _K, _V here may be an arbitrary concrete type (e.g. a Json type that only accepts Json objects) or a generic type (e.g. a generic list accepting values of its generic type). If the type of the parameter is generic, the generic type is inferred in the same way as it would be when directly calling these functions.

(These names are only temporary placeholders.)
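As a minimal sketch of what opting in might look like, here is a made-up `Bag<T>` type that declares both list factories using the placeholder names above. `Bag`, its `add` and `size` methods, and `BagDemo` are assumptions for illustration only; since the literal syntax does not exist, the demo calls the factories directly, which is what a literal would presumably translate to.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical collection type that opts in to list literals.
final class Bag<T> {
    private final List<T> items;

    private Bag(List<T> items) {
        this.items = items;
    }

    // Mutable factory: `!final Bag<String> { "a", "b" }` would call this.
    @SafeVarargs
    static <T> Bag<T> _newCollection(T... elements) {
        List<T> copy = new ArrayList<>();
        Collections.addAll(copy, elements);
        return new Bag<>(copy);
    }

    // Immutable factory: `final Bag<String> { "a", "b" }` would call this.
    @SafeVarargs
    static <T> Bag<T> _newFinalCollection(T... elements) {
        List<T> copy = new ArrayList<>();
        Collections.addAll(copy, elements);
        return new Bag<>(Collections.unmodifiableList(copy));
    }

    void add(T item) { items.add(item); }

    int size() { return items.size(); }
}

class BagDemo {
    public static void main(String[] args) {
        Bag<String> mutable = Bag._newCollection("a", "b");
        mutable.add("c");
        System.out.println(mutable.size()); // prints 3

        Bag<String> fixed = Bag._newFinalCollection("a", "b");
        System.out.println(fixed.size());   // prints 2; fixed.add(...) would throw
    }
}
```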

It would be useful to have some clear syntax for declaring these factories, though I believe they should still be callable via a regular call statement or expression for discoverability, rather than being a nameless method like constructors. It may be useful to use some type other than Map.Entry for the intermediate representation of map parameters, though reusing the existing type may be useful for straight Map implementations.

These methods don’t have to be public; literals will only be usable in contexts where they are accessible. If final and non-final versions are defined, they must have the same visibility; otherwise, the same variable declaration statement may create a mutable collection in one context and an immutable collection in another, which might only cause an error after being passed around without a link to the original statement.

Syntax

My proposed syntax for a collection literal is:

PrimaryNoNewArray:
    ...
    CollectionLiteral

CollectionLiteral:
    ListLiteral
    MapLiteral

ListLiteral: [[!] final] [TypeName] { [Expression {, Expression} [,]] }
MapLiteral: [[!] final] [TypeName] { [MapLiteralEntry {, MapLiteralEntry} [,]] }
MapLiteralEntry: Expression : Expression

Standard Java Collections implementations

The types List, Map, and Set should provide mutable and immutable factories. The specific types that are returned should be an implementation detail that can be freely changed by JDK maintainers, and it may be useful to use an anonymous or non-public implementation to prevent users from introducing a dependency on any particular returned type.

The standard public implementations of these - ArrayList, HashSet, HashMap, etc - should also provide factories, so users that want to easily use and rely on a specific implementation can do so, without forcing every user to pick a particular implementation every time.

Constant immutable collections

Immutable collections made up of constant values could be compiled to be loaded as a dynamic constant, using a bootstrap method that calls the target final factory with the appropriate constant arguments. This could use less memory than allocating a new, redundant object every time the expression is evaluated.
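The sketch below shows what such a bootstrap method might look like. `LiteralBootstraps` and `finalList` are hypothetical names; the only fixed part is the leading parameter shape (lookup, name, type) that the JVM passes to bootstrap methods for dynamic constants, followed by the static arguments. Java source cannot request a dynamic constant directly, so a static final field stands in for the constant pool entry that the JVM would resolve once and then reuse.

```java
import java.lang.invoke.MethodHandles;
import java.util.List;

class LiteralBootstraps {
    // Hypothetical bootstrap method: forwards the constant elements to the
    // immutable factory (List.of stands in for a type's final factory).
    static List<Object> finalList(MethodHandles.Lookup lookup, String name,
                                  Class<?> type, Object... elements) {
        return List.of(elements);
    }

    // Stand-in for the resolved constant: computed once, then shared.
    private static final List<Object> WARNINGS =
            finalList(MethodHandles.lookup(), "_", List.class, "unchecked", "deprecation");

    // Each evaluation of the literal would yield the same instance.
    static List<Object> warnings() {
        return WARNINGS;
    }

    public static void main(String[] args) {
        System.out.println(warnings());
        System.out.println(warnings() == warnings()); // prints true
    }
}
```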

Similarly, immutable empty collections could share the existing empty collection instances already used by Collections.

Care may need to be taken that such constant-folded collections cannot be modified later; in particular, if an immutable factory accidentally (or intentionally) creates a mutable collection that is later modified, the original expression would no longer produce values that match the source text, which would be difficult to debug. It may be that only certain standard library implementations can be constant-folded in this way, perhaps marked by an annotation.
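A small sketch of that failure mode, under the assumption that a folded literal behaves like a value computed once and shared. `FoldingHazard` and `brokenFinalFactory` are hypothetical; the static final field again stands in for the folded constant.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FoldingHazard {
    // Hypothetical "final" factory that wrongly returns a mutable list.
    static List<String> brokenFinalFactory(String... elements) {
        return new ArrayList<>(Arrays.asList(elements));
    }

    // Stand-in for a constant-folded literal { "a", "b" }: evaluated once, shared afterwards.
    private static final List<String> FOLDED = brokenFinalFactory("a", "b");

    static List<String> literal() {
        return FOLDED;
    }

    public static void main(String[] args) {
        // One caller modifies the supposedly immutable value...
        literal().add("surprise");
        // ...and every later evaluation of the same literal sees the change.
        System.out.println(literal()); // prints [a, b, surprise]
    }
}
```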

Issues

  • List.of and co might be good enough, compared to adding and supporting a new language feature with potentially new collections implementations. I believe the potential for simplifying code would make it worth it.
  • Allowing the creation of List literals (and such) may lead to users simply defaulting to those, even if there's a more appropriate type for their use-case. However, only allowing the use of these expressions with specific implementations while allowing type inference would punish code that (correctly) works with any List, Map, or Set implementation; and not allowing inference at all provides little improvement over static factories.
  • The inference of mutability based on variable assignment could lead to accidentally creating a mutable collection, and could be difficult to read; it may be better to simply say that immutable collections are always the default, and that you must always either specify a mutable implementation or use !final. I'd especially appreciate feedback on this.
  • I'm not sure about the exact syntax, especially for the factory methods and !final - everything here is just a "reasonable guess". The exact syntax can be decided on after the semantics (if a feature like this is desired).