@KyLeggiero
Last active May 13, 2018 19:24
Proposal: Kotlin collection literals
@cypressious

Very nice. Two points of feedback:

Requiring the operator function to be put in the companion object doesn't seem very consistent with the other operators. I would rather require it to be top-level. The return type should be enough to resolve the correct operator. As with any operator, it needs to be imported at the use-site, so it doesn't need extra scoping.

Using Pairs for the dictionary operator, while simple, is not very efficient. It's required for the mapOf function in the stdlib because there's no language feature, but if you're going to add a dedicated operator, maybe it can be made more efficient, too. Just some food for thought, no concrete solution here.
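
To make the efficiency point concrete: as far as I know, to is an ordinary infix function that returns a Pair, so mapOf receives an array of temporary Pair objects and copies them into the map. Here is a minimal sketch comparing that with filling a map directly. The helper names viaPairs and viaDirectPuts are made up for illustration, and a dedicated literal operator could of course be compiled differently.

```kotlin
// Today's approach: each `to` call creates a Pair, and the pairs are
// packed into a vararg array before being copied into the map.
fun viaPairs(): Map<String, Int> = mapOf("a" to 1, "b" to 2, "c" to 3)

// Hypothetical alternative for comparison: put entries straight into the
// map, with no intermediate Pair objects.
fun viaDirectPuts(): Map<String, Int> {
    val map = LinkedHashMap<String, Int>(4)
    map["a"] = 1
    map["b"] = 2
    map["c"] = 3
    return map
}

fun main() {
    // Both produce equal maps; only the temporary allocations differ.
    println(viaPairs() == viaDirectPuts())
}
```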

@raulraja

raulraja commented Apr 6, 2018

How would this new syntax affect potential new collection implementations, such as the immutable List at https://github.com/Kotlin/kotlinx.collections.immutable/blob/master/kotlinx-collections-immutable/src/main/kotlin/kotlinx/collections/immutable/ImmutableList.kt ?
Would custom collections created by a user also benefit from this syntax based on receiver type or through some other mechanism?

@cbeust

cbeust commented Apr 6, 2018

val b1: Array = ["a", "b", "c"] // Array

Not a fan of seeing raw types being introduced, I'd rather eat the verbosity and require a full type instead.

@KyLeggiero
Author

KyLeggiero commented Apr 7, 2018

@cypressious I did say "This must be implemented in either its companion object or via an extension function".

The companion object thing may be foregone if that's what the majority wants (or if there's a compiler requirement for that), but I just thought it would be a nice place to put it when writing and reading the code. I also notice that I implied, but never explicitly said, top-level when I said "extension function". Thanks for pointing that out :)


@raulraja It should not affect those. The syntax would be the same as the other examples I provide:

val b8: ImmutableList<Int> = [1, 2, 3]

This would likely be implemented as:

operator fun <T> sequenceLiteral(vararg elements: T): ImmutableList<T> = immutableListOf(*elements)

And custom implementations can still take advantage of this:

class MyImmutableList: ImmutableList<String> {
    // implementation stuff...
}


operator fun sequenceLiteral(vararg elements: String): MyImmutableList = MyImmutableList(*elements)

val b9: MyImmutableList = ["Won", "Too"]

@cbeust Good point. I think I forgot Kotlin doesn't have those right now; I never meant for that to be bundled in with this proposal. Thanks for pointing it out!

@ilya-g

ilya-g commented Apr 8, 2018

@cbeust In Kotlin these are called bare types. You can use them in downcasts, for example: val list = someCollection as List
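
For anyone who hasn't run into bare types: here is a small sketch of how they behave in a type check and in a downcast today. The function name firstIfList is invented for the example; the point is only that the omitted type argument is taken from the static type of the expression being checked or cast.

```kotlin
// Returns the first element if the collection happens to be a List.
fun firstIfList(items: Collection<String>): String? {
    // Bare type in a type check: the String argument is carried over from `items`.
    if (items is List) return items.firstOrNull()
    return null
}

fun main() {
    val someCollection: Collection<String> = listOf("a", "b", "c")
    // Bare type in a downcast: `list` is inferred as List<String>.
    val list = someCollection as List
    // `.length` only compiles because the elements are known to be Strings.
    println(list[0].length)
    // A Set is not a List, so this call falls through to null.
    println(firstIfList(setOf("x")))
}
```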

@ilya-g

ilya-g commented Apr 8, 2018

@BenLeggiero Could you expand on what happens if we have two overloads of some function, say foo(Iterable<T>) and foo(Sequence<T>), and it is called with a sequence literal foo(['a', 'b', 'c']), given that both Iterable and Sequence have sequenceLiteral operator defined?

@gildor

gildor commented Apr 9, 2018

@BenLeggiero Good proposal. This idea has already been raised a few times; it's probably time to discuss it with the Kotlin team and hear their thoughts. Do you plan to publish it as a KEEP?

I see a few points that would be good to clarify.

  1. Do we really need dictionaryLiteral? It looks like it's just a specific version of sequenceLiteral and can easily be replaced with:

    operator fun <K, V> sequenceLiteral(vararg pair: Pair<K, V>): Map<K, V> = mapOf(*pair)
    

    I just don't see any advantages of dictionaryLiteral.

  2. It looks like you could split this proposal into two: the sequenceLiteral operator with literal syntax, and a Pair literal. You can easily replace [1 : 2, 3 : 4] from your existing proposal with [1 to 2, 3 to 4], so there's no need for special Map syntax.
    I understand this is probably not what you want, but the current proposal doesn't need to introduce a way to create a Pair object using a colon instead of the standard to. Colon syntax would be useful if we could optimize map creation, but even for this proposal it doesn't look necessary.
    Maybe it would be better to discuss a pair literal separately; such a big feature should probably also be discussed for other use cases, not only for collection literals.

  3. All existing collection/map builders also provide an optimized version for a single argument without vararg. It might be good to support this case too. So allow not only the operator with vararg:
    operator fun <T> sequenceLiteral(vararg elements: T): Type

    but also an operator with a single argument:
    operator fun <T> sequenceLiteral(element: T): Type
    or even with an arbitrary number of arguments:
    operator fun <T> sequenceLiteral(elem1: T, elem2: T, elem3: T): Type
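
The single-argument idea in point 3 mirrors how ordinary overloads already work today. A minimal sketch, using an invented function name collect in place of the proposed operator: when both a single-parameter and a vararg overload apply, the compiler should pick the non-vararg one, so no argument array is created for the one-element case. The string tag is only there to show which overload ran.

```kotlin
// Single-element overload: no array is allocated for the argument.
fun <T> collect(element: T): Pair<String, List<T>> = "single" to listOf(element)

// General overload: the arguments arrive packed in an array.
fun <T> collect(vararg elements: T): Pair<String, List<T>> = "vararg" to elements.asList()

fun main() {
    // One argument: the non-vararg overload is preferred.
    println(collect(1).first)
    // Several arguments: only the vararg overload applies.
    println(collect(1, 2, 3).first)
}
```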

@gildor

gildor commented Apr 9, 2018

After my comment, I realized that a sequence operator with an arbitrary number of arguments could be used as a tuple constructor if we allow defining a separate type for each argument:

operator fun sequenceLiteral(name: String, age: Int): User = User(name, age)

// And used as
val users: List<User> = [["Alice", 25], ["Bob", 42], ["Eve", 30]]

This is a side effect, of course, and I'm not sure it should be allowed, but it could definitely be useful for DSLs and some other cases.
But actually, parentheses are much more common than square brackets for tuple-like objects, so for such cases it would be better to use some sort of tuple literal and a corresponding operator.
So this is just a comment; please don't consider it an attempt to add this to the proposal.

@cypressious

@BenLeggiero I don't see any necessity for the companion object at all. Neither the value nor the type of the companion object is used in the operator function. It's completely redundant.

Also, imagine a collection type written in Java that doesn't have a companion object. How would you write a sequenceLiteral operator for it?
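
To illustrate the Java point with what the language allows today: an extension on a companion object needs the class to declare one, which a Java class cannot do. In this sketch, Bag, of and linkedListOf are names invented for the example; only java.util.LinkedList is a real type.

```kotlin
// A Kotlin class can declare a companion object, which extensions can target.
class Bag<T>(val items: List<T>) {
    companion object
}

// Extension on the companion: called as Bag.of(...).
fun <T> Bag.Companion.of(vararg items: T): Bag<T> = Bag(items.toList())

// java.util.LinkedList has no companion object, so an extension like the one
// above cannot be written for it; a top-level function is the remaining option.
fun <T> linkedListOf(vararg items: T): java.util.LinkedList<T> =
    java.util.LinkedList(items.asList())

fun main() {
    println(Bag.of(1, 2, 3).items)
    println(linkedListOf("a", "b").size)
}
```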

@KyLeggiero
Author

@gildor beautiful observations! I actually agree with all of them! Though I do love the colon syntax and hate the to syntax, you're absolutely right that there's no current difference from the program's perspective and it could very much become better-optimized. I'll hold that off for a future proposal.

That tuple side-effect is beautiful as well and looks like it could be very useful for the language. I think I like it more than you do, but you're right in that it would just be a side-effect and not a feature :P

@KyLeggiero
Author

@cypressious there is no necessity for the companion object part of this proposal; it's just positional sugar. It probably won't affect the compiler-side implementation all that much and some people like that syntax so it's a non-issue in my eyes. Just something to be kind to anyone who wants it, but won't affect anyone who doesn't.

@alanfo

alanfo commented Apr 9, 2018

The only thing I dislike about this proposal is that the default type for sequence literals is List<T> rather than array (dedicated array for the primitive types and Array<T> for all other types).

Whilst it's true that the Kotlin standard library favors lists over arrays (and for good reason), I suspect the reverse is the case for situations where one would want to use a sequence literal - it certainly would be for my own code. There are three reasons why I say this:

  1. Literals in other languages are very often used to hard code sequences or tables of numbers and anyone who cares about performance in Kotlin is going to use dedicated arrays (e.g. DoubleArray rather than List<Double> or Array<Double>) for these to avoid the overhead of boxing even if they have no plans to mutate the elements.

  2. The present array factory methods such as doubleArrayOf and booleanArrayOf are more long-winded than listOf.

  3. Array literals have already been introduced for annotations in version 1.2.
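
For readers less familiar with the distinction behind point 1, here is a small sketch of the three containers involved, assuming the JVM target. The helper isPrimitiveDoubleArray is made up for the example; it only shows that DoubleArray and Array<Double> are distinct types, which is why the choice of default matters.

```kotlin
// True only for the dedicated primitive array type.
fun isPrimitiveDoubleArray(value: Any): Boolean = value is DoubleArray

fun main() {
    val primitives = doubleArrayOf(1.0, 1.5) // DoubleArray: unboxed doubles on the JVM
    val boxedArray = arrayOf(1.0, 1.5)       // Array<Double>: boxed elements
    val boxedList = listOf(1.0, 1.5)         // List<Double>: boxed elements

    println(isPrimitiveDoubleArray(primitives))
    println(isPrimitiveDoubleArray(boxedArray))
    // The contents are the same; only the representation differs.
    println(primitives.sum() == boxedList.sum())
}
```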

If we are going to have a default dictionary literal, then it's clear that you can't use 'to' to separate the key/value pairs because, if you did, then it might signify a list (or array) of pairs rather than a map. Personally, I prefer the colon separator which you've used in the proposal though I've also seen '=' suggested as an alternative.

@KyLeggiero
Author

@alanfo great points! I thought of many of these while writing.

  1. That is true, and to make it the most efficient they're probably going to specify the type explicitly, like val x: FloatArray = [1.0f] or val y: ShortArray = [2, 3]. So, even if the default type was an Array with an implied element type, that wouldn't affect this particular audience.
  2. I agree. This syntax combined with explicit typing will rid us of the need for those factory methods, too! Either way, making it an Array by default won't change much. With both my proposal and with yours, the syntax of val z: BooleanArray = [true, false] and takesBooleanArray([true, false]) is identical, and vastly improved from the existing takesBooleanArray(booleanArrayOf(true, false)).
  3. That is a great point, and this won't change that. The type hinting will kick in and the syntax will stay exactly the same for annotation arrays.

@gildor

gildor commented Apr 10, 2018

"If we are going to have a default dictionary literal, then it's clear that you can't use 'to' to separate the key/value pairs because, if you did, then it might signify a list (or array) of pairs rather than a map"

@alanfo This is a good point, but to keep this proposal simple it would probably still be better to use 'to' for now:

val e2: Map<Int, Int> = [1 to 2, 3 to 4] 
val list = [1 to 2, 3 to 4] // List<Pair<Int, Int>>

Of course, it would be better to get a Map in the second case, but it's not clear to me how type inference could solve this (choose the most specific type?). It's also related to @ilya-g's question about multiple sequenceLiteral operators.
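
The same split already exists with today's factory functions, where the function name rather than type inference picks the container. A short sketch with invented wrapper names pairsAsList and pairsAsMap:

```kotlin
// Same arguments, different container: the function name decides.
fun pairsAsList(): List<Pair<Int, Int>> = listOf(1 to 2, 3 to 4)
fun pairsAsMap(): Map<Int, Int> = mapOf(1 to 2, 3 to 4)

fun main() {
    println(pairsAsList())
    println(pairsAsMap())
    // A list of pairs can be converted explicitly when a map is wanted.
    println(pairsAsList().toMap() == pairsAsMap())
}
```

With a literal there is no function name to carry that decision, so it would presumably have to come from the expected type.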

@alanfo

alanfo commented Apr 10, 2018

@BenLeggiero Reading my previous post again, I don't think I was clear about what I meant by having array rather than List<T> as the default type for sequence literals. What I'm actually suggesting is that the default type should be:

  1. In the case of a sequence of primitive types, the dedicated array type for that primitive type.
  2. In all other cases, Array<T>.

So, using the first example in your proposal to illustrate:

val a1 = ["a", "b", "c"] // Array<String>
val a2 = [1, 2, 3] // IntArray (not Array<Int>)
val a3 = [true, false] // BooleanArray
val a4 = ["a", 2, false, null] // Array<Any?>
val a5 = [] // Compiler error: type cannot be inferred (just like `arrayOf()`). 

Similarly, for a table of numbers, we'd then have:

val t1 = [ [1.0, 1.5], [2.0, 2.5] ] // Array<DoubleArray>
val t2 = [ [1, 2, 3], [4, 5, 6], [7, 8, 9] ] // Array<IntArray>

There would be no boxing for the dedicated array types and so folks who do a lot of numerical work would be happy.

@gildor I agree, of course, that the use of 'to' is only a problem if we have a default dictionary literal (assuming we have any dictionary literals at all).

However, it's worth remembering that in last year's survey of possible future features the Kotlin team themselves proposed having a dictionary literal and (IIRC) they tentatively proposed using '=' for the separator to avoid any ambiguity from using 'to'. I therefore think it's worth keeping dictionary literals as part of the proposal.

As regards @ilya-g's question about determining which overload should be called when sequence literals are defined for two (or more) relevant types, I think the answer must be that either you cast the literal to the specific type you want or use the existing factory function instead of a literal to resolve any ambiguity.
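
As a rough picture of what that looks like with current syntax: the sketch below uses an invented overloaded function describe, with factory functions and an explicit conversion standing in for a typed literal. Each call site fixes the static type of the argument, so exactly one overload applies.

```kotlin
fun describe(x: Iterable<Char>): String = "iterable"
fun describe(x: Sequence<Char>): String = "sequence"

fun main() {
    // The factory function fixes the static type, so one overload applies.
    println(describe(listOf('a', 'b', 'c')))
    println(describe(sequenceOf('a', 'b', 'c')))
    // An explicit conversion switches the overload that is chosen.
    println(describe(listOf('a', 'b', 'c').asSequence()))
}
```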

@KyLeggiero
Author

@alanfo

I understand what you're getting at. That's adding too many "blessed" types. This is something I discussed with @raulraja on the #language-proposals channel in the Kotlin Slack. Basically, he was concerned that this proposal is too reliant on the internal implementation of List et al.

Your idea seems to take that problem to the next level by saying that if a sequence literal seems to be an array of Ints, then it should be an IntArray. I am worried that would paint us into a corner that is too tight. It solves a problem for a niche audience that, I think, wouldn't mind providing the explicit types required by my proposal. It provides no new functionality at the sacrifice of a lot of functional risks that Kotlin's stdlib is actively avoiding by pushing people towards Lists and away from Arrays everywhere it can. It also introduces confusion to newcomers, who might wonder why so many things accept and return Lists, but an implicitly-typed sequence literal generates an array. I firmly believe that, if it weren't for JVM constraints, the main function would take a List<String> (or there would be some other way to get those args), and varargs would be a List as well.

Again, my proposal doesn't preclude using these specialized, more-efficient types:

val a1: Array<String> = ["a", "b", "c"] // Array<String>
val a2: IntArray = [1, 2, 3] // IntArray (not Array<Int>)
val a3: BooleanArray = [true, false] // BooleanArray
val a4: Array<Any?> = ["a", 2, false, null] // Array<Any?>
val a5 = [] // Compiler error: type cannot be inferred (just like `listOf()`). 

val t1: Array<DoubleArray> = [ [1.0, 1.5], [2.0, 2.5] ] // Array<DoubleArray>
val t2: Array<IntArray> = [ [1, 2, 3], [4, 5, 6], [7, 8, 9] ] // Array<IntArray>

This would be achieved through uses of that custom sequence type function, like this, possibly even in stdlib itself:

operator fun sequenceLiteral(vararg elements: Int) = IntArray(size = elements.size, init = { elements[it] })
// etc.

As you can see, this is not that much more syntax to give the compiler the proper hinting, especially for the use of something that is niche and/or discouraged in Kotlin. Folks who do a lot of numerical work are still happy because there is no boxing, and folks who don't are still happy because they avoid the dangers of Arrays (which many of them might not be aware of in the first place).

@KyLeggiero
Author

KyLeggiero commented Apr 11, 2018

@ilya-g My proposal does not include any new ambiguity-resolution features. That is to say, it would act just the same as if you were trying to write a function that's just as ambiguous: it simply wouldn't let you write code that ambiguous. Here's an example you can try today in Kotlin 1.2, which I imagine wouldn't change between now and the implementation of this proposal:

fun foo(x: Iterable<Char>) {}
fun foo(x: Sequence<Char>) {}

/*operator*/ fun sequenceLiteral(vararg elements: Char): Iterable<Char> {
    print("Used Iterable<Char>")
    return elements.asList()
}
/*operator*/ fun sequenceLiteral(vararg elements: Char): Sequence<Char> {
    print("Used Sequence<Char>")
    return Sequence { elements.iterator() }
}


fun main(args: Array<String>) {
    foo(sequenceLiteral('a', 'b', 'c'))
}
Error:(5, 13) Conflicting overloads: public fun sequenceLiteral(vararg elements: Char): Iterable<Char> defined in root package in file Sequence Literal Ambiguity.kt, public fun sequenceLiteral(vararg elements: Char): Sequence<Char> defined in root package in file Sequence Literal Ambiguity.kt
Error:(9, 13) Conflicting overloads: public fun sequenceLiteral(vararg elements: Char): Iterable<Char> defined in root package in file Sequence Literal Ambiguity.kt, public fun sequenceLiteral(vararg elements: Char): Sequence<Char> defined in root package in file Sequence Literal Ambiguity.kt
Error:(16, 8) Overload resolution ambiguity: 
public fun sequenceLiteral(vararg elements: Char): Iterable<Char> defined in root package in file Sequence Literal Ambiguity.kt
public fun sequenceLiteral(vararg elements: Char): Sequence<Char> defined in root package in file Sequence Literal Ambiguity.kt

See this live: https://try.kotlinlang.org/#/UserProjects/3lulp9pbimkgkpfg9kolm3luk1/hm7j3fck2ubjqtklb4onqtlcp7

@alanfo

alanfo commented Apr 11, 2018

@BenLeggiero

I accept, of course, that using arrays in the fashion I've described is more complicated than using List<T> and more difficult therefore for people to get their heads around, particularly when it seems to be at odds with the standard library's philosophy of preferring lists to arrays.

However, my fear is that sequence literals won't be as useful as they might otherwise be. For example, faced with a choice between:

val a2: IntArray = [1, 2, 3]
val e2 = intArrayOf(1, 2, 3)

I think a lot of folks would just stick with the latter.

Also regardless of whether lists or arrays are the default, there is the fundamental difficulty that given a choice between:

val a1: Array<String> = ["a", "b", "c"]
val e1 = arrayOf("a", "b", "c")

nearly everyone would choose the latter because the factory functions can infer their type and it's therefore shorter to write.

However, where I think your proposal scores heavily is that it makes multi-dimensional tables much less verbose than they currently are. It's clearly better to write:

val t2: Array<IntArray> = [ [1, 2, 3], [4, 5, 6], [7, 8, 9] ] 

than:

val m2 = arrayOf(intArrayOf(1, 2, 3), intArrayOf(4, 5, 6), intArrayOf(7, 8, 9))

In fact, this is the main reason why I'm in favor of sequence literals at all, as I don't think the present system of using factory functions for single-dimensional sequences is all that bad.

So, in conclusion, I hope you'll put it forward as a KEEP and (FWIW) I'll still support it as it stands even if I'd have preferred arrays to be the default :)

@KyLeggiero
Author

@alanfo I think you're focusing too much on my example usage which was chosen to make the type inference clear. You forget other times when the amount of code is reduced dramatically because the type is not part of the line:

foo.intArrayTypedField = [1, 2, 3]
foo.intArrayTypedField = intArrayOf(1, 2, 3)


functionThatTakesAnIntArray([1, 2, 3])
functionThatTakesAnIntArray(intArrayOf(1, 2, 3))


fun functionThatReturnsAnIntArray_1(): IntArray {
    // do stuff?
    return [1, 2, 3]
}
fun functionThatReturnsAnIntArray_2(): IntArray {
    // do stuff?
    return intArrayOf(1, 2, 3)
}

So the pros for each approach are, as far as I am aware:

List                         | Array (and specialized array types)
Immutability                 | Generally more efficient
Custom implementations       |
Ubiquitous in Kotlin stdlib  |
Easier for beginners         |

So, since it seems the amount of code is only ever equal to or less than the current amount, combined with all the reasons for prioritizing reduced confusion over catering to niche uses, I won't be adopting your approach into this proposal.

In addition, I would hope that, alongside adopting this into the language, stdlib would deprecate (and eventually drop) the current top-level-factory-function-based approach to sequence generation.

On a side note, I also hope we someday get even better inference, so we don't have to specify the generic type, like:

val x: Array = ["a", "b"] // Inferred Array<String>

I know that's not possible now, but it might be someday and that would make my life much better :)

🍻 Thanks for your support and great arguments!

@reitzig

reitzig commented Apr 16, 2018

FWIW, I agree that introducing : as new syntax is not necessary, and I'll add that it's not in line with how Kotlin reads today. Due to its use in typing, : reads as "is a". Here, we want "maps to" or "if lhs then rhs", which is most closely matched by ->. But I honestly think that using pairs via to for dictionary literals is fine.

@KyLeggiero
Author

@reitzig Yeah, I agree. Maybe = would be a better fit, like we use in named function parameters. I'm still ruminating, but I'm close to finalizing. Will edit more this week before making a KEEP.

@alanfo

alanfo commented Apr 18, 2018

@BenLeggiero

I've found the link to the Future Features Survey now and, if you check out feature no. 6, you'll see that JB themselves suggested '=' as the separator for map literals.

Another possibility for you to ruminate on would be to use a prefix to distinguish map literals from sequence literals. If you did this, then you could still use 'to' as the separator for the former.

The hash symbol '#' suggests itself as a suitable prefix (and mnemonic) for maps and I don't think it would clash with anything else. So, for example, you'd have:

val list1 = ['a' to 1, 'b' to 2, 'c' to 3]  // List<Pair<Char, Int>>

val map1 = #['a' to 1, 'b' to 2,  'c' to 3]  // Map<Char, Int>

@ssadedin

Some random thoughts from someone whose opinion should probably not be given a whole lot of weight, since I'm relatively new to Kotlin: as a long-time Groovy, Python and JavaScript user I very much appreciate the concision of Map literals in those languages. They are incredibly useful for creating DSLs, for use in REPL contexts, and in many other situations. I feel that when there's a common syntax shared between many highly similar languages, there's a strong benefit in just using that too, unless it sharply deviates from your principles. And one of the things I'm liking about Kotlin is that it doesn't seem to gratuitously differentiate its syntax from that of other languages: where it makes sense for things to be the same, they mostly are.

Which is all to say, I'd prefer a single character, either : or = as the map character, even if it does mean new syntax / slightly deviating from other conventions used in Kotlin.

@KyLeggiero
Author

@alanfo thank you for that! I don't think I like the = operator being used here because it would be unclear that assignment is not happening. For example:

var foo = "Foo"

var bar = [foo = "Bar"]

This seems, to me, unclear whether you get a map of foo to "Bar", or a list of foo, where foo has the value "Bar".

Using the hash symbol in this way certainly does not clash with current features, but... I would not be confident attempting to claim it for this feature, as it seems more fit for conditional compilation if we ever have that. Good idea, though, to use a special symbol as part of the operator.

@KyLeggiero
Author

@ssadedin I agree with you. Kotlin, among many other things, aims to be easy to learn for those who already know another programming language. It would be silly for us to use a brand new syntax for something so commonplace. That's why I chose the same syntax used in Groovy and Swift, which only barely differs from that used in Python and JS (square vs curly braces).

@alanfo

alanfo commented May 8, 2018

@BenLeggiero I agree with you that ':' seems on the face of it better than '='.

However, there was a development recently in another long running debate - whether to include the conditional (ternary) operator in Kotlin.

I've always been against this, partly because it's unnecessary when we already have the if/else expression, but mainly because '?' signifies something to do with nullability in Kotlin.

I thought that the latter would be the critical point but it turns out that the ':' is the problem! This is because the Kotlin team want to reserve its use for something else such as slices. So, if you do propose it for 'map' literals, you might run up against the same objection.
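
For anyone coming from Java or C, this is what the if/else expression mentioned above looks like in place of a ternary. The function name parity is only for illustration.

```kotlin
// Kotlin's if/else is an expression, so it can produce a value directly,
// where Java would write: n % 2 == 0 ? "even" : "odd"
fun parity(n: Int): String = if (n % 2 == 0) "even" else "odd"

fun main() {
    println(parity(4))
    println(parity(7))
}
```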

@KyLeggiero
Author

Thank you for the insight, @alanfo! That's very useful :)

@KyLeggiero
Author

KyLeggiero commented May 11, 2018

I've released the first draft of the collection literals KEEP proposal. You can view it on my GitHub here:
https://github.com/BenLeggiero/KEEP/blob/collection-literals/proposals/collection-literals.md

I have started going over it and refining it to become a final draft before submitting a pull request. Any comments about it here would be greatly appreciated!

@KyLeggiero
Author

KyLeggiero commented May 13, 2018

This is now in a KEEP proposal: Kotlin/KEEP#112
Please place further discussion there :)
