@danielocampo2
Last active January 13, 2023 21:20
multiDistinctBy function for Kotlin: Like stdlib distinctBy but for multiple fields
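fun <T, K> Iterable<T>.multiDistinctBy(vararg selectors: (T) -> K): List<T> {
    require(selectors.isNotEmpty())
    // Keys already seen; each key is the sum of the selected values' hash codes.
    val set = HashSet<Int>()
    val distinct = ArrayList<T>()
    for (element in this) {
        // Note: summing hash codes means different value combinations can share a key
        // (see the comments below), so distinct elements may be dropped.
        val key = selectors.fold(0) { sum, selector ->
            sum.plus(selector(element)?.hashCode() ?: 0)
        }
        if (set.add(key))
            distinct.add(element)
    }
    return distinct
}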

import org.junit.Assert.assertEquals
import org.junit.Test

class MultiDistinctByTest {

    data class Item(val code: String, val name: String?, val value: Int)

    @Test
    fun multiDistinctBy_byTwoStringSelectors_shouldReturnOnlyDistinctValues() {
        val someItems = listOf(
            Item("1", "Item 1", 34),
            Item("1", "Item 1", 36),
            Item("2", "Item 1", 38),
            Item("2", "Item 2", 40),
            Item("2", "Item 3", 42),
            Item("2", "", 44),
            Item("2", "", 46)
        )
        val expected = listOf(
            Item("1", "Item 1", 34),
            Item("2", "Item 1", 38),
            Item("2", "Item 2", 40),
            Item("2", "Item 3", 42),
            Item("2", "", 44)
        )
        val distinct = someItems.multiDistinctBy({ it.code }, { it.name })
        assertEquals(expected, distinct)
    }

    @Test
    fun multiDistinctBy_withNullValue_shouldIncludeNullValue() {
        val someItems = listOf(
            Item("1", "Item 1", 34),
            Item("1", "Item 1", 36),
            Item("2", null, 38),
            Item("2", "Item 2", 40)
        )
        val expected = listOf(
            Item("1", "Item 1", 34),
            Item("2", null, 38),
            Item("2", "Item 2", 40)
        )
        val distinct = someItems.multiDistinctBy({ it.code }, { it.name })
        assertEquals(expected, distinct)
    }

    @Test
    fun multiDistinctBy_withTwoNullValuesInSelectorField_shouldReturnOnlyOneNullValue() {
        val someItems = listOf(
            Item("1", "Item 1", 34),
            Item("1", "Item 1", 36),
            Item("2", null, 38),
            Item("2", null, 40)
        )
        val expected = listOf(
            Item("1", "Item 1", 34),
            Item("2", null, 38)
        )
        val distinct = someItems.multiDistinctBy({ it.code }, { it.name })
        assertEquals(expected, distinct)
    }

    @Test
    fun multiDistinctBy_withStringAndNumeric_shouldReturnDistinctByStringAndNumericFields() {
        val someItems = listOf(
            Item("1", "Item 1", 34),
            Item("1", "Item 1", 34),
            Item("2", null, 38),
            Item("2", null, 40)
        )
        val expected = listOf(
            Item("1", "Item 1", 34),
            Item("2", null, 38),
            Item("2", null, 40)
        )
        val distinct = someItems.multiDistinctBy({ it.code }, { it.value })
        assertEquals(expected, distinct)
    }

    @Test
    fun multiDistinctBy_allFieldsAreDifferent_shouldReturnOriginalList() {
        val someItems = listOf(
            Item("1", "Item 1", 34),
            Item("1", "Item 2", 34),
            Item("2", null, 38),
            Item("2", null, 40)
        )
        val distinct = someItems.multiDistinctBy({ it.code }, { it.name }, { it.value })
        assertEquals(someItems, distinct)
    }
}

Intex32 commented Jan 9, 2023

Hey!
I accidentally came across this code snippet while trying to find a solution for distinct by multiple fields.
This function behaves incorrectly in a special case I found.

Given the following code:

data class Cookie(
    val name: String,
    val size: Int,
)

val cookies = listOf(
    Cookie("CookieA", 10),
    Cookie("CookieB", 9),
)
val distinctCookies = cookies.multiDistinctBy({ it.name }, { it.size })
val cookieHashes = cookies.map { it.name.hashCode() + it.size.hashCode() }

println(cookies)
println(distinctCookies)
println(cookieHashes)
println("Hashes Identical: ${cookieHashes.distinct().size == 1}")

Console output:

[Cookie(name=CookieA, size=10), Cookie(name=CookieB, size=9)]
[Cookie(name=CookieA, size=10)]
[-1678124473, -1678124473]
Hashes Identical: true

This is because you're summing the individual hashes: going from CookieA to CookieB, the hash of "name" increases by one whereas the hash of "size" decreases by one, so both sums are identical.
I came across this issue with real data, so it's not just a theoretical edge case.
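
To make the arithmetic concrete, here is a minimal sketch of why the two keys collide. It assumes the JVM, where `String.hashCode()` weights the last character by 1 and `Int.hashCode()` is the value itself; `summedHash` is a hypothetical helper that mirrors what the `fold` in the gist computes for two selectors.

```kotlin
// Hypothetical helper: the same key the gist's fold produces for (name, size).
fun summedHash(name: String, size: Int): Int = name.hashCode() + size.hashCode()

fun main() {
    // The names differ only in their last character ('A' vs 'B'),
    // so their hash codes differ by exactly 1.
    println("CookieB".hashCode() - "CookieA".hashCode()) // prints 1

    // The sizes differ by 1 in the opposite direction, so the sums cancel out.
    println(summedHash("CookieA", 10) == summedHash("CookieB", 9)) // prints true
}
```

Any pair of elements whose per-field hash differences cancel out will be treated as duplicates in the same way.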

Btw, this function signature would be safer, because it requires at least two selectors at compile time instead of checking at runtime:

fun <T, K> Iterable<T>.distinctBy(selector0: (T) -> K, selector1: (T) -> K, vararg selectorsN: (T) -> K): List<T> {
    val selectors = listOf(selector0, selector1, *selectorsN)
    TODO()
}

I don't think there is an easy general solution to this.
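
For what it's worth, one way the `TODO()` above could be filled in is to compare the selected values themselves instead of a sum of their hashes. The sketch below is only an illustration under assumptions: the name `multiDistinctByKeys` is hypothetical (chosen to avoid clashing with the stdlib `distinctBy`), and the selectors return `Any?` rather than a shared `K` so that mixed field types need no type inference tricks.

```kotlin
// Hypothetical name; keeps the first element for each distinct combination of selected values.
fun <T> Iterable<T>.multiDistinctByKeys(
    first: (T) -> Any?,
    second: (T) -> Any?,
    vararg rest: (T) -> Any?
): List<T> {
    val selectors = listOf(first, second, *rest)
    // Lists compare element by element, so two keys are equal only if every selected value is equal.
    val seen = HashSet<List<Any?>>()
    val distinct = ArrayList<T>()
    for (element in this) {
        val key = selectors.map { it(element) }
        if (seen.add(key)) distinct.add(element)
    }
    return distinct
}

data class Cookie(val name: String, val size: Int)

fun main() {
    val cookies = listOf(
        Cookie("CookieA", 10),
        Cookie("CookieB", 9),
        Cookie("CookieA", 10)
    )
    // prints [Cookie(name=CookieA, size=10), Cookie(name=CookieB, size=9)]
    println(cookies.multiDistinctByKeys({ it.name }, { it.size }))
}
```

Hash collisions between keys are still possible inside the `HashSet`, but they are resolved by `equals`, so only elements that really match on every field are dropped.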

@danielocampo2
Author

Hi @Intex32, thanks, this was a blast from the past; it's been so long since I've seen this gist.
I will take a look and see if there is any way this can be fixed, thanks for the suggestion.


Intex32 commented Jan 13, 2023

@danielocampo2 a "blast from the past" xD very nice
Don't feel obligated to fix this, however. I am now using distinctBy { listOf(...) }, and it appears to be working for the time being.
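
A minimal sketch of that workaround, using only the stdlib `distinctBy` and reusing the `Cookie` class from the earlier comment (the third list entry is an added duplicate for illustration):

```kotlin
data class Cookie(val name: String, val size: Int)

fun main() {
    val cookies = listOf(
        Cookie("CookieA", 10),
        Cookie("CookieB", 9),
        Cookie("CookieA", 10) // exact duplicate of the first entry
    )
    // The key is a list of the selected fields; lists are compared element by element,
    // so CookieA/10 and CookieB/9 stay distinct even though their summed hashes match.
    val distinctCookies = cookies.distinctBy { listOf(it.name, it.size) }
    // prints [Cookie(name=CookieA, size=10), Cookie(name=CookieB, size=9)]
    println(distinctCookies)
}
```

For exactly two fields, a `Pair` key such as `distinctBy { it.name to it.size }` should behave the same way.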
