@TomLous
Created April 2, 2017 08:54
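// Assumed imports: the snippet does not show them. `spark` is taken to be the active SparkSession.
import org.apache.spark.ml.linalg.{Vector, Vectors}
import spark.implicits._

// Self-join: pair records that share the same address and postcode/place values
// but carry different dossier numbers, i.e. candidate duplicates.
val comparableDataset = kvKDataset.as("l")
  .joinWith(
    kvKDataset.as("r"),
    $"l.adresV" === $"r.adresV" &&
      $"l.postcodePlaatsV" === $"r.postcodePlaatsV" &&
      $"l.dossierNummer" =!= $"r.dossierNummer"
  )
  // Attach the pairwise distances of each pair as a dense feature vector.
  .map { case (left, right) => (left, right, Vectors.dense(left.distance(right).toArray)) }
  .toDF("left", "right", "features")
  .as[(KvKRecord, KvKRecord, Vector)]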