@TomLous
Last active April 25, 2017 09:12
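// IntegerType is needed for the casts in the select below
import org.apache.spark.sql.types.IntegerType
// Assumes `spark` is an existing SparkSession (as in spark-shell);
// this import enables the 'COLUMN symbol syntax and the encoder used by .as[KvKRecord]
import spark.implicits._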
val kvKDataset = spark.read
  .option("header", true)
  .option("sep", ";")
  .option("ignoreLeadingWhiteSpace", true)
  .option("ignoreTrailingWhiteSpace", true)
  .option("quote", "\"")
  .option("nullValue", "")
  .option("mode", "FAILFAST")
  .csv(path)
kvKDataset.select(
  'DOSSIERNR.as("dossierNummer"),
  'VG_NR.as("vgNummer"),
  'HANDELSN30.as("naamShort"),
  'HANDN1_30.as("naamShortP1"),
  'HANDN2_30.as("naamShortP2"),
  'HANDELSN45.as("naamLong"),
  'ADRES_VA.as("adresV"),
  'PCPLAATSVA.as("postcodePlaatsV"),
  'ADRES_CA.as("adresC"),
  'PCPLAATSCA.as("postcodePlaatsC"),
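  // Cast first, then alias, so the output column name never depends on how Spark names a cast expression
  'WPFT.cast(IntegerType).as("wptf"),
  'SBI.cast(IntegerType).as("sbi")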
)
.as[KvKRecord]
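The snippet converts the selected columns to `KvKRecord`, but the gist does not show that class. The sketch below is one possible shape, inferred only from the aliases in the `select`; the field types are assumptions. Because the CSV is read without schema inference, every column arrives as a string except the two that are cast to `IntegerType`. Those two are modelled as `Option[Int]` here, since `nullValue` is set to `""` and an empty cell would otherwise not fit into a primitive `Int`.

```scala
// Hypothetical definition: the real KvKRecord is not part of the gist.
// Field names must match the aliases used in the select for .as[KvKRecord] to resolve.
final case class KvKRecord(
  dossierNummer: String,
  vgNummer: String,
  naamShort: String,
  naamShortP1: String,
  naamShortP2: String,
  naamLong: String,
  adresV: String,
  postcodePlaatsV: String,
  adresC: String,
  postcodePlaatsC: String,
  wptf: Option[Int], // cast to IntegerType in the select; None when the cell was empty
  sbi: Option[Int]   // cast to IntegerType in the select; None when the cell was empty
)
```

In compiled code the case class typically has to be declared at the top level (not inside a method) so that `spark.implicits._` can derive an encoder for it.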