@ganeshchand
Last active June 2, 2022 21:05
// Initial implementation: count the rows to check whether there is any input data
import org.apache.spark.sql.{DataFrame, Dataset}
val inputStockData: DataFrame = spark.read.json("/path/to/json/files")
val numInputRows = inputStockData.count
val isInputDataEmpty = numInputRows == 0
if (!isInputDataEmpty) {
  // process input data
} else {
  // no input data. Skip processing
}
// Implementation v2: instead of counting rows to check for emptiness, use the isEmpty method on Dataset (available since Spark 2.4.0).
// It only needs to find a single row, so it avoids counting the whole dataset.
val inputStockData: DataFrame = spark.read.json("/path/to/json/files")
if (!inputStockData.isEmpty) {
  // process input data
} else {
  // no input data. Skip processing
}
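// Sketch for Spark versions before 2.4.0, where Dataset.isEmpty is not available (an assumption
// about your environment, not something the original snippet covers): fetching at most one row
// with head(1) answers the same question without counting every row.
// `isEmptyCompat` is a hypothetical helper name used only for illustration.
import org.apache.spark.sql.Dataset

def isEmptyCompat[T](ds: Dataset[T]): Boolean = ds.head(1).isEmpty
// Usage: if (!isEmptyCompat(inputStockData)) { /* process input data */ }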
// Implementation v3: a Scala-idiomatic alternative using pattern matching
val inputStockData: DataFrame = spark.read.json("/path/to/json/files")
inputStockData.isEmpty match {
  case false => // process input data
  case true  => // no input data. Skip processing
}
// BONUS implementation v4: The API is biased toward emptiness. Does isNotEmpty feel more natural to you? Unfortunately, there is no such built-in API.
// But this is where Scala is incredibly powerful: it lets you add extension methods to existing types through an implicit class. Let's do that.
// Note: in compiled code (outside a notebook or REPL), an implicit class must be nested inside an object, class, or trait.
implicit class MyDatasetExtension[T](ds: Dataset[T]) {
  def isNotEmpty: Boolean = !ds.isEmpty
}
val inputStockData: DataFrame = spark.read.json("/path/to/json/files")
inputStockData.isNotEmpty match {
  case true  => // process input data
  case false => // no input data. Skip processing
}