@yortuc
Created May 15, 2020 04:52
Test Scala
import org.scalatest.flatspec.AnyFlatSpec
// assumes the MyProject object lives in the com.yortuc package
import com.yortuc.MyProject

class MyIntegrationTest extends AnyFlatSpec with SparkSessionTestWrapper {

  "myProject" should "read the data from Parquet and save the output correctly" in {
    // run the app as it is,
    // in this example with two CLI parameters for the input and output paths
    val inputFilePath = "./input-data.parquet"
    val outputPath = "./output.parquet"
    val expectedOutputPath = "./expected-output.parquet"

    MyProject.main(Array(inputFilePath, outputPath))

    // read the output of the application
    val output = spark.read.parquet(outputPath)

    // read the expected results
    val expected = spark.read.parquet(expectedOutputPath)

    // compare them.
    // you can use a library such as spark-fast-tests to compare two DataFrames, or use this naive approach.
    // note: except is a set difference, so it ignores row order and duplicate rows; check both directions.
    val unexpectedRows = output.except(expected)
    val missingRows = expected.except(output)

    // there should be no differing rows in either direction
    assert(unexpectedRows.count() == 0)
    assert(missingRows.count() == 0)
  }
}
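
The test above mixes in SparkSessionTestWrapper, which the gist does not show, and compares DataFrames with except, which drops duplicate rows. The sketch below is one possible way to fill both gaps, not the author's own code. The trait body, the `local[*]` master, the app name, and the DataFrameComparison helper are all assumptions made for illustration. The helper relies on `exceptAll` and `isEmpty`, both available on Dataset since Spark 2.4.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical definition: the gist uses this trait but does not include it.
trait SparkSessionTestWrapper {
  // lazy, so the local session starts only when a test first touches `spark`
  lazy val spark: SparkSession = SparkSession
    .builder()
    .master("local[*]")            // assumption: run Spark inside the test JVM
    .appName("integration-test")   // assumption: any name works here
    .getOrCreate()
}

// Hypothetical helper, not part of the gist.
object DataFrameComparison {
  // True when both DataFrames hold the same rows with the same multiplicity,
  // regardless of row order. exceptAll keeps duplicates, unlike except.
  // Columns are matched by position, so both sides need compatible schemas.
  def sameRows(actual: DataFrame, expected: DataFrame): Boolean =
    actual.exceptAll(expected).isEmpty && expected.exceptAll(actual).isEmpty
}
```

With this in place, the final assertions in the test could be written as assert(DataFrameComparison.sameRows(output, expected)). A library such as spark-fast-tests gives more readable failure messages, so the helper is mainly useful when adding a dependency is not an option.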