Alexander Konovalov alexkon

alexkon / DataFrameSuite.scala
Created June 13, 2018 16:41
DataFrameSuite lets you check whether two DataFrames are equal. Use the method assertDataFrameEquals to assert exact equality. When the DataFrames contain doubles or Spark MLlib Vectors, use the method assertDataFrameApproximateEquals to assert that they are approximately equal.
import breeze.numerics.abs
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.{Column, DataFrame, Row}
/**
* Originally created by Umberto on 06/02/2017 (https://gist.github.com/umbertogriffo/112a02848d8be269f23757c9656df908). Added minor fix by alexkon.
*/
object DataFrameSuite {