@bigorn0
Last active August 24, 2016 21:08
Run_spark_submit_from_scala.md

Running Spark jobs from Scala code

http://henningpetersen.com/post/22/running-apache-spark-jobs-from-applications

Example taken from Spark's own test classes (runSparkSubmit in SparkSubmitSuite):

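  // Depends on Spark test internals: Utils is org.apache.spark.util.Utils (private[spark]),
  // fail and failAfter come from ScalaTest, `60 seconds` needs org.scalatest.time.SpanSugar._,
  // and File is java.io.File.
  // NOTE: This is an expensive operation in terms of time (10 seconds+). Use sparingly.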
  private def runSparkSubmit(args: Seq[String]): Unit = {
    val sparkHome = sys.props.getOrElse("spark.test.home", fail("spark.test.home is not set!"))
    val process = Utils.executeCommand(
      Seq("./bin/spark-submit") ++ args,
      new File(sparkHome),
      Map("SPARK_TESTING" -> "1", "SPARK_HOME" -> sparkHome))

    try {
      val exitCode = failAfter(60 seconds) { process.waitFor() }
      if (exitCode != 0) {
        fail(s"Process returned with exit code $exitCode. See the log4j logs for more detail.")
      }
    } finally {
      // Ensure we still kill the process in case it timed out
      process.destroy()
    }
  }
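
The snippet above only compiles inside Spark's own test sources, because it leans on Spark's internal Utils helper and on ScalaTest. As a rough sketch of the same idea for application code, the version below uses only the JDK's ProcessBuilder. The names SparkSubmitRunner and runCommand are made up for this illustration and are not part of Spark, and the SPARK_TESTING variable is left out on the assumption that it only matters for Spark's own tests.

```scala
import java.io.File
import java.util.concurrent.TimeUnit

// Hypothetical helper (not part of Spark): launches spark-submit as a child process.
object SparkSubmitRunner {

  /** Runs `command` in `workDir` with `extraEnv` added to the environment.
    * Returns the exit code, or throws if the process exceeds the timeout. */
  def runCommand(
      command: Seq[String],
      workDir: File,
      extraEnv: Map[String, String],
      timeoutSeconds: Long): Int = {
    val builder = new ProcessBuilder(command: _*)
    builder.directory(workDir)
    // Forward the child's stdout/stderr to ours so its output pipes cannot fill up and block it.
    builder.inheritIO()
    extraEnv.foreach { case (key, value) => builder.environment().put(key, value) }

    val process = builder.start()
    try {
      // waitFor(timeout, unit) returns false if the process is still running at the deadline.
      if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
        throw new RuntimeException(
          s"Timed out after $timeoutSeconds seconds: ${command.mkString(" ")}")
      }
      process.exitValue()
    } finally {
      // Ensure we still kill the process in case it timed out
      process.destroy()
    }
  }

  /** Mirrors runSparkSubmit above: fails on a non-zero exit code. */
  def runSparkSubmit(args: Seq[String], sparkHome: String, timeoutSeconds: Long = 60): Unit = {
    val exitCode = runCommand(
      Seq("./bin/spark-submit") ++ args,
      new File(sparkHome),
      Map("SPARK_HOME" -> sparkHome),
      timeoutSeconds)
    if (exitCode != 0) {
      throw new RuntimeException(s"spark-submit returned with exit code $exitCode")
    }
  }
}
```

A call might look like SparkSubmitRunner.runSparkSubmit(Seq("--class", "com.example.Main", "--master", "local[2]", "/path/to/app.jar"), "/opt/spark"), where the class name, jar path and Spark home are placeholders. Like the original, this relies on the relative ./bin/spark-submit path being resolved against the working directory, which holds on typical Unix JVMs but is platform dependent.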