Created February 9, 2022 02:52
dboyliao/21beb6607f4dfa6ba64237ae7f428bc1
Spark Simple Example
// The simplest possible sbt build file is just one line:
scalaVersion := "2.13.3"
// That is, to create a valid sbt build, all you've got to do is define the
// version of Scala you'd like your project to use.
// ============================================================================
// Lines like the above defining `scalaVersion` are called "settings". Settings
// are key/value pairs. In the case of `scalaVersion`, the key is "scalaVersion"
// and the value is "2.13.3".
// It's possible to define many kinds of settings, such as:
name := "hello-world"
organization := "ch.epfl.scala"
version := "1.0"
// Note that these three settings are optional. They mostly matter if you
// intend to publish your library's binaries to a repository such as Sonatype.
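// Setting values are not limited to strings. As an optional sketch (this line
// is an addition for illustration, not part of the original build), `fork` is a
// Boolean setting; scoped to `run`, it makes sbt launch the application in a
// separate JVM instead of inside the sbt process:
run / fork := true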
// Want to use a published library in your project?
// You can define other libraries as dependencies in your build like this:
libraryDependencies += "org.scala-lang.modules" %% "scala-parser-combinators" % "1.1.2"
libraryDependencies += "org.apache.spark" %% "spark-core" % "3.2.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.2.0"
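// A note on the `%%` operator used above. The commented-out lines below are
// illustrative sketches, not part of the original build. `%%` appends the
// project's Scala binary version to the artifact name, so with a 2.13.x
// scalaVersion the spark-core line resolves the same artifact as:
//   libraryDependencies += "org.apache.spark" % "spark-core_2.13" % "3.2.0"
// Several dependencies can also be added in one go with `++=` and a Seq:
//   libraryDependencies ++= Seq(
//     "org.apache.spark" %% "spark-core" % "3.2.0",
//     "org.apache.spark" %% "spark-sql"  % "3.2.0"
//   )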
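
// Application source: a separate Scala file, not part of the build file above.
import org.apache.spark.sql.{SparkSession, Row, types => T}
import org.apache.log4j.{Logger, Level}
object Main extends App {
  // Assumed setup (the original builder chain is not shown): quiet logs, run locally.
  Logger.getLogger("org").setLevel(Level.ERROR)
  val spark = SparkSession.builder().appName("Spark Simple Example").master("local[*]").getOrCreate()
  val data = List("Hello, world", "I'm running Spark!")
  // Wrap each string in a Row and pair the RDD with an explicit one-column schema.
  val msgDF = spark.createDataFrame(
    spark.sparkContext.makeRDD(data.map(x => Row(x))),
    schema = T.StructType(Array(T.StructField("msg", T.StringType)))
  )
  // getAs needs an explicit type; otherwise Scala infers Nothing and the cast fails.
  msgDF.foreach((row: Row) => println(row.getAs[String]("msg")))
  spark.stop()
}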
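
// An alternative way to build the same one-column DataFrame, sketched here for
// comparison and not taken from the original example. Importing
// `spark.implicits._` lets a local Seq[String] be converted with `toDF`, so no
// explicit Row or StructType is needed. The object name `MainToDF`, the helper
// `messages`, the app name, and the local[*] master are all assumptions made
// for illustration.
object MainToDF extends App {
  import org.apache.spark.sql.{DataFrame, SparkSession}

  // Build a DataFrame with a single string column named "msg".
  def messages(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq("Hello, world", "I'm running Spark!").toDF("msg")
  }

  val spark = SparkSession
    .builder()
    .appName("spark-simple-example-toDF")
    .master("local[*]")
    .getOrCreate()

  // collect() returns the rows to the driver, so the lines are printed in the
  // driver's console rather than wherever the executors happen to run.
  messages(spark).collect().foreach(row => println(row.getString(0)))
  spark.stop()
}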