Scala 3 and Apache Spark

1. Install Coursier, the Scala installer

macOS

brew install coursier/formulas/coursier && cs setup

Linux

curl -fL https://github.com/coursier/launchers/raw/master/cs-x86_64-pc-linux.gz | gzip -d > cs && chmod +x cs && ./cs setup

Windows

https://github.com/coursier/launchers/raw/master/cs-x86_64-pc-win32.zip

2. Install Java 11 (Spark supports Java 8 and 11)

Install:

cs java --jvm 11 --setup

To see where JAVA_HOME is located:

cs java --jvm 11 --env
The default directory for JVMs managed by Coursier is OS-specific:
Linux: ~/.cache/coursier/jvm
macOS: ~/Library/Caches/Coursier/jvm
Windows: %LOCALAPPDATA%\Coursier\Cache\jvm, typically looking like C:\Users\Alex\AppData\Local\Coursier\Cache\jvm
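
On Linux and macOS, the --env output can be evaluated directly so the current shell picks up JAVA_HOME (a minimal sketch of standard Coursier usage, assuming a POSIX shell):

eval "$(cs java --jvm 11 --env)"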

3. Create a new project

sbt new scala/scala3.g8

4. Open the project with IntelliJ IDEA or VS Code

Using IntelliJ IDEA

Download and install IntelliJ IDEA Community Edition. Install the Scala plugin by following the instructions on how to install IntelliJ plugins. Open the build.sbt file, then choose Open as a project.

Using VS Code with Metals

Download VS Code. Install the Metals extension from the Marketplace. Next, open the directory containing the build.sbt file (the project directory created in step 3). When prompted, select Import build.

5. Add Spark dependencies to build.sbt

val sparkVersion = "3.3.1"

// Spark is published only for Scala 2.12 and 2.13, so pull in the
// Scala 2.13 artifacts, which are binary compatible with Scala 3
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion,
  "org.apache.spark" %% "spark-sql-kafka-0-10" % sparkVersion
).map(_.cross(CrossVersion.for3Use2_13))
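
For context, a complete minimal build.sbt could look like the following sketch (the project name and Scala version here are assumptions; keep whatever the scala3.g8 template generated):

val sparkVersion = "3.3.1"

lazy val root = project
  .in(file("."))
  .settings(
    name := "hello-spark",    // hypothetical name, any will do
    scalaVersion := "3.2.1",  // any recent Scala 3 release
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % sparkVersion,
      "org.apache.spark" %% "spark-sql" % sparkVersion
    ).map(_.cross(CrossVersion.for3Use2_13))
  )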

6. Create a new file HelloSpark.scala and paste the following code

import org.apache.spark.sql.SparkSession

@main def helloSpark() = {
  // Start a local Spark session using all available CPU cores
  val spark = SparkSession.builder().appName("helloSpark").master("local[*]").getOrCreate()
  spark.sparkContext.setLogLevel("WARN")

  // Read the project's own build.sbt as a text DataFrame and print it
  spark.read.text("build.sbt").show()

  // Stop the session so the JVM can exit cleanly
  spark.stop()
}
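
Building on the same setup, here is a slightly larger sketch (the wordCount name and column handling are illustrative, not from the original gist). It counts words in the file using the untyped DataFrame API, which sidesteps the Scala 2 encoder derivation that the typed Dataset API relies on:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{explode, split, col}

@main def wordCount() = {
  val spark = SparkSession.builder().appName("wordCount").master("local[*]").getOrCreate()
  spark.sparkContext.setLogLevel("WARN")

  // Split each line into words, one word per row, then count occurrences
  spark.read.text("build.sbt")
    .select(explode(split(col("value"), "\\s+")).as("word"))
    .groupBy("word")
    .count()
    .orderBy(col("count").desc)
    .show()

  spark.stop()
}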

7. Run using the IDE's run button or from the command line

sbt "runMain helloSpark"