Scala 3 and Apache Spark

1. Install Coursier, the Scala installer

macOS

brew install coursier/formulas/coursier && cs setup

Linux

curl -fL https://github.com/coursier/launchers/raw/master/cs-x86_64-pc-linux.gz | gzip -d > cs && chmod +x cs && ./cs setup

Windows

https://github.com/coursier/launchers/raw/master/cs-x86_64-pc-win32.zip

2. Install Java 11 (Spark supports Java 8 and 11)

Install:

cs java --jvm 11 --setup

To see where JAVA_HOME is located:

cs java --jvm 11 --env
The default directory for JVMs managed by Coursier is OS-specific:
Linux: ~/.cache/coursier/jvm
macOS: ~/Library/Caches/Coursier/jvm
Windows: %LOCALAPPDATA%\Coursier\Cache\jvm, typically looking like C:\Users\Alex\AppData\Local\Coursier\Cache\jvm
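
On Linux and macOS, the --env output can be evaluated directly so the current shell picks up JAVA_HOME (a minimal sketch of standard Coursier usage, assuming a POSIX shell):

eval "$(cs java --jvm 11 --env)"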

3. Create a new project

sbt new scala/scala3.g8

4. Open the project with IntelliJ IDEA or VS Code

Using IntelliJ IDEA

Download and install IntelliJ IDEA Community Edition. Install the Scala plugin by following the instructions on how to install IntelliJ plugins. Open the build.sbt file, then choose Open as a project.

Using VS Code with Metals

Download VS Code. Install the Metals extension from the Marketplace. Next, open the directory containing the build.sbt file (the project directory created in step 3). When prompted, select Import build.

5. Add Spark dependencies to build.sbt

val sparkVersion = "3.3.1"

// Spark is published only for Scala 2.12 and 2.13, so pull in the
// Scala 2.13 artifacts, which are binary compatible with Scala 3
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion,
  "org.apache.spark" %% "spark-sql-kafka-0-10" % sparkVersion
).map(_.cross(CrossVersion.for3Use2_13))
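
For context, a complete minimal build.sbt could look like the following sketch (the project name and Scala version here are assumptions; keep whatever the scala3.g8 template generated):

val sparkVersion = "3.3.1"

lazy val root = project
  .in(file("."))
  .settings(
    name := "hello-spark",    // hypothetical name, any will do
    scalaVersion := "3.2.1",  // any recent Scala 3 release
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % sparkVersion,
      "org.apache.spark" %% "spark-sql" % sparkVersion
    ).map(_.cross(CrossVersion.for3Use2_13))
  )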

6. Create a new file HelloSpark.scala and paste the following code

import org.apache.spark.sql.SparkSession

@main def helloSpark() = {
  // Start a local Spark session using all available CPU cores
  val spark = SparkSession.builder().appName("helloSpark").master("local[*]").getOrCreate()
  spark.sparkContext.setLogLevel("WARN")

  // Read the project's own build.sbt as a text DataFrame and print it
  spark.read.text("build.sbt").show()

  // Stop the session so the JVM can exit cleanly
  spark.stop()
}
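
Building on the same setup, here is a slightly larger sketch (the wordCount name and column handling are illustrative, not from the original gist). It counts words in the file using the untyped DataFrame API, which sidesteps the Scala 2 encoder derivation that the typed Dataset API relies on:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{explode, split, col}

@main def wordCount() = {
  val spark = SparkSession.builder().appName("wordCount").master("local[*]").getOrCreate()
  spark.sparkContext.setLogLevel("WARN")

  // Split each line into words, one word per row, then count occurrences
  spark.read.text("build.sbt")
    .select(explode(split(col("value"), "\\s+")).as("word"))
    .groupBy("word")
    .count()
    .orderBy(col("count").desc)
    .show()

  spark.stop()
}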

7. Run using the IDE's run button or from the command line

sbt "runMain helloSpark"