@nuhil
Created February 24, 2021 20:51
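# Refresh the package index (shell command run from a Colab cell)
!apt-get update
# Download and unpack Spark 3.0.2 prebuilt for Hadoop 2.7.
# wget runs quietly (-q), so a failed download only shows up when tar cannot find the archive.
!wget -q https://mirror.softaculous.com/apache/spark/spark-3.0.2/spark-3.0.2-bin-hadoop2.7.tgz
!tar xf spark-3.0.2-bin-hadoop2.7.tgz
# findspark makes the unpacked Spark distribution importable from Python
!pip install -q findspark

import os
# Optional: set JAVA_HOME explicitly if Spark cannot locate the JDK
# os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-11-openjdk-amd64"
# Point SPARK_HOME at the directory extracted above (Colab's working directory is /content)
os.environ["SPARK_HOME"] = "/content/spark-3.0.2-bin-hadoop2.7"

import findspark
findspark.init()  # uses SPARK_HOME to add pyspark to sys.path

import pyspark
from pyspark.sql import SparkSession
# Start a Spark session, or reuse one that is already running
spark = SparkSession.builder.getOrCreate()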
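
The optional JDK lookup step above hard-codes one JDK path. If that path differs on your runtime, a small helper can search for an installed JDK instead. The following is a minimal sketch, not part of the original gist: `find_jdk_home` is a hypothetical helper name, and `/usr/lib/jvm` as the JDK root is an assumption taken from the commented-out path. Run it before creating the Spark session.

```python
import os
from typing import Optional


def find_jdk_home(jvm_root: str = "/usr/lib/jvm") -> Optional[str]:
    """Return the first directory under jvm_root that contains bin/java, or None.

    Hypothetical helper; the default jvm_root is an assumption based on the
    commented-out JAVA_HOME path in the setup cell.
    """
    if not os.path.isdir(jvm_root):
        return None
    # Sort for a deterministic choice when several JDKs are installed
    for name in sorted(os.listdir(jvm_root)):
        candidate = os.path.join(jvm_root, name)
        if os.path.isfile(os.path.join(candidate, "bin", "java")):
            return candidate
    return None


jdk_home = find_jdk_home()
if jdk_home is not None:
    # setdefault leaves an already configured JAVA_HOME untouched
    os.environ.setdefault("JAVA_HOME", jdk_home)
print("JAVA_HOME =", os.environ.get("JAVA_HOME"))
```

If the printed value is `None`, no JDK was found under the assumed root and Java likely needs to be installed or located manually.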