@bkreider
Forked from quasiben/spark-python-version.py
Created June 26, 2018 15:59
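from pyspark import SparkConf
from pyspark import SparkContext

if __name__ == "__main__":
    conf = SparkConf()
    conf.setAppName("version-check")
    sc = SparkContext(conf=conf)
    print(sc.defaultParallelism)

    def noop(x):
        # Runs on the executors. Imports live inside the function so they
        # are resolved on each worker rather than captured from the driver.
        import os
        import socket
        import sys
        # One string per worker: hostname, Python module search path, and
        # environment variable names, separated by spaces.
        return ' '.join([socket.gethostname()] + sys.path + list(os.environ))

    # 100 partitions so the tasks spread across the available executors.
    rdd = sc.parallelize(range(1000), 100)
    # distinct() collapses identical reports, leaving one entry per unique
    # worker environment.
    hosts = rdd.map(noop).distinct().collect()
    print(hosts)
    sc.stop()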