@quasiben
Last active June 26, 2018 15:59
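"""Report the hostname, sys.path, and environment variable names seen by Spark executors."""
import os
import socket
import sys

from pyspark import SparkConf, SparkContext


def noop(x):
    # Runs on an executor; the input element x is ignored.
    # Iterating os.environ yields variable names only, not their values.
    return " ".join([socket.gethostname(), " ".join(sys.path), " ".join(os.environ)])


if __name__ == "__main__":
    conf = SparkConf()
    conf.setAppName("version-check")
    sc = SparkContext(conf=conf)
    print(sc.defaultParallelism)

    # Spread 1000 elements over 100 partitions so the tasks land on many executors.
    rdd = sc.parallelize(range(1000), 100)
    # distinct() leaves one entry per unique host/path/environment combination.
    hosts = rdd.map(noop).distinct().collect()
    print(hosts)
    sc.stop()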
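
The script above returns one concatenated string per executor, which is hard to compare across hosts. A minimal sketch of a structured alternative follows, assuming you would rather collect tuples and group them by hostname. The names `probe` and `group_by_host` are hypothetical and not part of the original gist. The sketch runs locally without Spark; on a cluster, `probe` could be passed to `rdd.map` in place of `noop`.

```python
import os
import socket
import sys
from collections import defaultdict


def probe(_):
    """Return (hostname, sys.path, sorted env var names) as a hashable tuple."""
    return (socket.gethostname(), tuple(sys.path), tuple(sorted(os.environ)))


def group_by_host(records):
    """Map each hostname to the set of distinct (sys.path, env names) pairs seen on it."""
    by_host = defaultdict(set)
    for host, path, env_names in records:
        by_host[host].add((path, env_names))
    return dict(by_host)


if __name__ == "__main__":
    # Local stand-in for the cluster run; with Spark the records would come
    # from rdd.map(probe).distinct().collect() instead (an assumption, untested here).
    records = [probe(i) for i in range(3)]
    for host, variants in group_by_host(records).items():
        print(host, len(variants))
```

A host that maps to more than one variant has executors running with different `sys.path` entries or different environment variable names, which is the kind of mismatch a version check is meant to surface.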