@yaravind
Last active October 30, 2020 02:43
Submit apps (e.g. SparkPi) to a Spark cluster using the REST API
curl -X POST http://master-host:6066/v1/submissions/create --header "Content-Type:application/json" --data '{
  "action": "CreateSubmissionRequest",
  "appResource": "hdfs://localhost:9000/user/spark-examples_2.11-2.0.0.jar",
  "clientSparkVersion": "2.0.0",
  "appArgs": [ "10" ],
  "environmentVariables": {
    "SPARK_ENV_LOADED": "1"
  },
  "mainClass": "org.apache.spark.examples.SparkPi",
  "sparkProperties": {
    "spark.jars": "hdfs://localhost:9000/user/spark-examples_2.11-2.0.0.jar",
    "spark.driver.supervise": "false",
    "spark.executor.memory": "512m",
    "spark.driver.memory": "512m",
    "spark.submit.deployMode": "cluster",
    "spark.app.name": "SparkPi",
    "spark.master": "spark://master-host:6066"
  }
}'
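The same request can be made from Python. This is a minimal sketch, assuming the standalone master's REST endpoint is reachable at `master-host:6066` (a placeholder, like in the curl example); `submit` is a hypothetical helper name, not part of any Spark API:

```python
import json
from urllib import request

# Placeholder endpoint -- adjust to your cluster's REST port (6066 by default).
MASTER_REST_URL = "http://master-host:6066"

# Same payload as the curl example above, built as a Python dict.
payload = {
    "action": "CreateSubmissionRequest",
    "appResource": "hdfs://localhost:9000/user/spark-examples_2.11-2.0.0.jar",
    "clientSparkVersion": "2.0.0",
    "appArgs": ["10"],
    "environmentVariables": {"SPARK_ENV_LOADED": "1"},
    "mainClass": "org.apache.spark.examples.SparkPi",
    "sparkProperties": {
        "spark.jars": "hdfs://localhost:9000/user/spark-examples_2.11-2.0.0.jar",
        "spark.driver.supervise": "false",
        "spark.executor.memory": "512m",
        "spark.driver.memory": "512m",
        "spark.submit.deployMode": "cluster",
        "spark.app.name": "SparkPi",
        "spark.master": "spark://master-host:6066",
    },
}

def submit(payload):
    """POST the submission and return the server's parsed JSON response.

    A successful response includes a submissionId (driver-...) that the
    kill and status endpoints in the comments below expect.
    """
    req = request.Request(
        MASTER_REST_URL + "/v1/submissions/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```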
@yaravind

yaravind commented Feb 6, 2017

Replace hdfs:// with file:/ if the jar is on the local file system

@yaravind

yaravind commented Feb 6, 2017

To kill a submitted app:

curl -X POST http://spark-cluster-ip:6066/v1/submissions/kill/driver-20170206232033-0003

@yaravind

yaravind commented Feb 6, 2017

To get the status of a submission:

curl http://spark-in-action:6066/v1/submissions/status/driver-20170206232033-0003
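A driver submitted in cluster mode runs asynchronously, so it is common to poll the status endpoint until the driver finishes. A sketch, assuming the status response carries a driverState field (as the standalone master's REST server returns) and that the terminal states listed below are the ones to stop on; both function names are hypothetical helpers:

```python
import json
import time
from urllib import request

# States the standalone master reports once a driver is done
# (assumption based on observed REST responses).
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED", "ERROR"}

def get_status(master_rest_url, submission_id):
    """GET /v1/submissions/status/<id> and return the parsed JSON."""
    url = "%s/v1/submissions/status/%s" % (master_rest_url, submission_id)
    with request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))

def wait_for_driver(master_rest_url, submission_id, poll_secs=5):
    """Poll until the driver reaches a terminal state; return that state."""
    while True:
        state = get_status(master_rest_url, submission_id).get("driverState")
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_secs)
```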

@yaravind

yaravind commented Feb 7, 2017

Need to have "spark.jars": "hdfs://localhost:9000/user/spark-examples_2.11-2.0.0.jar" and

  "environmentVariables": {
    "SPARK_ENV_LOADED": "1"
  }

for this job to run successfully

@yaravind

spark-submit --verbose --master spark://spark-in-action:7077 \
  --class org.apache.spark.examples.SparkPi \
  $SPARK_HOME/examples/jars/spark-examples_2.11-2.0.0.jar

@Hammad-Raza

How can I submit configuration through the API, like

--driver-java-options -Dconfig.file=
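CLI-only flags like --driver-java-options generally map to sparkProperties entries in the REST payload; the property equivalent of this flag is spark.driver.extraJavaOptions. A fragment along these lines should work (the config file path shown is a placeholder):

```json
"sparkProperties": {
  "spark.driver.extraJavaOptions": "-Dconfig.file=/path/to/app.conf"
}
```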

@Hammad-Raza

How to pass application arguments and conf from these APIs?

./bin/spark-submit \
  --class <main-class> \
  --master <master-url> \
  --deploy-mode <deploy-mode> \
  --conf <key>=<value> \
  ... # other options
  <application-jar> \
  [application-arguments]

@mahsa-poorjafari

@Hammad-Raza
Hi, I also need to pass an argument to my spark-job. And this is how I've solved it:

curl -X POST http://master-host:6066/v1/submissions/create --header "Content-Type:application/json" --data '{
  "action": "CreateSubmissionRequest",
  "appArgs": [ "Path/to/my/python/file", "arg1" ],   <= Here you can send your arguments
  "appResource": "Path/to/my/python/file",
  "clientSparkVersion": "2.3.0",
  "environmentVariables": {
    "SPARK_ENV_LOADED": "1"
  },
  "mainClass": "org.apache.spark.examples.SparkPi",
  "sparkProperties": {
    "spark.driver.supervise": "true",
    "spark.app.name": "My app",
    "spark.eventLog.enabled": "true",
    "spark.eventLog.dir": "file:/tmp/spark-events",
    "spark.submit.deployMode": "cluster",
    "spark.master": "spark://master-url:7077",
    "spark.ui.enabled": "true"
  }
}'

Then inside my job which is a python file:

import sys
from pyspark.conf import SparkConf

print("sys.argv=>", sys.argv)

The output is:
sys.argv=> ['Path/to/my/python/file', 'arg1']

You can also see all the Spark config with the following code:

conf = SparkConf()
print(sorted(conf.getAll(), key=lambda p: p[0]))

The output is:
[('spark.app.name', 'myfile.py'), ('spark.driver.cores', '4'), ('spark.driver.extraClassPath', '/jars/mysql-connector-java-8.0.11.jar'), ('spark.driver.supervise', 'true'), ('spark.eventLog.dir', 'file:/tmp/spark-events'), ('spark.eventLog.enabled', 'true'), ('spark.executor.extraClassPath', '/jars/mysql-connector-java-8.0.11.jar'), ('spark.executorEnv.JAVA_HOME', '/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.x86_64'), ('spark.files', 'file:Path/to/my/python/file'), ('spark.master', 'spark://master-url:7077'), ('spark.submit.deployMode', 'client'), ('spark.ui.enabled', 'true')]

Hope this helps
