@arturmkrtchyan
Last active August 7, 2023 18:55
Apache Spark Hidden REST API
curl http://spark-cluster-ip:6066/v1/submissions/status/driver-20151008145126-0000

{
  "action" : "SubmissionStatusResponse",
  "driverState" : "FINISHED",
  "serverSparkVersion" : "1.5.0",
  "submissionId" : "driver-20151008145126-0000",
  "success" : true,
  "workerHostPort" : "192.168.3.153:46894",
  "workerId" : "worker-20151007093409-192.168.3.153-46894"
}
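The driverState field in the status response can be polled to wait for a job to finish. Below is a minimal sketch, not part of the original gist. The function names, the SPARK_REST_URL variable, the 5-second interval, and the set of states treated as "still running" are assumptions for illustration. It parses the response with sed to avoid extra dependencies.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, SPARK_REST_URL and the polling interval are assumptions.

# Read a SubmissionStatusResponse on stdin and print the value of "driverState".
driver_state() {
  sed -n 's/.*"driverState"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Poll the status endpoint until the driver is no longer in a state that is
# assumed to mean "still in progress". An empty state (for example when curl
# fails) also ends the loop.
wait_for_driver() {
  local submission_id="$1"
  local base_url="${SPARK_REST_URL:-http://spark-cluster-ip:6066}"
  local state
  while true; do
    state="$(curl -s "$base_url/v1/submissions/status/$submission_id" | driver_state)"
    echo "driverState: $state"
    case "$state" in
      SUBMITTED|RUNNING|RELAUNCHING) sleep 5 ;;
      *) break ;;
    esac
  done
}
```

Usage, assuming the file is sourced first: wait_for_driver driver-20151008145126-0000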
curl -X POST http://spark-cluster-ip:6066/v1/submissions/kill/driver-20151008145126-0000

{
  "action" : "KillSubmissionResponse",
  "message" : "Kill request for driver-20151008145126-0000 submitted",
  "serverSparkVersion" : "1.5.0",
  "submissionId" : "driver-20151008145126-0000",
  "success" : true
}
curl -X POST http://spark-cluster-ip:6066/v1/submissions/create --header "Content-Type:application/json;charset=UTF-8" --data '{
  "action" : "CreateSubmissionRequest",
  "appArgs" : [ "myAppArgument1" ],
  "appResource" : "file:/myfilepath/spark-job-1.0.jar",
  "clientSparkVersion" : "1.5.0",
  "environmentVariables" : {
    "SPARK_ENV_LOADED" : "1"
  },
  "mainClass" : "com.mycompany.MyJob",
  "sparkProperties" : {
    "spark.jars" : "file:/myfilepath/spark-job-1.0.jar",
    "spark.driver.supervise" : "false",
    "spark.app.name" : "MyJob",
    "spark.eventLog.enabled" : "true",
    "spark.submit.deployMode" : "cluster",
    "spark.master" : "spark://spark-cluster-ip:6066"
  }
}'

{
  "action" : "CreateSubmissionResponse",
  "message" : "Driver successfully submitted as driver-20151008145126-0000",
  "serverSparkVersion" : "1.5.0",
  "submissionId" : "driver-20151008145126-0000",
  "success" : true
}
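The submissionId returned by the create call is what the status and kill endpoints expect, so a script usually needs to capture it. Below is a rough sketch, not part of the original gist. The function names, the SPARK_REST_URL variable, and the payload file name in the usage line are hypothetical. The payload file is assumed to contain a CreateSubmissionRequest like the one above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and SPARK_REST_URL are assumptions.

# Read a CreateSubmissionResponse on stdin and print the value of "submissionId".
submission_id() {
  sed -n 's/.*"submissionId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Succeed only if the response on stdin contains "success" : true.
submission_ok() {
  grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# POST the JSON payload stored in a file and print the resulting submission ID.
# On failure, print the raw response to stderr and return non-zero.
submit_job() {
  local payload_file="$1"
  local base_url="${SPARK_REST_URL:-http://spark-cluster-ip:6066}"
  local response
  response="$(curl -s -X POST "$base_url/v1/submissions/create" \
    --header "Content-Type:application/json;charset=UTF-8" \
    --data @"$payload_file")"
  if printf '%s\n' "$response" | submission_ok; then
    printf '%s\n' "$response" | submission_id
  else
    printf '%s\n' "$response" >&2
    return 1
  fi
}
```

Usage, assuming the file is sourced first and create-request.json is your own payload file: id="$(submit_job create-request.json)"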
LasseJacobs commented Feb 21, 2023

Yes, I was able to get it working in the end. I don't remember exactly, but I think I needed to add "spark.driver.supervise": "true". Here is my full shell script to try it out:

curl -X POST http://cluster:6066/v1/submissions/create --header "Content-Type:application/json;charset=UTF-8" --data '{
  "appResource": "file:///path/to/file.jar",
  "sparkProperties": {
    "spark.executor.memory": "2g",
    "spark.master": "spark://spark-master:7077",
    "spark.driver.memory": "2g",
    "spark.driver.cores": "1",
    "spark.eventLog.enabled": "false",
    "spark.app.name": "Spark REST API - PI",
    "spark.submit.deployMode": "cluster",
    "spark.jars" : "file:///opt/bitnami/spark/apps/dwh-plumber-1.0.jar",
    "spark.driver.supervise": "true"
  },
  "clientSparkVersion": "3.2.0",
  "mainClass": "App",
  "environmentVariables": {
    "SPARK_ENV_LOADED": "1"
  },
  "action": "CreateSubmissionRequest",
  "appArgs": [ "" ]
}'

Let me know whether this works for you. If it doesn't, I will remove it, because I am no longer sure this was the only thing we had to do to get it working.

@LasseJacobs

Also, the REST API has to be enabled by setting spark.master.rest.enabled to true. Here is an example config:

apiVersion: v1
kind: ConfigMap
metadata:
  name: spark-master-config
data:
  spark-defaults.conf: |
    spark.master.rest.enabled true
    spark.driver.host spark-master
    spark.driver.port 7077

and then mount the volume:

volumeMounts:
  - name: config-volume
    mountPath: /opt/bitnami/spark/conf/spark-defaults.conf
    subPath: spark-defaults.conf
