@csbond007
Created October 24, 2016 23:37
[ksaha@mesos101 SampleApp]$ spark-submit --class "SampleApp" --master mesos://zk://10.10.40.138:2181/mesos --jars lib/spark-cassandra-connector-1.6.1-s_2.10.jar,lib/cassandra-driver-core-3.1.1.jar, target/scala-2.10/sampleapp_2.10-1.0.jar
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/10/24 19:34:25 INFO SparkContext: Running Spark version 1.6.2
16/10/24 19:34:25 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/10/24 19:34:25 INFO SecurityManager: Changing view acls to: ksaha
16/10/24 19:34:25 INFO SecurityManager: Changing modify acls to: ksaha
16/10/24 19:34:25 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(ksaha); users with modify permissions: Set(ksaha)
16/10/24 19:34:25 INFO Utils: Successfully started service 'sparkDriver' on port 34208.
16/10/24 19:34:26 INFO Slf4jLogger: Slf4jLogger started
16/10/24 19:34:26 INFO Remoting: Starting remoting
16/10/24 19:34:26 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.10.40.138:36863]
16/10/24 19:34:26 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 36863.
16/10/24 19:34:26 INFO SparkEnv: Registering MapOutputTracker
16/10/24 19:34:26 INFO SparkEnv: Registering BlockManagerMaster
16/10/24 19:34:26 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-53f70884-dafc-45ea-a9b6-02cfa2ed7443
16/10/24 19:34:26 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
16/10/24 19:34:26 INFO SparkEnv: Registering OutputCommitCoordinator
16/10/24 19:34:26 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/10/24 19:34:26 INFO SparkUI: Started SparkUI at http://10.10.40.138:4040
16/10/24 19:34:26 INFO HttpFileServer: HTTP File server directory is /tmp/spark-a9b2b831-26cb-4397-b51c-7ef257cb9d5b/httpd-d6f73cb0-1cf5-4bdb-9832-0387c365479c
16/10/24 19:34:26 INFO HttpServer: Starting HTTP Server
16/10/24 19:34:26 INFO Utils: Successfully started service 'HTTP file server' on port 38223.
16/10/24 19:34:26 INFO SparkContext: Added JAR file:/home/ksaha/spark_sbt_eclipse_cassandra/SampleApp/lib/spark-cassandra-connector-1.6.1-s_2.10.jar at http://10.10.40.138:38223/jars/spark-cassandra-connector-1.6.1-s_2.10.jar with timestamp 1477352066701
16/10/24 19:34:26 INFO SparkContext: Added JAR file:/home/ksaha/spark_sbt_eclipse_cassandra/SampleApp/lib/cassandra-driver-core-3.1.1.jar at http://10.10.40.138:38223/jars/cassandra-driver-core-3.1.1.jar with timestamp 1477352066705
16/10/24 19:34:26 INFO SparkContext: Added JAR file:/home/ksaha/spark_sbt_eclipse_cassandra/SampleApp/target/scala-2.10/sampleapp_2.10-1.0.jar at http://10.10.40.138:38223/jars/sampleapp_2.10-1.0.jar with timestamp 1477352066706
2016-10-24 19:34:26,820:17756(0x7fe64da94700):ZOO_INFO@log_env@726: Client environment:zookeeper.version=zookeeper C client 3.4.8
2016-10-24 19:34:26,820:17756(0x7fe64da94700):ZOO_INFO@log_env@730: Client environment:host.name=mesos101.itp.objectfrontier.com
2016-10-24 19:34:26,820:17756(0x7fe64da94700):ZOO_INFO@log_env@737: Client environment:os.name=Linux
2016-10-24 19:34:26,820:17756(0x7fe64da94700):ZOO_INFO@log_env@738: Client environment:os.arch=3.10.0-327.36.1.el7.x86_64
2016-10-24 19:34:26,820:17756(0x7fe64da94700):ZOO_INFO@log_env@739: Client environment:os.version=#1 SMP Sun Sep 18 13:04:29 UTC 2016
I1024 19:34:26.820709 17856 sched.cpp:226] Version: 1.0.1
2016-10-24 19:34:26,820:17756(0x7fe64da94700):ZOO_INFO@log_env@747: Client environment:user.name=ksaha
2016-10-24 19:34:26,820:17756(0x7fe64da94700):ZOO_INFO@log_env@755: Client environment:user.home=/home/ksaha
2016-10-24 19:34:26,820:17756(0x7fe64da94700):ZOO_INFO@log_env@767: Client environment:user.dir=/home/ksaha/spark_sbt_eclipse_cassandra/SampleApp
2016-10-24 19:34:26,820:17756(0x7fe64da94700):ZOO_INFO@zookeeper_init@800: Initiating client connection, host=10.10.40.138:2181 sessionTimeout=10000 watcher=0x7fe653017300 sessionId=0 sessionPasswd=<null> context=0x7fe6c8006dc0 flags=0
2016-10-24 19:34:26,826:17756(0x7fe64a98d700):ZOO_INFO@check_events@1728: initiated connection to server [10.10.40.138:2181]
2016-10-24 19:34:26,861:17756(0x7fe64a98d700):ZOO_INFO@check_events@1775: session establishment complete on server [10.10.40.138:2181], sessionId=0x157f59e05bd0052, negotiated timeout=10000
I1024 19:34:26.861804 17854 group.cpp:349] Group process (group(1)@10.10.40.138:46152) connected to ZooKeeper
I1024 19:34:26.861852 17854 group.cpp:837] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
I1024 19:34:26.861871 17854 group.cpp:427] Trying to create path '/mesos' in ZooKeeper
I1024 19:34:26.863454 17853 detector.cpp:152] Detected a new leader: (id='9')
I1024 19:34:26.863548 17851 group.cpp:706] Trying to get '/mesos/json.info_0000000009' in ZooKeeper
I1024 19:34:26.863922 17853 zookeeper.cpp:259] A new leading master (UPID=master@10.10.40.138:5050) is detected
I1024 19:34:26.863982 17851 sched.cpp:330] New master detected at master@10.10.40.138:5050
I1024 19:34:26.864578 17851 sched.cpp:341] No credentials provided. Attempting to register without authentication
I1024 19:34:26.865934 17847 sched.cpp:743] Framework registered with 33ea2954-5fd5-494e-b4ad-8f1cb77fde51-0043
16/10/24 19:34:26 INFO CoarseMesosSchedulerBackend: Registered as framework ID 33ea2954-5fd5-494e-b4ad-8f1cb77fde51-0043
16/10/24 19:34:26 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 43089.
16/10/24 19:34:26 INFO NettyBlockTransferService: Server created on 43089
16/10/24 19:34:26 INFO BlockManagerMaster: Trying to register BlockManager
16/10/24 19:34:26 INFO BlockManagerMasterEndpoint: Registering block manager 10.10.40.138:43089 with 511.1 MB RAM, BlockManagerId(driver, 10.10.40.138, 43089)
16/10/24 19:34:26 INFO BlockManagerMaster: Registered BlockManager
16/10/24 19:34:27 INFO CoarseMesosSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
16/10/24 19:34:27 INFO SparkContext: Starting job: reduce at SampleApp.scala:24
16/10/24 19:34:27 INFO DAGScheduler: Got job 0 (reduce at SampleApp.scala:24) with 2 output partitions
16/10/24 19:34:27 INFO DAGScheduler: Final stage: ResultStage 0 (reduce at SampleApp.scala:24)
16/10/24 19:34:27 INFO DAGScheduler: Parents of final stage: List()
16/10/24 19:34:27 INFO DAGScheduler: Missing parents: List()
16/10/24 19:34:27 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SampleApp.scala:20), which has no missing parents
16/10/24 19:34:27 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1856.0 B, free 1856.0 B)
16/10/24 19:34:27 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1210.0 B, free 3.0 KB)
16/10/24 19:34:27 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.10.40.138:43089 (size: 1210.0 B, free: 511.1 MB)
16/10/24 19:34:27 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
16/10/24 19:34:27 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SampleApp.scala:20)
16/10/24 19:34:27 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
16/10/24 19:34:29 INFO CoarseMesosSchedulerBackend: Mesos task 0 is now TASK_RUNNING
16/10/24 19:34:30 INFO CoarseMesosSchedulerBackend: Mesos task 1 is now TASK_RUNNING
16/10/24 19:34:31 INFO CoarseMesosSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (mesos103.itp.objectfrontier.com:37040) with ID 440de647-93a5-4474-80ce-b3b60f10a459-S4/0
16/10/24 19:34:31 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, mesos103.itp.objectfrontier.com, partition 0,PROCESS_LOCAL, 2296 bytes)
16/10/24 19:34:31 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, mesos103.itp.objectfrontier.com, partition 1,PROCESS_LOCAL, 2353 bytes)
16/10/24 19:34:31 INFO BlockManagerMasterEndpoint: Registering block manager mesos103.itp.objectfrontier.com:46711 with 511.1 MB RAM, BlockManagerId(440de647-93a5-4474-80ce-b3b60f10a459-S4/0, mesos103.itp.objectfrontier.com, 46711)
16/10/24 19:34:32 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on mesos103.itp.objectfrontier.com:46711 (size: 1210.0 B, free: 511.1 MB)
16/10/24 19:34:32 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 927 ms on mesos103.itp.objectfrontier.com (1/2)
16/10/24 19:34:32 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 911 ms on mesos103.itp.objectfrontier.com (2/2)
16/10/24 19:34:32 INFO DAGScheduler: ResultStage 0 (reduce at SampleApp.scala:24) finished in 5.326 s
16/10/24 19:34:32 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/10/24 19:34:32 INFO DAGScheduler: Job 0 finished: reduce at SampleApp.scala:24, took 5.535412 s
///////////////////// Pi is roughly 3.144004
//////////////////////////////////
1
//////////////////////////////// emr_data.count() ///////////////////////
16/10/24 19:34:32 INFO CoarseMesosSchedulerBackend: Registered executor NettyRpcEndpointRef(null) (mesos102.itp.objectfrontier.com:53090) with ID 440de647-93a5-4474-80ce-b3b60f10a459-S3/1
16/10/24 19:34:32 INFO BlockManagerMasterEndpoint: Registering block manager mesos102.itp.objectfrontier.com:38863 with 511.1 MB RAM, BlockManagerId(440de647-93a5-4474-80ce-b3b60f10a459-S3/1, mesos102.itp.objectfrontier.com, 38863)
16/10/24 19:34:33 INFO NettyUtil: Found Netty's native epoll transport in the classpath, using it
16/10/24 19:34:33 INFO Cluster: New Cassandra host /10.10.40.172:9042 added
16/10/24 19:34:33 INFO LocalNodeFirstLoadBalancingPolicy: Added host 10.10.40.172 (datacenter1)
16/10/24 19:34:33 INFO Cluster: New Cassandra host /10.10.40.138:9042 added
16/10/24 19:34:33 INFO Cluster: New Cassandra host /10.10.40.36:9042 added
16/10/24 19:34:33 INFO LocalNodeFirstLoadBalancingPolicy: Added host 10.10.40.36 (datacenter1)
16/10/24 19:34:33 INFO CassandraConnector: Connected to Cassandra cluster: HealthCare_Cluster_2
16/10/24 19:34:33 INFO SparkContext: Starting job: count at SampleApp.scala:34
16/10/24 19:34:33 INFO DAGScheduler: Got job 1 (count at SampleApp.scala:34) with 4 output partitions
16/10/24 19:34:33 INFO DAGScheduler: Final stage: ResultStage 1 (count at SampleApp.scala:34)
16/10/24 19:34:33 INFO DAGScheduler: Parents of final stage: List()
16/10/24 19:34:33 INFO DAGScheduler: Missing parents: List()
16/10/24 19:34:33 INFO DAGScheduler: Submitting ResultStage 1 (CassandraTableScanRDD[2] at RDD at CassandraRDD.scala:15), which has no missing parents
16/10/24 19:34:33 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 7.1 KB, free 10.1 KB)
16/10/24 19:34:33 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 3.7 KB, free 13.9 KB)
16/10/24 19:34:33 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 10.10.40.138:43089 (size: 3.7 KB, free: 511.1 MB)
16/10/24 19:34:33 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
16/10/24 19:34:33 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 1 (CassandraTableScanRDD[2] at RDD at CassandraRDD.scala:15)
16/10/24 19:34:33 INFO TaskSchedulerImpl: Adding task set 1.0 with 4 tasks
16/10/24 19:34:33 INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 2, mesos102.itp.objectfrontier.com, partition 1,NODE_LOCAL, 21516 bytes)
16/10/24 19:34:33 INFO TaskSetManager: Starting task 3.0 in stage 1.0 (TID 3, mesos103.itp.objectfrontier.com, partition 3,NODE_LOCAL, 5068 bytes)
16/10/24 19:34:33 INFO TaskSetManager: Starting task 2.0 in stage 1.0 (TID 4, mesos102.itp.objectfrontier.com, partition 2,NODE_LOCAL, 20263 bytes)
16/10/24 19:34:33 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on mesos103.itp.objectfrontier.com:46711 (size: 3.7 KB, free: 511.1 MB)
16/10/24 19:34:35 INFO TaskSetManager: Finished task 3.0 in stage 1.0 (TID 3) in 1806 ms on mesos103.itp.objectfrontier.com (1/4)
16/10/24 19:34:36 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 5, mesos102.itp.objectfrontier.com, partition 0,ANY, 21614 bytes)
16/10/24 19:34:40 INFO CassandraConnector: Disconnected from Cassandra cluster: HealthCare_Cluster_2
16/10/24 19:34:50 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on mesos102.itp.objectfrontier.com:38863 (size: 3.7 KB, free: 511.1 MB)
16/10/24 19:34:54 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 5) in 17293 ms on mesos102.itp.objectfrontier.com (2/4)
16/10/24 19:34:55 INFO TaskSetManager: Finished task 2.0 in stage 1.0 (TID 4) in 21459 ms on mesos102.itp.objectfrontier.com (3/4)
16/10/24 19:34:55 INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 2) in 21485 ms on mesos102.itp.objectfrontier.com (4/4)
16/10/24 19:34:55 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
16/10/24 19:34:55 INFO DAGScheduler: ResultStage 1 (count at SampleApp.scala:34) finished in 21.486 s
16/10/24 19:34:55 INFO DAGScheduler: Job 1 finished: count at SampleApp.scala:34, took 21.523341 s
100001
16/10/24 19:34:55 INFO ContextCleaner: Cleaned accumulator 1
16/10/24 19:34:55 INFO ContextCleaner: Cleaned accumulator 2
16/10/24 19:34:55 INFO BlockManagerInfo: Removed broadcast_1_piece0 on 10.10.40.138:43089 in memory (size: 3.7 KB, free: 511.1 MB)
16/10/24 19:34:55 INFO BlockManagerInfo: Removed broadcast_1_piece0 on mesos103.itp.objectfrontier.com:46711 in memory (size: 3.7 KB, free: 511.1 MB)
16/10/24 19:34:55 INFO BlockManagerInfo: Removed broadcast_1_piece0 on mesos102.itp.objectfrontier.com:38863 in memory (size: 3.7 KB, free: 511.1 MB)
16/10/24 19:34:55 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 10.10.40.138:43089 in memory (size: 1210.0 B, free: 511.1 MB)
16/10/24 19:34:55 INFO BlockManagerInfo: Removed broadcast_0_piece0 on mesos103.itp.objectfrontier.com:46711 in memory (size: 1210.0 B, free: 511.1 MB)
16/10/24 19:34:55 INFO Cluster: New Cassandra host mesos103.itp.objectfrontier.com/10.10.40.172:9042 added
16/10/24 19:34:55 INFO Cluster: New Cassandra host mesos101.itp.objectfrontier.com/10.10.40.138:9042 added
16/10/24 19:34:55 INFO Cluster: New Cassandra host mesos102.itp.objectfrontier.com/10.10.40.36:9042 added
16/10/24 19:34:55 INFO CassandraConnector: Connected to Cassandra cluster: HealthCare_Cluster_2
16/10/24 19:34:55 INFO SparkContext: Starting job: take at CassandraRDD.scala:121
16/10/24 19:34:55 INFO DAGScheduler: Got job 2 (take at CassandraRDD.scala:121) with 1 output partitions
16/10/24 19:34:55 INFO DAGScheduler: Final stage: ResultStage 2 (take at CassandraRDD.scala:121)
16/10/24 19:34:55 INFO DAGScheduler: Parents of final stage: List()
16/10/24 19:34:55 INFO DAGScheduler: Missing parents: List()
16/10/24 19:34:55 INFO DAGScheduler: Submitting ResultStage 2 (CassandraTableScanRDD[3] at RDD at CassandraRDD.scala:15), which has no missing parents
16/10/24 19:34:55 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 7.4 KB, free 7.4 KB)
16/10/24 19:34:55 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 3.9 KB, free 11.3 KB)
16/10/24 19:34:55 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 10.10.40.138:43089 (size: 3.9 KB, free: 511.1 MB)
16/10/24 19:34:55 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006
16/10/24 19:34:55 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 2 (CassandraTableScanRDD[3] at RDD at CassandraRDD.scala:15)
16/10/24 19:34:55 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks
16/10/24 19:34:55 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 6, mesos102.itp.objectfrontier.com, partition 0,ANY, 21614 bytes)
16/10/24 19:34:55 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on mesos102.itp.objectfrontier.com:38863 (size: 3.9 KB, free: 511.1 MB)
16/10/24 19:34:55 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 6) in 67 ms on mesos102.itp.objectfrontier.com (1/1)
16/10/24 19:34:55 INFO DAGScheduler: ResultStage 2 (take at CassandraRDD.scala:121) finished in 0.067 s
16/10/24 19:34:55 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool
16/10/24 19:34:55 INFO DAGScheduler: Job 2 finished: take at CassandraRDD.scala:121, took 0.074943 s
/////////////////////////// firstRow.size ///////////////////////////////
7
///////////////////////////////////////////// firstRow.getString(patientdateofbirth)////////
1951-05-29 00:05:27.617
16/10/24 19:34:55 INFO SparkUI: Stopped Spark web UI at http://10.10.40.138:4040
16/10/24 19:34:55 INFO CoarseMesosSchedulerBackend: Shutting down all executors
16/10/24 19:34:55 INFO CoarseMesosSchedulerBackend: Asking each executor to shut down
I1024 19:34:55.610491 17784 sched.cpp:1987] Asked to stop the driver
I1024 19:34:55.610574 17849 sched.cpp:1187] Stopping framework '33ea2954-5fd5-494e-b4ad-8f1cb77fde51-0043'
16/10/24 19:34:55 INFO CoarseMesosSchedulerBackend: driver.run() returned with code DRIVER_STOPPED
16/10/24 19:34:55 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/10/24 19:34:55 INFO MemoryStore: MemoryStore cleared
16/10/24 19:34:55 INFO BlockManager: BlockManager stopped
16/10/24 19:34:55 INFO BlockManagerMaster: BlockManagerMaster stopped
16/10/24 19:34:55 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/10/24 19:34:55 INFO SparkContext: Successfully stopped SparkContext
16/10/24 19:34:55 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/10/24 19:34:55 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/10/24 19:34:55 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
16/10/24 19:35:02 INFO CassandraConnector: Disconnected from Cassandra cluster: HealthCare_Cluster_2
16/10/24 19:35:03 INFO ShutdownHookManager: Shutdown hook called
16/10/24 19:35:03 INFO ShutdownHookManager: Deleting directory /tmp/spark-a9b2b831-26cb-4397-b51c-7ef257cb9d5b/httpd-d6f73cb0-1cf5-4bdb-9832-0387c365479c
16/10/24 19:35:03 INFO ShutdownHookManager: Deleting directory /tmp/spark-a9b2b831-26cb-4397-b51c-7ef257cb9d5b
16/10/24 19:35:03 INFO SerialShutdownHooks: Successfully executed shutdown hook: Clearing session cache for C* connector
[ksaha@mesos101 SampleApp]$