@zsxwing
Last active August 29, 2015 14:05
This example works.
scala> class Foo { def foo() = Array(1.0) }
defined class Foo
scala> var m: Array[Double] = null
m: Array[Double] = null
scala> {
     |   val t = new Foo
     |   m = t.foo
     | }
scala> val r1 = sc.parallelize(List(1, 2, 3))
r1: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:12
scala> val r2 = r1.map(_ + m(0))
r2: org.apache.spark.rdd.RDD[Double] = MappedRDD[1] at map at <console>:16
scala> r2.toArray
14/08/14 22:45:31 INFO SparkContext: Starting job: toArray at <console>:19
14/08/14 22:45:31 INFO DAGScheduler: Got job 0 (toArray at <console>:19) with 1 output partitions (allowLocal=false)
14/08/14 22:45:31 INFO DAGScheduler: Final stage: Stage 0 (toArray at <console>:19)
14/08/14 22:45:31 INFO DAGScheduler: Parents of final stage: List()
14/08/14 22:45:31 INFO DAGScheduler: Missing parents: List()
14/08/14 22:45:31 INFO DAGScheduler: Submitting Stage 0 (MappedRDD[1] at map at <console>:16), which has no missing parents
14/08/14 22:45:31 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 (MappedRDD[1] at map at <console>:16)
14/08/14 22:45:31 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
14/08/14 22:45:31 INFO TaskSetManager: Starting task 0.0:0 as TID 0 on executor localhost: localhost (PROCESS_LOCAL)
14/08/14 22:45:31 INFO TaskSetManager: Serialized task 0.0:0 as 1867 bytes in 4 ms
14/08/14 22:45:31 INFO Executor: Running task ID 0
14/08/14 22:45:31 INFO Executor: Serialized size of result for 0 is 532
14/08/14 22:45:31 INFO Executor: Sending result for 0 directly to driver
14/08/14 22:45:31 INFO Executor: Finished task ID 0
14/08/14 22:45:31 INFO TaskSetManager: Finished TID 0 in 157 ms on localhost (progress: 0/1)
14/08/14 22:45:31 INFO TaskSchedulerImpl: Remove TaskSet 0.0 from pool
14/08/14 22:45:31 INFO DAGScheduler: Completed ResultTask(0, 0)
14/08/14 22:45:31 INFO DAGScheduler: Stage 0 (toArray at <console>:19) finished in 0.164 s
14/08/14 22:45:31 INFO SparkContext: Job finished: toArray at <console>:19, took 0.25781 s
res1: Array[Double] = Array(2.0, 3.0, 4.0)