@Reedef
Created June 30, 2015 16:43
Scala Notes
Scala NOTE
---
#### **Ways to run Spark**
- Interactive shell
    - `spark-shell` (Scala)
    - `pyspark` (Python)
- Running a script
    - `spark-submit user-script`
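
As a rough illustration of the script route, here is a minimal sketch of a standalone application that could be packaged and passed to `spark-submit`. The object name `SimpleApp` and the helper `sumTo` are hypothetical and not part of the original notes. The sketch assumes Spark is on the classpath and that the master URL is supplied at submit time.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical application; the names here are illustrative only.
object SimpleApp {
  // Sums the integers 1..n with Spark; a stand-in for real work.
  def sumTo(sc: SparkContext, n: Int): Long =
    sc.parallelize(1 to n).map(_.toLong).reduce(_ + _)

  def main(args: Array[String]): Unit = {
    // No master is set here: it is assumed to come from spark-submit (--master).
    val conf = new SparkConf().setAppName("SimpleApp")
    val sc = new SparkContext(conf)
    try {
      println(s"sum = ${sumTo(sc, 100)}")
    } finally {
      sc.stop() // release the resources held by the context
    }
  }
}
```

Once built into a jar, it would typically be launched with something like `spark-submit --class SimpleApp --master local[2] <path-to-jar>`, where the jar path depends on your build.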
----
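#### **The RDD dataset object**
**Creation**
- `sc.textFile("hdfs:///home/logset")`: a file on Hadoop HDFS
- `sc.textFile("file:///home/user")`: a local file; it must exist at the same path on every node, so use HDFS on a cluster
- `sc.parallelize(List(1,2,3,3))`: an in-memory collection in the driver program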
**RDD operations**
- [**Transformations**](https://spark.apache.org/docs/latest/programming-guide.html#transformations) return a new RDD (evaluated lazily)
- [**Actions**](https://spark.apache.org/docs/latest/programming-guide.html#actions) return a value to the driver
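
To make the distinction concrete, here is a minimal sketch that applies a few transformations and actions to the same `List(1,2,3,3)` used above. The object name `RddOpsSketch` and the `local[*]` master are assumptions chosen so it can run without a cluster. Inside `spark-shell` the body of `run` could be pasted directly, since `sc` already exists there.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical wrapper object; only the RDD calls themselves are Spark API.
object RddOpsSketch {
  // Returns (mapped, filtered, distinct sorted, count, sum).
  def run(sc: SparkContext): (List[Int], List[Int], List[Int], Long, Int) = {
    val rdd = sc.parallelize(List(1, 2, 3, 3))

    // Transformations: lazy, each returns a new RDD; nothing is computed yet.
    val plusOne = rdd.map(_ + 1)
    val notOne = rdd.filter(_ != 1)
    val unique = rdd.distinct()

    // Actions: trigger the computation and return values to the driver.
    (plusOne.collect().toList,       // List(2, 3, 4, 4)
     notOne.collect().toList,        // List(2, 3, 3)
     unique.collect().sorted.toList, // List(1, 2, 3); sorted because distinct does not guarantee order
     rdd.count(),                    // 4
     rdd.reduce(_ + _))              // 9
  }

  def main(args: Array[String]): Unit = {
    // local[*] runs Spark inside this JVM; assumed here so no cluster is needed.
    val conf = new SparkConf().setAppName("RddOpsSketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try println(run(sc)) finally sc.stop()
  }
}
```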
----
#### **References**
- [Running Spark in local mode](http://blog.javachen.com/2015/03/30/spark-test-in-local-mode.html)
> Written with [StackEdit](https://stackedit.io/).