@philipsahli
Created March 1, 2018 11:27
```python
import findspark
findspark.init("/usr/local/Cellar/apache-spark/2.2.1/libexec/")
```
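The path above is hard-coded to one Homebrew install. A minimal sketch of a more portable variant, assuming the `SPARK_HOME` environment variable may be set on other machines; `resolve_spark_home` and `DEFAULT_SPARK_HOME` are hypothetical names introduced here, not part of findspark:

```python
import os

# Hypothetical fallback: the Homebrew path used in this notebook.
DEFAULT_SPARK_HOME = "/usr/local/Cellar/apache-spark/2.2.1/libexec/"


def resolve_spark_home(default=DEFAULT_SPARK_HOME):
    """Return SPARK_HOME from the environment, or the given default if unset or empty."""
    return os.environ.get("SPARK_HOME") or default


spark_home = resolve_spark_home()

# With findspark installed, the cell above would then read:
# import findspark
# findspark.init(spark_home)
```

The resolved path is passed to `findspark.init` exactly as the literal string is in the original cell.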
```python
sc
```
<div>
<p><b>SparkContext</b></p>
<p><a href="http://172.26.102.139:4040">Spark UI</a></p>
<dl>
<dt>Version</dt>
<dd><code>v2.2.1</code></dd>
<dt>Master</dt>
<dd><code>local[*]</code></dd>
<dt>AppName</dt>
<dd><code>myAppName</code></dd>
</dl>
</div>
```python
print(sc)
```
<SparkContext master=local[*] appName=myAppName>
```python
from pyspark.sql import SparkSession
spark = SparkSession(sc)
```
```python
df = spark.read.format("com.mongodb.spark.sql.DefaultSource").load()
```
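The `load()` call above takes no options, so it presumably relies on a connection URI configured when the Spark context was created (for example through `spark.mongodb.input.uri`). If that is not the case, the connector also accepts a `uri` read option. The sketch below builds such an option; the host, database, and collection names are placeholders, and `mongo_read_options` is a hypothetical helper, not part of the connector:

```python
def mongo_read_options(host, database, collection, port=27017):
    """Build the read options for the MongoDB Spark connector as a dict.

    The values passed in are placeholders; the notebook does not say which
    database or collection it reads from.
    """
    uri = "mongodb://{0}:{1}/{2}.{3}".format(host, port, database, collection)
    return {"uri": uri}


# Placeholder values, for illustration only.
options = mongo_read_options("127.0.0.1", "mydb", "mycollection")

# With a running SparkSession named `spark`, the read could then look like:
# df = (spark.read
#       .format("com.mongodb.spark.sql.DefaultSource")
#       .options(**options)
#       .load())
```

Keeping the URI in one helper makes it easy to point the same notebook at a different database without touching the read call.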
```python
df.printSchema()
```
root
 |-- _id: struct (nullable = true)
 |    |-- oid: string (nullable = true)
 |-- datetime: timestamp (nullable = true)
 |-- price: double (nullable = true)
 |-- symbol: string (nullable = true)
```python
print(df.count())
```
1405889