@jpolchlo
Last active December 11, 2017 09:12
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# GeoTrellis and Jupyter Scala\n",
"\n",
"This notebook demonstrates the use of GeoTrellis from a Jupyter notebook via its native Scala library code. The setup is not an entirely straightforward translation of a standard project, owing to a few peculiarities introduced by the absence of the usual SBT (Scala Build Tool) environment.\n",
"\n",
"## Library Imports\n",
"\n",
"In lieu of a `build.sbt` file to define library dependencies, Ammonite (the underpinning of this Jupyter Scala kernel) introduces a custom `import` syntax that allows dependencies to be declared directly in our Scala scripts. Note that Java dependencies follow the `[org]:[pkg]:[version]` pattern, while Scala dependencies use `[org]::[pkg]:[version]`; the double colon appends the Scala binary version to the artifact name, much like `%%` in SBT."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import $exclude.`org.slf4j:slf4j-log4j12`, $ivy.`org.slf4j:slf4j-nop:1.7.21` // Quiets the logger\n",
"import $profile.`hadoop-2.6` // Selects the Hadoop 2.6 dependency profile\n",
"import $ivy.`org.apache.spark::spark-sql:2.1.0` // Scala dependency: note the double colon\n",
"import $ivy.`org.apache.hadoop:hadoop-aws:2.6.4` // Java dependency: single colon\n",
"import $ivy.`org.jupyter-scala::spark:0.4.2` // jupyter-scala's Spark integration"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When a dependency must be downloaded from a non-standard repository, we must declare a resolver. The following cell creates one for the LocationTech repository."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import ammonite._, Resolvers._\n",
"\n",
"val locationtech = Resolver.Http(\n",
" \"locationtech-releases\",\n",
" \"https://repo.locationtech.org/content/groups/releases\",\n",
" MavenPattern,\n",
" true // Declares whether the organization is dot- (false) or slash- (true) delimited\n",
")\n",
"\n",
"interp.resolvers() = interp.resolvers() :+ locationtech"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we are free to bring in the GeoTrellis library dependencies."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import $ivy.`org.locationtech.geotrellis::geotrellis-spark-etl:1.2.0-RC1`"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using Spark"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import org.apache.spark._\n",
"import org.apache.spark.rdd.RDD\n",
"import jupyter.spark.session._"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following somewhat unusual syntax is the `jupyter-scala`-specific method for creating a `SparkSession`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"val sparkSession = \n",
" JupyterSparkSession\n",
" .builder()\n",
" .jupyter() // Must be called immediately after builder()\n",
" .master(\"local[*]\")\n",
" .appName(\"testing\")\n",
" .getOrCreate()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"implicit val sc = sparkSession.sparkContext // Many GeoTrellis operations expect an implicit SparkContext"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sc.parallelize((1 to 100).toArray[Int]).reduce(_+_) // Sums 1 through 100; should return 5050"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using GeoTrellis"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import geotrellis.vector._\n",
"import geotrellis.raster._\n",
"import geotrellis.spark._\n",
"import geotrellis.spark.tiling._"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"val ex = Extent(0,0,3,3) // A 3x3 extent in arbitrary map coordinates"
]
},
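{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check (a hypothetical follow-up, assuming the standard `geotrellis.vector.Extent` accessors), we can inspect the extent's dimensions:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"(ex.width, ex.height) // Should be (3.0, 3.0) for the extent above"
]
},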
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"val tile = IntArrayTile.empty(40,10).map{ (x, y, _) => x/2 + y } // A 40x10 tile whose values increase rightward and downward"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tile.renderAscii(geotrellis.raster.render.ascii.AsciiArtEncoder.Palette.FILLED)"
]
},
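{
"cell_type": "markdown",
"metadata": {},
"source": [
"To summarize the tile's contents, the following sketch assumes `Tile`'s `findMinMax` method. Given the formula `x/2 + y` over a 40x10 tile, the minimum should be 0 and the maximum 39/2 + 9 = 28:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tile.findMinMax"
]
},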
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"val ld = LayoutDefinition(ex, TileLayout(3, 3, 4, 4)) // A 3x3 grid of 4x4 tiles covering `ex`\n",
"def tileAt(x: Int, y: Int) = { SpatialKey(x, y) -> IntArrayTile.fill(math.max(x, y), ld.tileCols, ld.tileRows) } // Constant-valued tile keyed by grid position\n",
"val tiles: RDD[(SpatialKey, Tile)] = sc.parallelize(for { x <- 0 to 2; y <- 0 to 2 } yield tileAt(x, y))"
]
},
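{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before stitching, a quick check with plain Spark: the 3x3 grid above should produce nine keyed tiles."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tiles.count // Should be 9"
]
},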
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import geotrellis.spark.stitch._"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"tiles.stitch.renderAscii(geotrellis.raster.render.ascii.AsciiArtEncoder.Palette.FILLED)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Scala",
"language": "scala",
"name": "scala"
},
"language_info": {
"codemirror_mode": "text/x-scala",
"file_extension": ".scala",
"mimetype": "text/x-scala",
"name": "scala211",
"nbconvert_exporter": "script",
"pygments_lexer": "scala",
"version": "2.11.11"
}
},
"nbformat": 4,
"nbformat_minor": 2
}