@soumitradev
Last active January 7, 2020 18:55
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# GIS: An ArchGDAL Overview\n",
"\n",
"![](https://upload.wikimedia.org/wikipedia/commons/4/4f/Kavraiskiy_VII_projection_SW.jpg)\n",
"\n",
"## 1.1 What are GIS Databases?\n",
"\n",
"GIS stands for Geographic Information System. GIS databases are databases that contain geographical data. For instance, a GIS database could contain location information about crimes and could help analyze previous crimes and prevent them in the future. Another example could be visualizing the outbreak of a disease, recognizing patterns in the data and preventing or limiting the spread of the disease.\n",
"\n",
"GIS databases also contain layers that can be accessed to collect different types of information for a given place. For example, we can visualize rainfall distribution, wind speeds, and the average temperature at a place and that could yield important meteorological results. All these different parameters would exist in different layers since they represent different types of data.\n",
"\n",
"GIS can give us deeper insights about issues that challenge the world, predict future events, help us understand trends and so much more. It truly is a versatile form of data that has a lot of potential.\n",
"\n",
"## 1.2 A Deeper Look\n",
"\n",
"GIS databases are of two types: vector and raster. Vector databases contain information about discrete objects such as lines, points, areas, labels, etc. Raster databases, on the other hand, contain continuous information such as heatmaps, distribution charts, imagery, etc.\n",
"\n",
"Vector databases are generally used to map and label things that have fixed boundaries and can be mapped discretely. This includes streets, states, buildings, etc.\n",
"\n",
"Raster databases are generally used to visualize distributions, satellite imagery, elevations, weather data, etc.\n",
"\n",
"## 1.3 Where does ArchGDAL come in?\n",
"\n",
"GDAL stands for Geospatial Data Abstraction Library, which is an open-source library that can process GIS data. Currently, the project is being maintained by OSGeo and is being updated frequently. It is released under the X/MIT license and is mainly written in C and C++.\n",
"\n",
"However, many translation libraries exist that can interface between a given programming language and GDAL. One example is ArchGDAL.jl which interfaces between GDAL and Julia. It is, in fact, a high-level API for GDAL and builds upon GDAL.jl.\n",
"\n",
"## 1.4 The Arch Principles\n",
"\n",
"ArchGDAL also follows the Arch principles while developing and maintaining their library. The principles are:\n",
"- **Simplicity**: without unnecessary additions or modifications. (i) Preserves GDAL Data Model, and makes available GDAL/OGR methods without trying to mask them from the user. (ii) minimal dependencies\n",
"\n",
"\n",
"- **Modernity**: ArchGDAL strives to maintain the latest stable release versions of GDAL as long as systemic package breakage can be reasonably avoided.\n",
"\n",
"\n",
"- **Pragmatism**: The principles here are only useful guidelines. Ultimately, design decisions are made on a case-by-case basis through developer consensus. Evidence-based technical analysis and debate are what matter, not politics or popular opinion.\n",
"\n",
"\n",
"- **User-Centrality**: Whereas other libraries attempt to be more user-friendly, ArchGDAL shall be user-centric. It is intended to fill the needs of those contributing to it, rather than trying to appeal to as many users as possible.\n",
"\n",
"\n",
"- **Versatility**: ArchGDAL will strive to remain small in its assumptions about the range of user needs, and to make it easy for users to build their own extensions/conveniences.\n",
"\n",
"\n",
"These principles are adapted from the original Arch Linux principles which can be found [here](https://wiki.archlinux.org/index.php/Arch_Linux#Principles).\n",
"\n",
"## 1.5 Installation\n",
"\n",
"To install ArchGDAL, just type:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import Pkg\n",
"\n",
"Pkg.add(\"ArchGDAL\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To test your installation, type:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"Pkg.test(\"ArchGDAL\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1.6 Getting Started\n",
"\n",
"Let us get started with working with databases in ArchGDAL.\n",
"We need to import ArchGDAL first to work with it:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"using ArchGDAL"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To initialize ArchGDAL, we must register our drivers. To do this, simply type:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ArchGDAL.registerdrivers() do\n",
" # your code here\n",
"end"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Or, you could do this in 2 lines:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"GDAL.allregister()\n",
"# your code here\n",
"GDAL.destroydrivermanager()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"However, I prefer the former and will use that method throughout this document.\n",
"\n",
"## 1.7 Working with GIS databases\n",
"\n",
"To start working with a database, we must read it into our script first.\n",
"So, after we add the `read()` function, our code will be:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ArchGDAL.registerdrivers() do\n",
" ArchGDAL.read(filepath) do dataset\n",
" print(dataset)\n",
" end\n",
"end"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Where `filepath` is the path to our database file.\n",
"\n",
"Now, it is important to note that vector databases are of type `.geojson` while Raster databases are of type `.tif`. It is also important to note that there are many types of file types other than geojson and tif that can be used with ArchGDAL such as `.shp`, `.shx`, but the example files on the examples repo for ArchGDAL only have geojson and tif files.\n",
"\n",
"Now, let us import a vector dataset. On the ArchGDAL.jl repo, an example dataset already exists; It can be found [here](https://github.com/yeesian/ArchGDALDatasets/blob/307f8f0e584a39a050c042849004e6a2bd674f99/data/point.geojson). I have placed the example dataset in the same folder as my Jupyter Notebook. To point to the dataset, I just typed the following above the do block:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"filepath = \"point.geojson\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, let us print some basic information about the database:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"using ArchGDAL\n",
"\n",
"filepath = \"point.geojson\"\n",
"\n",
"ArchGDAL.registerdrivers() do\n",
" ArchGDAL.read(filepath) do dataset\n",
" print(dataset)\n",
" end\n",
"end"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We get:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```\n",
"GDAL Dataset (Driver: GeoJSON/GeoJSON)\n",
"\n",
"File(s):\n",
" point.geojson\n",
"\n",
"Number of feature layers: 1\n",
" Layer 0: point (wkbPoint)\n",
" ```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We see that we get the type of the database, the files in our database, the number of layers in our database, and information about each layer. The programmatical way to access this information can be found in the [docs](http://yeesian.com/ArchGDAL.jl/latest/datasets.html#Vector-Datasets-1).\n",
"\n",
"Similarly, when we import the example raster database from [here](https://github.com/yeesian/ArchGDALDatasets/blob/307f8f0e584a39a050c042849004e6a2bd674f99/gdalworkshop/world.tif) and run the same script on it, we get the following output:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```\n",
"GDAL Dataset (Driver: GTiff/GeoTIFF)\n",
"\n",
"File(s):\n",
" world.tif\n",
" \n",
"Dataset (width x height): 2048 x 1024 (pixels)\n",
"\n",
"Number of raster bands: 3\n",
" [GA_ReadOnly] Band 1 (Red): 2048 x 1024 (UInt8)\n",
" [GA_ReadOnly] Band 2 (Green): 2048 x 1024 (UInt8)\n",
" [GA_ReadOnly] Band 3 (Blue): 2048 x 1024 (UInt8)\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We got the type of file, the files in our database, the resolution of our database, and the raster bands or layers of our database.\n",
"\n",
"Again, you can access this data programmatically and the instructions to do so are [here](http://yeesian.com/ArchGDAL.jl/latest/datasets.html#Raster-Datasets-1).\n",
"\n",
"## 1.8 Working with files\n",
"\n",
"ArchGDAL allows the following functions to work with files:\n",
"- `createcopy()`\n",
"- `update()`\n",
"- `create()`\n",
"- `read()`\n",
"\n",
"`createcopy()`: Creates a copy of the database passed as parameter.\n",
"`update()`: Opens a database with write permissions.\n",
"`create()`: Creates a new empty database.\n",
"\n",
"> **Note**: Sequential write formats (Like PNG and JPEG) do not support `create()`, but do implement `createcopy()`.\n",
"Also, some drivers only implement `create()`, but not `createcopy()`. In that case, `createcopy()` will be used as a mechanism when calling `create()`.\n",
"\n",
"`read()`: Opens the database in read-only mode.\n",
"\n",
"## 1.9 Accessing Data from Databases\n",
"\n",
"To access individual fields and features (such as lines, points, labels, etc.) in vector databases, we have some methods:\n",
"\n",
"`ArchGDAL.getgeomfield(feature, i)`: Gets the geometric field in the feature at index `i`.\n",
"\n",
"`ArchGDAL.getfield(feature, i)`: Gets the field in the feature at index `i`.\n",
"\n",
"`ArchGDAL.ngeomfield(feature)`: Gets the number of geometric fields in the `feature`.\n",
"\n",
"`ArchGDAL.nfield(feature)`: Gets the number of fields in the `feature`.\n",
"\n",
"In raster databases, we have many methods that get the attributes of the database. Going over all these methods is out of the scope of this introduction, and the docs for these methods can be found [here](http://yeesian.com/ArchGDAL.jl/latest/rasters.html).\n",
"\n",
"In conclusion, ArchGDAL is a very powerful tool and can be used for a variety of applications. This was only a brief exploration of the tool, and there is much more to this tool than just this. Check the repo out [here](https://github.com/yeesian/ArchGDAL.jl), and the docs for this tool [here](http://yeesian.com/ArchGDAL.jl/latest/).\n",
"\n",
"## 1.10 References\n",
"\n",
"Cover Image belongs to Strebe from Wikimedia Commons and is published under the [CC-BY-SA 3.0 license](https://creativecommons.org/licenses/by-sa/3.0/). The link to it is here: https://commons.wikimedia.org/wiki/File:Kavraiskiy_VII_projection_SW.jpg\n",
"\n",
"Introduction to GIS information is from ESRI: https://www.esri.com/en-us/what-is-gis/overview"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Julia 1.2.0",
"language": "julia",
"name": "julia-1.2"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.2.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
This is licensed under the CC BY 3.0 Unported license by Soumitra Shewale (soumitradev) 2019 - 2020
https://creativecommons.org/licenses/by/3.0/