@jbeezley
Last active August 22, 2016 18:30

Persisting layer style

This document describes the requirements for describing and persisting the styling of layers displayed in Minerva. Different kinds of data have different requirements, so it may be useful to define separate style objects per data type. For example, vector layers (points, lines, polygons) focus on discrete styling based on categorical values, whereas gridded data relies more on lookup tables over numeric values.

Vector features

Vector features are designed around the GeoJSON specification. Each feature type has a geometry as well as a "properties" object, which can in principle be an arbitrary JSON object. For the purposes of this document, we assume that the properties object contains key-value pairs where each value is a fundamental type (number, string, boolean). This limitation makes it easier to parse the properties generically. In the future, it may be worthwhile to use the simplestyle standard described by https://github.com/mapbox/simplestyle-spec.

points

Points describe a single location on a layer. Other tools display this kind of data using icons such as push pins, with a popup on click that displays a table of serialized properties. Another common strategy is to display each point as a circle with customizable radius, stroke, fill, and opacity. Preprocessing steps like hierarchical clustering or heatmaps are also often used when displaying large quantities of this kind of data.

styles:

  • icon - discrete, enumeration
  • radius - continuous, size
  • fill - discrete, boolean
  • fill color - continuous, color
  • fill opacity - continuous, opacity
  • stroke - discrete, boolean
  • stroke width - continuous, size
  • stroke color - continuous, color
  • stroke opacity - continuous, opacity

lines

Lines usually describe an open path; as such, they don't have any fill properties. Examples of lines include rivers, roads, and trails. These styles could be mapped per vertex along the line or held constant along each geometry.

styles:

  • stroke - discrete, boolean
  • stroke width - continuous, size
  • stroke color - continuous, color
  • stroke opacity - continuous, opacity
  • stroke pattern - discrete, enumerated

polygons

Polygons describe an area or border as a closed path. They can have a single outer border along with zero or more internal borders (holes). Polygons describe objects like political borders and coastlines. As with lines, it is possible for styles to be mapped per vertex or even per border (interior and exterior).

styles:

  • fill - discrete, boolean
  • fill color - continuous, color
  • fill opacity - continuous, opacity
  • fill pattern - discrete, enumerated
  • stroke - discrete, boolean
  • stroke width - continuous, size
  • stroke color - continuous, color
  • stroke opacity - continuous, opacity
  • stroke pattern - discrete, enumerated
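The three style lists above could be collected into a single lookup table, which a generator's target could later be checked against. This is only a sketch: the camelCase attribute names (other than fillColor, which appears later in this document), the feature type keys, and the helper is_valid_target are assumptions, and "size" is mapped onto the width value type.

```python
# Style attributes shared by several feature types.
# Attribute names are hypothetical; only the categories come from the lists above.
STROKE = {
    'stroke': 'boolean',
    'strokeWidth': 'width',
    'strokeColor': 'color',
    'strokeOpacity': 'opacity',
}
FILL = {
    'fill': 'boolean',
    'fillColor': 'color',
    'fillOpacity': 'opacity',
}

# Valid style targets per feature type, mapped to their value type.
STYLE_TARGETS = {
    'point': {'icon': 'enumeration', 'radius': 'width', **FILL, **STROKE},
    'line': {**STROKE, 'strokePattern': 'enumeration'},
    'polygon': {
        **FILL, 'fillPattern': 'enumeration',
        **STROKE, 'strokePattern': 'enumeration',
    },
}


def is_valid_target(feature_type, target):
    """Return True if the style attribute exists for the feature type."""
    return target in STYLE_TARGETS.get(feature_type, {})
```

With this table, is_valid_target('line', 'fillColor') is false because lines have no fill properties.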

Gridded features

Gridded data describes properties over a continuous area or region. This kind of data could be defined on a square grid (image), a structured grid (projected image), or even an unstructured grid (computational simulation). Values on the grid are given as one or more numeric values (channels) which may be enumerated or named. Typical datasets of this type include topographic data (height above sea level), satellite imagery, radar data, and climate simulations.

There are many complications unique to this kind of dataset. Gridded data often needs to be reprojected from its native projection into the display projection. This is a non-trivial operation that is usually better handled by a specialized library like GDAL. Gridded data is also typically much larger than vector datasets. Strategies for handling this usually involve server side processing like regridding or pyramidal tiling.

RGB images

Usually an image with RGB data will be rendered directly as is; however, some amount of processing can also be done, including contouring and feature extraction.

Multispectral images

A multispectral image contains multiple bands of data, each holding intensities for a specific band of frequencies. Options for rendering this kind of dataset include direct mapping of channels to red, green, and blue. It is possible to generate false color renderings by creating color lookup tables depending on one or more bands from the image. Derived products can also be created from these datasets that are complicated functions of the input bands. The simplest kind of derived quantity is a localized function operating on individual pixels. More complicated postprocessing involving non-local functions would best be handled on the server.

Computational simulations

Geophysical simulations typically output multiple arrays of data (variables) on a well-defined grid at multiple time steps. Server side, these are often stored in a single, self-describing binary file (NetCDF, HDF, GRIB, etc.). The data can represent scalar quantities, fluxes, and vector fields in two or three dimensions. There are huge numbers of options for rendering these datasets. In many cases, it is appropriate to generate vector features server side. For very large datasets, it will be necessary to regrid to display scale for all but the simplest use cases.

For Minerva, it will be important to set a distinction between what is considered data processing and what is a rendering option. Data processing will probably result in the creation of a new dataset, while rendering options act on a dataset to produce a specific visual effect.

Data types

Because we wish to persist style information, the methods used to generate it should be serializable. We will define a style generator as a function that maps a feature property into a style property. Different types of style generators exist for each type of input and output, and each type requires different metadata to serialize itself.

Continuous values

A continuous property type will be defined as a set of values that can be interpolated continuously. For example, numeric values 0 and 1 could be interpolated to 0.5. Specifically, the set of all possible values should form a convex metric space where it is possible to add, subtract, and measure the distance between values.
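As a minimal sketch of what "interpolated continuously" means in practice, the helpers below interpolate numbers and, channel by channel, hex colors. The function names are hypothetical, and interpolating colors in RGB space is an assumption; this document does not specify a color space.

```python
def lerp(a, b, t):
    """Linearly interpolate between numbers a and b for t in [0, 1]."""
    return a + (b - a) * t


def hex_to_rgb(color):
    """Convert '#rrggbb' into a tuple of three integers in 0..255."""
    return tuple(int(color[i:i + 2], 16) for i in (1, 3, 5))


def rgb_to_hex(rgb):
    """Convert three integers in 0..255 into '#rrggbb'."""
    return '#{:02x}{:02x}{:02x}'.format(*rgb)


def lerp_color(c0, c1, t):
    """Interpolate two hex colors channel by channel in RGB space."""
    rgb0, rgb1 = hex_to_rgb(c0), hex_to_rgb(c1)
    mixed = tuple(round(lerp(x, y, t)) for x, y in zip(rgb0, rgb1))
    return rgb_to_hex(mixed)
```

Categorical values, described next, have no equivalent of lerp.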

Categorical values

A categorical property is generally a set of values for which there is no meaningful interpretation of "interpolation". The set can be ordered such as an ordinal ranking or unordered such as a place name.

Data types

Data properties are the metadata associated with a geometry in a vector dataset or a pixel in a gridded dataset. The data type of these properties determines the domain of the generators that act on them.

numeric

A numeric property could take many forms such as population counts, areas, temperatures, or ordinal rankings or categories. A numeric property may be continuous or categorical.

string

A property value provided as a string is often categorical, but it could also be a string representation of a different type such as a color or date. This provides a challenge for type introspection.
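The introspection problem can be illustrated with a small heuristic. This is a sketch only: the function name, the set of recognized types, the order of the checks, and the single date format are all assumptions chosen for illustration.

```python
import re
from datetime import datetime

# Matches six-digit hex colors such as '#012fed'.
HEX_COLOR = re.compile(r'^#[0-9a-fA-F]{6}$')


def guess_string_type(value):
    """Guess whether a string holds a color, number, date, or plain text."""
    if HEX_COLOR.match(value):
        return 'color'
    try:
        float(value)
        return 'number'
    except ValueError:
        pass
    try:
        # Only the YYYY-MM-DD form is recognized in this sketch.
        datetime.strptime(value, '%Y-%m-%d')
        return 'date'
    except ValueError:
        return 'string'
```

Heuristics like this are fragile. For instance, Python's float() accepts 'nan' in any letter case, so a place name such as 'Nan' would be classified as a number here; an explicit type hint stored with the style may be more robust.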

date/time

Dates and times are usually provided as strings, but represent continuous values. Dates and times often require special handling for labeling, binning, and parsing.
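One way to treat date strings as continuous values is to convert them to numbers before handing them to a generator. The sketch below assumes ISO 8601 input and treats strings without a timezone as UTC; both are assumptions, as is the helper name.

```python
from datetime import datetime, timezone


def date_to_number(text):
    """Parse an ISO 8601 date or datetime into seconds since the epoch."""
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumption: naive timestamps are interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.timestamp()
```

Once converted, two dates can be interpolated or binned like any other numeric value, although labels still need to be formatted back into dates.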

vector

Multidimensional numeric types could come in the form of velocities or colors. These share many of the properties of continuous numeric types, but lack a well-defined total order.

style value types

The following data types are provided as values to style properties that exist on datasets. These properties are given per "marker" i.e. for each point in a point layer or each polygon in a polygon layer. These are the range of the style generator functions.

  • color - A single color value given as a string (e.g. '#012fed')
  • opacity - An opacity given as a number between 0 and 1
  • boolean - Indicate presence or absence of a component
  • width - Indicate the width of a component in pixels as a positive number
  • enumeration - A choice between one or more prepopulated values
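Because these value types form the range of every generator, a persisted style could be checked against them when it is loaded. The validator below is a sketch under assumptions: the function name and the choices parameter are hypothetical, and colors are restricted to six-digit hex strings, which this document shows as an example but does not mandate.

```python
import re

# Assumption: colors are six-digit hex strings such as '#012fed'.
HEX_COLOR = re.compile(r'^#[0-9a-fA-F]{6}$')


def _is_number(value):
    """True for ints and floats, but not for booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_valid_style_value(kind, value, choices=()):
    """Check a value against one of the style value types listed above."""
    if kind == 'color':
        return isinstance(value, str) and HEX_COLOR.match(value) is not None
    if kind == 'opacity':
        return _is_number(value) and 0 <= value <= 1
    if kind == 'boolean':
        return isinstance(value, bool)
    if kind == 'width':
        return _is_number(value) and value > 0
    if kind == 'enumeration':
        return value in choices
    raise ValueError('unknown style value type: {}'.format(kind))
```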

Style generators

A generator is a serializable function similar to a d3 scale. Generators come in several different forms each taking varying numbers of attributes that determine the function that they compute.

common attributes

All generators will contain the following attributes that are used by general utility methods.

source

This is the key used as the source for the generator in the data property object. For example, given GeoJSON features like this:

{
  "type": "Feature",
  "geometry": {},
  "properties": {
    "population": 10000
  }
}

A source attribute of population would scale the style by the population. Different source data types (like gridded data) might use this property differently. For example, it could represent a variable name in a multivariable dataset or band in a multispectral image.

target

The target style attribute that the generator produces. This could be any value that is valid for the feature type. For example, a value of fillColor would cause the generator to set the fill color of a polygon feature.

default

A value of the output type of the generator that is used by default when no other rules apply.
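Taken together, the common attributes might be persisted as a plain JSON object. The sketch below shows a spec surviving a JSON round trip and a helper that reads the source key from a feature; the helper name and the concrete default color are illustrative assumptions.

```python
import json

# A generator spec containing only the common attributes.
spec = {
    'source': 'population',
    'target': 'fillColor',
    'default': '#cccccc',  # hypothetical fallback color
}


def source_value(spec, feature):
    """Look up the generator's source key in a feature's properties."""
    return feature.get('properties', {}).get(spec['source'])


feature = {
    'type': 'Feature',
    'geometry': None,
    'properties': {'population': 10000},
}

# The spec is plain data, so it survives serialization unchanged.
restored = json.loads(json.dumps(spec))
```

Here source_value(restored, feature) returns 10000, and a feature without a population property yields None, in which case the default would apply.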

continuous -> continuous

These generators map a continuous domain into a continuous range. The domain should be totally ordered. Possible type signatures include the following:

domain:

  • numeric
  • date/time

range:

  • opacity
  • color
  • width

transform

A string describing the interpolation method between values. Defaults to linear, which creates a piecewise linear generator. Other possible values include power and log. Other types may add optional attributes to the schema; for example, log could add a base attribute that sets the base of the interpolating logarithm.

domain

An array of two or more values from the domain of the generator. The values should be unique and ordered from minimum to maximum. For the typical use case, this will be a two element array containing the minimum and maximum value of the domain. Providing additional values will create a piecewise interpolating function.

range

An array of values from the range with a number of elements equal to the length of the domain attribute. These values are the range in the interpolating function.

clamp

A boolean value indicating whether values passed to the generator should be clamped to the extent of the domain (default true). Passing false may result in invalid values returned by the generator for certain transforms.
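The attributes above could be turned into a callable roughly as follows. This is a sketch rather than a proposed implementation: make_continuous is a hypothetical name, only numeric domains and ranges are handled (colors would need channel-wise interpolation), only the linear and log transforms are covered, and the default base of 10 is an assumption.

```python
import bisect
import math


def make_continuous(spec):
    """Build a numeric generator from a serialized continuous spec."""
    domain, rng = spec['domain'], spec['range']
    if len(domain) < 2 or len(domain) != len(rng):
        raise ValueError('domain and range need the same length (>= 2)')
    clamp = spec.get('clamp', True)
    transform = spec.get('transform', 'linear')
    base = spec.get('base', 10)  # assumed default for the log transform

    def warp(x):
        """Apply the transform before interpolating."""
        if transform == 'linear':
            return x
        if transform == 'log':
            return math.log(x, base)
        raise ValueError('unsupported transform: ' + transform)

    def generator(value):
        if value is None:
            return spec.get('default')
        if clamp:
            value = min(max(value, domain[0]), domain[-1])
        # Find the segment holding the value; the end segments
        # extrapolate when clamping is disabled.
        i = bisect.bisect_right(domain, value) - 1
        i = min(max(i, 0), len(domain) - 2)
        lo, hi = warp(domain[i]), warp(domain[i + 1])
        t = (warp(value) - lo) / (hi - lo)
        return rng[i] + (rng[i + 1] - rng[i]) * t

    return generator


opacity = make_continuous({
    'source': 'population', 'target': 'fillOpacity', 'default': 0.5,
    'domain': [0, 10000], 'range': [0.0, 1.0],
})
```

With clamping enabled, opacity(20000) stays at 1.0. With clamp set to false, the same call would extrapolate past the valid opacity range, and a log transform would raise an error for non-positive inputs, which is the kind of invalid result the clamp attribute guards against.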

categorical -> discrete

Categorical generators directly map values from the domain to the range without interpolation.

domain:

  • string
  • number (ordinal)

range:

  • number
  • color
  • boolean
  • enumeration

domain

An array of one or more values from the domain of the generator. The position of an element in this array is mapped to the value in the range array in the same position.

range

An array of values from the range of the generator with length equal to the domain array.
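Since the domain and range arrays are matched by position, a categorical generator reduces to a lookup table. In the sketch below, make_categorical is a hypothetical name, and falling back to the default attribute for unlisted values is an assumption based on the description of default above.

```python
def make_categorical(spec):
    """Build a lookup generator from a serialized categorical spec."""
    domain, rng = spec['domain'], spec['range']
    if len(domain) != len(rng):
        raise ValueError('domain and range must have the same length')
    table = dict(zip(domain, rng))
    default = spec.get('default')

    def generator(value):
        # Values missing from the domain fall back to the default.
        return table.get(value, default)

    return generator


road_color = make_categorical({
    'source': 'kind', 'target': 'strokeColor', 'default': '#888888',
    'domain': ['highway', 'street', 'trail'],
    'range': ['#ff0000', '#000000', '#00aa00'],
})
```

Here road_color('trail') gives '#00aa00', while an unlisted value such as 'canal' gives the default '#888888'.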

continuous -> discrete

Generating discrete style attributes from continuous data can be viewed as a preprocessing step before executing a categorical generator. As such, the generator type derives from the categorical generator, adding an attribute that describes how to break the domain into categories.

domain

This is an array of two or more values from the domain of the generator providing a series of "bins" defining the categorical partitioning. The first element of this array is the minimum value of the smallest category; the second value is the maximum value (not inclusive) of the smallest category and the minimum value of the next category.

range

This is an array of elements in the range of the generator of length one less than the length of the domain array. This provides the returned value of the corresponding category.
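The binning described above can be sketched with a binary search over the bin edges. The name make_binned is hypothetical, and two behaviors are assumptions not stated in this document: the upper edge of the last bin is treated as exclusive, like the others, and values outside all bins return the default.

```python
import bisect


def make_binned(spec):
    """Build a generator that bins continuous values into categories."""
    edges, rng = spec['domain'], spec['range']
    if len(edges) < 2 or len(rng) != len(edges) - 1:
        raise ValueError('range must have one element fewer than domain')
    default = spec.get('default')

    def generator(value):
        # Assumption: values outside [edges[0], edges[-1]) use the default.
        if value is None or value < edges[0] or value >= edges[-1]:
            return default
        # Each bin includes its lower edge and excludes its upper edge.
        return rng[bisect.bisect_right(edges, value) - 1]

    return generator


size_class = make_binned({
    'source': 'population', 'target': 'icon', 'default': 'unknown',
    'domain': [0, 1000, 100000],
    'range': ['village', 'town'],
})
```

A population of exactly 1000 falls into the second bin, so size_class(1000) gives 'town'.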
