This document describes the requirements for describing and persisting the styling of layers displayed in Minerva. Different kinds of data have different requirements, so it may be useful to define separate style objects per data type. For example, vector layers (points, lines, polygons) focus more on discrete styling based on categorical values, whereas gridded data is more focused on lookup tables over numeric values.
Vector features are designed around the GeoJSON specification. Each of these feature types has a geometry as well as a "properties" object, which can in principle be an arbitrary JSON object. For the purposes of this document, we assume that the properties object contains key-value pairs where each value is a fundamental type (number, string, boolean). This limitation makes it easier to parse the properties generically. In the future, it may be worthwhile to use the simplestyle standard described by https://github.com/mapbox/simplestyle-spec.
Points are meant to describe a single location on a layer. Other tools display this kind of data using icons like push pins, with a popup on click that displays a table of serialized properties. Other strategies include displaying each point as a circle with customizable radius, stroke, fill, and opacity. Preprocessing steps like hierarchical clustering or heatmaps are also often used when displaying large quantities of this kind of data.
styles:
- icon - discrete, enumeration
- radius - continuous, size
- fill - discrete, boolean
- fill color - continuous, color
- fill opacity - continuous, opacity
- stroke - discrete, boolean
- stroke width - continuous, size
- stroke color - continuous, color
- stroke opacity - continuous, opacity
Lines usually describe an open path; as such, they don't have any fill properties. Examples of lines include rivers, roads, and trails. These styles could be mapped per vertex on the line or held constant along each geometry.
styles:
- stroke - discrete, boolean
- stroke width - continuous, size
- stroke color - continuous, color
- stroke opacity - continuous, opacity
- stroke pattern - discrete, enumerated
Polygons describe an area or border as a closed path. They can have a single outer border along with zero or more internal borders (holes). Polygons describe objects like political borders and coastlines. As with lines, it is possible for styles to be mapped per vertex or even per border (interior and exterior).
styles:
- fill - discrete, boolean
- fill color - continuous, color
- fill opacity - continuous, opacity
- fill pattern - discrete, enumerated
- stroke - discrete, boolean
- stroke width - continuous, size
- stroke color - continuous, color
- stroke opacity - continuous, opacity
- stroke pattern - discrete, enumerated
Gridded data describes properties over a continuous area or region. This kind of data could be defined on a square grid (image), a structured grid (projected image), or even an unstructured grid (computational simulation). Values on the grid are given as one or more numeric values (channels), which may be enumerated or named. Typical datasets of this type include topographic data (height above sea level), satellite imagery, radar data, and climate simulations.
There are many complications unique to this kind of dataset. Gridded data often needs to be reprojected from its native projection into the display projection. This is a non-trivial operation that is usually better handled by a specialized library like GDAL. Gridded data is also typically much larger than vector datasets. Strategies for handling this usually involve server side processing like regridding or pyramidal tiling.
Usually an image with RGB data will be rendered directly as is; however, some amount of processing can also be done, including contouring and feature extraction.
A multispectral image will contain multiple bands of data, each holding intensities for a specific band of frequencies. Options for rendering this kind of dataset include direct mapping of channels to red, green, and blue. It is possible to generate false color renderings by creating color lookup tables that depend on one or more bands from the image. Derived products that are complicated functions of the input bands can also be created from these datasets. The simplest kind of derived quantity is a localized function operating on individual pixels. More complicated postprocessing involving non-local functions would best be handled on the server.
Geophysical simulations typically output multiple arrays of data (variables) on a well-defined grid at multiple time steps. On the server, these are often stored in a single, self-describing binary file (NetCDF, HDF, GRIB, etc.). The data can represent scalar quantities, fluxes, and vector fields in two or three dimensions. There are a huge number of options for rendering these datasets. In many cases, it is appropriate to generate vector features on the server. For very large datasets, it will be necessary to regrid to display scale for all but the simplest use cases.
For Minerva, it will be important to set a distinction between what is considered data processing and what is a rendering option. Data processing will probably result in the creation of a new dataset, while rendering options act on a dataset to produce a specific visual effect.
Because we wish to persist style information, the methods used to generate it should be serializable. We will define a style generator as a function that maps a feature property into a style property. Different types of style generators exist for each type of input and output, and each type requires different metadata to serialize itself.
A continuous property type will be defined as a set of values that can be interpolated continuously. For example, numeric values 0 and 1 could be interpolated to 0.5. Specifically, the set of all possible values should form a convex metric space where it is possible to add, subtract, and measure the distance between values.
A categorical property is generally a set of values for which there is no meaningful interpretation of "interpolation". The set can be ordered such as an ordinal ranking or unordered such as a place name.
Data properties are the metadata associated with a geometry in a vector dataset or a pixel in a gridded dataset. The data type of these properties determines the domain of the generators that act on them.
A numeric property could take many forms such as population counts, areas, temperatures, or ordinal rankings or categories. A numeric property may be continuous or categorical.
A property value provided as a string is often categorical, but it could also be a string representation of a different type such as a color or date. This provides a challenge for type introspection.
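The introspection problem can be illustrated with a small heuristic. The following is only a sketch, not a proposed implementation: the function name `guess_string_type`, the returned labels, and the order in which the checks run are all assumptions made for illustration, and it recognizes only six-digit hex colors and ISO 8601 dates.

```python
import re
from datetime import datetime

# Assumption: only six-digit hex colors such as '#012fed' are recognized.
HEX_COLOR = re.compile(r"^#[0-9a-fA-F]{6}$")


def guess_string_type(value: str) -> str:
    """Guess what a string property value represents (heuristic only)."""
    if HEX_COLOR.match(value):
        return "color"
    try:
        float(value)
        return "number"
    except ValueError:
        pass
    try:
        # Assumption: dates arrive in ISO 8601 form, e.g. '2015-06-01'.
        datetime.fromisoformat(value)
        return "date"
    except ValueError:
        pass
    # Anything unrecognized is treated as a categorical string.
    return "categorical"
```

A value such as "2015" shows why this is hard: the sketch reports it as a number, although it could just as well be a year or a category label, so any real implementation would probably need a way for the user to override the guess.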
Dates and times are usually provided as strings, but represent continuous values. Dates and times often require special handling for labeling, binning, and parsing.
Multidimensional numeric types could come in the form of velocities or colors. These share many of the properties of continuous numeric types, but lack a well-defined total order.
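As an illustration of how a multidimensional value can still be interpolated without being ordered, the sketch below mixes two colors component by component. The helper names `parse_hex` and `interpolate_color` are hypothetical, and interpolating in RGB is an assumption; other color spaces would give different intermediate colors.

```python
def parse_hex(color):
    """Split a '#rrggbb' string into a tuple of three integers in 0..255."""
    return tuple(int(color[i:i + 2], 16) for i in (1, 3, 5))


def interpolate_color(start, end, t):
    """Mix two '#rrggbb' colors; t=0 gives start and t=1 gives end."""
    a, b = parse_hex(start), parse_hex(end)
    # Interpolate each channel independently, then round to an integer.
    mixed = (round(x + t * (y - x)) for x, y in zip(a, b))
    return "#" + "".join(f"{c:02x}" for c in mixed)
```

For example, `interpolate_color("#000000", "#204060", 0.5)` returns `"#102030"`. Nothing in this operation requires deciding which of the two colors is "larger".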
The following data types are provided as values to style properties that exist on datasets. These properties are given per "marker" i.e. for each point in a point layer or each polygon in a polygon layer. These are the range of the style generator functions.
- color - A single color value given as a string (e.g. '#012fed')
- opacity - An opacity given as a number between 0 and 1
- boolean - Indicate presence or absence of a component
- width - Indicate the width of a component in pixels as a positive number
- enumeration - A choice between one or more prepopulated values
A generator is a serializable function similar to a d3 scale. Generators come in several different forms each taking varying numbers of attributes that determine the function that they compute.
All generators will contain the following attributes that are used by general utility methods.
This is the key used as the source for the generator in the data property object. For example, given GeoJSON features like this:
{
  "type": "Feature",
  "geometry": {},
  "properties": {
    "population": 10000
  }
}
A source attribute of "population" would scale the style by the population. Different source data types (like gridded data) might use this property differently. For example, it could represent a variable name in a multivariable dataset or a band in a multispectral image.
The target style attribute that the generator produces. This could be any value that is valid for the feature type. For example, a value of "fillColor" would cause the generator to set the fill color of a polygon feature.
A value of the output type of the generator that is used by default when no other rules apply.
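The three common attributes could be serialized as a plain JSON object. The sketch below is illustrative only: the key names "target" and "default" are assumptions (the text above names only the source attribute), and the helpers `source_value` and `apply_default` are hypothetical.

```python
import json

# Hypothetical serialized generator; "target" and "default" are assumed names.
generator = {"source": "population", "target": "fillColor", "default": "#cccccc"}

feature = {
    "type": "Feature",
    "geometry": None,  # geometry omitted for brevity
    "properties": {"population": 10000},
}


def source_value(feature, generator):
    """Return the property the generator reads, or None if it is missing."""
    return feature.get("properties", {}).get(generator["source"])


def apply_default(style, generator):
    """Return a copy of style with the target attribute set to the default."""
    updated = dict(style)
    updated[generator["target"]] = generator["default"]
    return updated


# The attributes are plain JSON values, so the generator survives a round trip.
restored = json.loads(json.dumps(generator))
```

Here `source_value(feature, generator)` yields 10000, and `apply_default({}, generator)` yields a style object with "fillColor" set to "#cccccc".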
These generators map a continuous domain into a continuous range. The domain should be totally ordered. Possible type signatures include the following:
domain:
- numeric
- date/time
range:
- opacity
- color
- width
A string describing the interpolation method between values. Defaults to "linear", which creates a piecewise linear generator. Other possible values include "power" and "log". Other types may add optional attributes to the schema; for example, "log" could add a "base" attribute that sets the base of the interpolating logarithm.
An array of two or more values from the domain of the generator. The values should be unique and ordered from minimum to maximum. For the typical use case, this will be a two element array containing the minimum and maximum value of the domain. Providing additional values will create a piecewise interpolating function.
An array of values from the range with a number of elements equal to the length of the domain attribute. These values are the range in the interpolating function.
A boolean value indicating whether values passed to the generator should be clamped to the extent of the domain (default true). Passing false may result in invalid values returned by the generator for certain transforms.
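A minimal sketch of how the domain, range, and clamping attributes might combine is shown below. It covers only linear interpolation over a numeric range (such as opacity or width). The function name `make_continuous_generator` and the key name "clamp" are assumptions, and the "power" and "log" methods are not implemented.

```python
from bisect import bisect_right


def make_continuous_generator(spec):
    """Build a piecewise linear function from a serialized description."""
    domain = spec["domain"]
    range_ = spec["range"]
    clamp = spec.get("clamp", True)  # assumed key name; defaults to true
    if len(domain) < 2 or len(domain) != len(range_):
        raise ValueError("domain and range need equal length of at least 2")

    def generate(value):
        if clamp:
            value = min(max(value, domain[0]), domain[-1])
        # Pick the segment containing the value; values outside the domain
        # extrapolate from the first or last segment.
        i = bisect_right(domain, value) - 1
        i = min(max(i, 0), len(domain) - 2)
        t = (value - domain[i]) / (domain[i + 1] - domain[i])
        return range_[i] + t * (range_[i + 1] - range_[i])

    return generate


opacity = make_continuous_generator({"domain": [0, 100], "range": [0.0, 1.0]})
unclamped = make_continuous_generator(
    {"domain": [0, 100], "range": [0.0, 1.0], "clamp": False}
)
```

With these definitions `opacity(150)` stays at 1.0, while `unclamped(150)` returns 1.5, which is not a valid opacity. This is the kind of invalid value the clamping attribute guards against.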
Categorical generators directly map values from the domain to the range without interpolation.
domain:
- string
- number (ordinal)
range:
- number
- color
- boolean
- enumeration
An array of one or more values from the domain of the generator. The position of an element in this array is mapped to the value in the range array in the same position.
An array of values from the range of the generator with length equal to the domain array.
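The positional mapping between the two arrays amounts to a lookup table. The sketch below assumes a hypothetical `make_categorical_generator` function, an assumed "default" key for values that are not listed in the domain, and made-up category names.

```python
def make_categorical_generator(spec):
    """Map each domain value to the range value at the same position."""
    if len(spec["domain"]) != len(spec["range"]):
        raise ValueError("domain and range must have the same length")
    lookup = dict(zip(spec["domain"], spec["range"]))
    default = spec.get("default")  # assumed key name

    def generate(value):
        # No interpolation: unknown values fall back to the default.
        return lookup.get(value, default)

    return generate


land_use_color = make_categorical_generator({
    "domain": ["residential", "commercial"],
    "range": ["#1f77b4", "#ff7f0e"],
    "default": "#cccccc",
})
```

Calling `land_use_color("commercial")` returns "#ff7f0e", and any value outside the domain, such as "industrial", returns the default "#cccccc".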
Generating discrete style attributes from continuous data can be viewed as a preprocessing step before executing a categorical generator. As such, the generator type derives from the categorical generator, adding an attribute that describes how to break the domain into categories.
This is an array of two or more values from the domain of the generator providing a series of "bins" defining the categorical partitioning. The first element of this array is the minimum value of the smallest category; the second value is the maximum value (not inclusive) of the smallest category and the minimum value of the next category.
This is an array of elements in the range of the generator of length one less than the length of the domain array. This provides the returned value of the corresponding category.
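The binning rule described above (minimum inclusive, maximum exclusive) can be sketched with a binary search over the bin edges. The function name `make_binned_generator` and the "default" key are assumptions, as is the choice to return the default for values that fall outside every bin.

```python
from bisect import bisect_right


def make_binned_generator(spec):
    """Map a continuous value to the style value of the bin containing it."""
    edges = spec["domain"]
    values = spec["range"]
    if len(edges) < 2 or len(values) != len(edges) - 1:
        raise ValueError("range must have one fewer element than domain")
    default = spec.get("default")  # assumed key name

    def generate(value):
        # Bin i covers edges[i] <= value < edges[i + 1].
        index = bisect_right(edges, value) - 1
        if 0 <= index < len(values):
            return values[index]
        return default

    return generate


stroke_width = make_binned_generator({
    "domain": [0, 1000, 10000],
    "range": [1, 3],
    "default": 0,
})
```

A population of 999 falls in the first bin and gets a width of 1, while exactly 1000 falls in the second bin and gets a width of 3. A value of 10000 lies on the exclusive upper edge of the last bin, so in this sketch it receives the default.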