{{ message }}

Instantly share code, notes, and snippets.

# meggart/gist:e29e6381d9400ff789eefbccc109d6f9

Last active Oct 7, 2019
DimensionalData interface proposal.md
```using DimensionalData
using GeoData```
```# Functions to be extended
"""
gridcoordinates(::Gridtype,x)

Returns an iterator over the coordinates of a given grid with the same shape as `x`
"""
gridcoordinates(x) = gridcoordinates(hasgrid(x),x)

"""
gridcoordinate(::Gridtype,x,i...)

Returns the coordinate associated with the value of `x[i...]`
"""
gridcoordinate(x,i...) = gridcoordinate(hasgrid(x),x,i...)

"""
gridbounds(::Gridtype,x)

Returns the boundaries of every grid cell of `x`. The returned value will have the size `size(x) .+ 1`
"""
gridbounds(x) = gridbounds(hasgrid(x),x)

"""
gridbound(::Gridtype,x,i...)

For the value at index `i` returns the upper and lower bound of the grid cell for each dimension.
"""
gridbound(x,i...) = gridbound(hasgrid(x),x,i...)

"""
dimnames(::Gridtype, x)

Returns the names of the dimensions of x.
"""
dimnames(x) = dimnames(hasgrid(x),x)

"""
dimname(::Gridtype, x, i)

Returns the name of the ith dimension of x.
"""
dimname(x,i) = dimname(hasgrid(x),x,i)```
``````dimname
``````
```abstract type RegularGrid{T} end

"""
RegularProductGrid

Trait describing a regular grid along all axes.
To implement the trait, a dims() function is required, returning a list of
dimension objects for each axis, which provide a `name` and a `vals` method.
"""
struct RegularProductGrid{T} <: RegularGrid{T} end

"""
Center

To be used as a type parameter to described grids that are defined by their center coordinates.
"""
struct Center end

# Implement the trait functions for regular grids where center coordinates are provided
gridcoordinates(::RegularGrid{Center},x) = Iterators.product(map(val,dims(x))...)
gridcoordinate(::RegularGrid{Center},x,i...) = map((d,j)->val(d)[j],dims(x),i)

gridbounds(::RegularGrid{Center},x) = Iterators.product(map(i->range(first(i)-step(i)/2,last(i)+step(i)/2,length=length(i)+1),val.(dims(x)))...)
gridbound(::RegularGrid{Center},x,i...) = map((d,j)->begin v = val(d);(v[j]-step(v)/2,v[j]+step(v)/2) end,dims(x),i);

dimnames(::RegularGrid{Center},x) = map(DimensionalData.name,dims(x))
dimname(::RegularGrid{Center},x,i) = DimensionalData.name(dims(x,i))```
``````dimname (generic function with 2 methods)
``````
```# Implemetation for DimensionalData is very simple
hasgrid(::DimensionalArray) = RegularProductGrid{Center}()```
``````hasgrid (generic function with 1 method)
``````
```#Fallback for regular arrays
struct UnknownGrid <: RegularGrid{Center} end
#One might think about using Base.axes here to support things like OffsetArrays etc...
hasgrid(::AbstractArray) = UnknownGrid()

struct UnknownDimension{I}
length::Int
end
DimensionalData.val(x::UnknownDimension) = range(0.5,x.length-0.5,length=x.length)
DimensionalData.name(x::UnknownDimension{I}) where I = string("x",I)
DimensionalData.dims(x::AbstractArray) = ntuple(i->UnknownDimension{i}(size(x,i)),ndims(x))```
```# Test this with a DimensionalData array
arrayreg = DimensionalArray(rand(20,10),(Lon(20:29),Lat(49:-1:40)));
gridcoordinates(arrayreg) |> collect```
``````20×10 Array{Tuple{Float64,Int64},2}:
(20.0, 49)     (20.0, 48)     (20.0, 47)     …  (20.0, 41)     (20.0, 40)
(20.4737, 49)  (20.4737, 48)  (20.4737, 47)     (20.4737, 41)  (20.4737, 40)
(20.9474, 49)  (20.9474, 48)  (20.9474, 47)     (20.9474, 41)  (20.9474, 40)
(21.4211, 49)  (21.4211, 48)  (21.4211, 47)     (21.4211, 41)  (21.4211, 40)
(21.8947, 49)  (21.8947, 48)  (21.8947, 47)     (21.8947, 41)  (21.8947, 40)
(22.3684, 49)  (22.3684, 48)  (22.3684, 47)  …  (22.3684, 41)  (22.3684, 40)
(22.8421, 49)  (22.8421, 48)  (22.8421, 47)     (22.8421, 41)  (22.8421, 40)
(23.3158, 49)  (23.3158, 48)  (23.3158, 47)     (23.3158, 41)  (23.3158, 40)
(23.7895, 49)  (23.7895, 48)  (23.7895, 47)     (23.7895, 41)  (23.7895, 40)
(24.2632, 49)  (24.2632, 48)  (24.2632, 47)     (24.2632, 41)  (24.2632, 40)
(24.7368, 49)  (24.7368, 48)  (24.7368, 47)  …  (24.7368, 41)  (24.7368, 40)
(25.2105, 49)  (25.2105, 48)  (25.2105, 47)     (25.2105, 41)  (25.2105, 40)
(25.6842, 49)  (25.6842, 48)  (25.6842, 47)     (25.6842, 41)  (25.6842, 40)
(26.1579, 49)  (26.1579, 48)  (26.1579, 47)     (26.1579, 41)  (26.1579, 40)
(26.6316, 49)  (26.6316, 48)  (26.6316, 47)     (26.6316, 41)  (26.6316, 40)
(27.1053, 49)  (27.1053, 48)  (27.1053, 47)  …  (27.1053, 41)  (27.1053, 40)
(27.5789, 49)  (27.5789, 48)  (27.5789, 47)     (27.5789, 41)  (27.5789, 40)
(28.0526, 49)  (28.0526, 48)  (28.0526, 47)     (28.0526, 41)  (28.0526, 40)
(28.5263, 49)  (28.5263, 48)  (28.5263, 47)     (28.5263, 41)  (28.5263, 40)
(29.0, 49)     (29.0, 48)     (29.0, 47)        (29.0, 41)     (29.0, 40)
``````
`dimname(arrayreg,1)`
``````"Longitude"
``````
`gridcoordinate(arrayreg,1,1)`
``````(20.0, 49)
``````
```# We can access the boundaries of each grid cell as well
gridbounds(arrayreg) |> collect```
``````21×11 Array{Tuple{Float64,Float64},2}:
(19.7632, 49.5)  (19.7632, 48.5)  …  (19.7632, 40.5)  (19.7632, 39.5)
(20.2368, 49.5)  (20.2368, 48.5)     (20.2368, 40.5)  (20.2368, 39.5)
(20.7105, 49.5)  (20.7105, 48.5)     (20.7105, 40.5)  (20.7105, 39.5)
(21.1842, 49.5)  (21.1842, 48.5)     (21.1842, 40.5)  (21.1842, 39.5)
(21.6579, 49.5)  (21.6579, 48.5)     (21.6579, 40.5)  (21.6579, 39.5)
(22.1316, 49.5)  (22.1316, 48.5)  …  (22.1316, 40.5)  (22.1316, 39.5)
(22.6053, 49.5)  (22.6053, 48.5)     (22.6053, 40.5)  (22.6053, 39.5)
(23.0789, 49.5)  (23.0789, 48.5)     (23.0789, 40.5)  (23.0789, 39.5)
(23.5526, 49.5)  (23.5526, 48.5)     (23.5526, 40.5)  (23.5526, 39.5)
(24.0263, 49.5)  (24.0263, 48.5)     (24.0263, 40.5)  (24.0263, 39.5)
(24.5, 49.5)     (24.5, 48.5)     …  (24.5, 40.5)     (24.5, 39.5)
(24.9737, 49.5)  (24.9737, 48.5)     (24.9737, 40.5)  (24.9737, 39.5)
(25.4474, 49.5)  (25.4474, 48.5)     (25.4474, 40.5)  (25.4474, 39.5)
(25.9211, 49.5)  (25.9211, 48.5)     (25.9211, 40.5)  (25.9211, 39.5)
(26.3947, 49.5)  (26.3947, 48.5)     (26.3947, 40.5)  (26.3947, 39.5)
(26.8684, 49.5)  (26.8684, 48.5)  …  (26.8684, 40.5)  (26.8684, 39.5)
(27.3421, 49.5)  (27.3421, 48.5)     (27.3421, 40.5)  (27.3421, 39.5)
(27.8158, 49.5)  (27.8158, 48.5)     (27.8158, 40.5)  (27.8158, 39.5)
(28.2895, 49.5)  (28.2895, 48.5)     (28.2895, 40.5)  (28.2895, 39.5)
(28.7632, 49.5)  (28.7632, 48.5)     (28.7632, 40.5)  (28.7632, 39.5)
(29.2368, 49.5)  (29.2368, 48.5)  …  (29.2368, 40.5)  (29.2368, 39.5)
``````
`gridbound(arrayreg,1,1)`
``````((19.763157894736842, 20.236842105263158), (49.5, 48.5))
``````
```# Adn test the fallback for a regular array
gridcoordinates(rand(10,10)) |> collect```
``````10×10 Array{Tuple{Float64,Float64},2}:
(0.5, 0.5)  (0.5, 1.5)  (0.5, 2.5)  …  (0.5, 7.5)  (0.5, 8.5)  (0.5, 9.5)
(1.5, 0.5)  (1.5, 1.5)  (1.5, 2.5)     (1.5, 7.5)  (1.5, 8.5)  (1.5, 9.5)
(2.5, 0.5)  (2.5, 1.5)  (2.5, 2.5)     (2.5, 7.5)  (2.5, 8.5)  (2.5, 9.5)
(3.5, 0.5)  (3.5, 1.5)  (3.5, 2.5)     (3.5, 7.5)  (3.5, 8.5)  (3.5, 9.5)
(4.5, 0.5)  (4.5, 1.5)  (4.5, 2.5)     (4.5, 7.5)  (4.5, 8.5)  (4.5, 9.5)
(5.5, 0.5)  (5.5, 1.5)  (5.5, 2.5)  …  (5.5, 7.5)  (5.5, 8.5)  (5.5, 9.5)
(6.5, 0.5)  (6.5, 1.5)  (6.5, 2.5)     (6.5, 7.5)  (6.5, 8.5)  (6.5, 9.5)
(7.5, 0.5)  (7.5, 1.5)  (7.5, 2.5)     (7.5, 7.5)  (7.5, 8.5)  (7.5, 9.5)
(8.5, 0.5)  (8.5, 1.5)  (8.5, 2.5)     (8.5, 7.5)  (8.5, 8.5)  (8.5, 9.5)
(9.5, 0.5)  (9.5, 1.5)  (9.5, 2.5)     (9.5, 7.5)  (9.5, 8.5)  (9.5, 9.5)
``````
```# Now an example for moving grids or grids that are not a product of some axes

"""
IrregularGrid

Traits describing a grid where the coordinates of one or more axes change along another axis.
Subtypes of this should directly implement the functions `gridcoordinates` as well as `gridbounds`
"""
abstract type IrregularGrid{T} end```
``````IrregularGrid
``````
```#Example implementation of the trait for NCDatasets
using NCDatasets

# This should actually look into the dataset and find out what is defined there and in particular it might
# respect lon_bnds or time_bnds to rather define a grid through its bounds if necessary
struct GenericNCDatasetGrid <: IrregularGrid{Center} end
hasgrid(::NCDatasets.CFVariable) = GenericNCDatasetGrid()
gridcoordinates(::GenericNCDatasetGrid,x) = zip(NCDatasets.coord(v,"longitude"),NCDatasets.coord(v,"latitude"))```
``````WARNING: using NCDatasets.dimnames in module Main conflicts with an existing identifier.

gridcoordinates (generic function with 3 methods)
``````
```ds = Dataset("/home/fgans/Downloads/ocean_his.nc")
v = ds["sustr"]
gridcoordinates(v) |> collect```
``````5×6 Array{Tuple{Union{Missing, Dates.DateTime, AbstractCFDateTime, Number},Union{Missing, Dates.DateTime, AbstractCFDateTime, Number}},2}:
(-73.1577, 38.5028)  (-73.2269, 38.5619)  …  (-73.5038, 38.7977)
(-73.0789, 38.5592)  (-73.1482, 38.6183)     (-73.4251, 38.8539)
(-73.0002, 38.6157)  (-73.0694, 38.6747)     (-73.3463, 38.9101)
(-72.9214, 38.6721)  (-72.9907, 38.731)      (-73.2676, 38.9663)
(-72.8427, 38.7284)  (-72.9119, 38.7873)     (-73.1888, 39.0224)
``````
`close(ds)`

### yeesian commented Oct 6, 2019 • edited

 This is great! I'm not sure if it'll help, but I'll just provide some probing questions: What does the interface provide for people who implement it? (Commonly/possibly interoperability with other packages that does IO/plotting/etc) Is the interface necessary/sufficient for making the functionality in (1) possible? What are the packages that will be able to implement the interface?

### rafaqz commented Oct 6, 2019 • edited

 Some answers (@meggart probably has better specifics for this gist) GeoData.jl provides plotting and some basic interoperable abstractions already (for ArchGDAL too, although a little broken still). It could provide IO and conversion between data types eventually if we can work all of this out. This gist adds the ability to describe the grid type and dispatch on it (Regular/Irregular and more specific types, with details about how cells are measured). Which is one of the main things GeoData.jl doesn't do (at all) - its currently limited to very simple grids. There some contingencies and other requirements, eg. ArchGDAL would need to provide an array interface to completely have 1 working, although some basic indexing already works just using `read` (see the gdal.jl tests in GeoData.jl). Hopefully whoever wants to. I've been writing implementations for ArchGDAL, NCDatasets and a HDF5 format in GeoData.jl, but we need to work out where they will live.

### yeesian commented Oct 6, 2019

 That's cool, thanks for writing it up! I've just gone through the announcement of GeoData.jl and some of its commits: I don't have the bandwidth (and relevant knowledge) to participate in the design of an interface for rasters, but I'll be happy to implement it for ArchGDAL (to remove the dependency of the interface on GDAL) once you all have figured it out!

### meggart commented Oct 7, 2019

 Thanks @rafaqz for replying. Yes, I think plottingwould be one target area, I would be nice to have plotting functions for different raster types. For me personally the motivation would be to make the processing tools in https://github.com/esa-esdl/ESDL.jl usable to a wider set of data, which is mainly a quite efficient mapslices and broadcast_slices on chunked out-of-core data.
to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.