Skip to content

Instantly share code, notes, and snippets.

@meggart
Last active October 7, 2019 10:09
Show Gist options
  • Save meggart/e29e6381d9400ff789eefbccc109d6f9 to your computer and use it in GitHub Desktop.
Save meggart/e29e6381d9400ff789eefbccc109d6f9 to your computer and use it in GitHub Desktop.
DimensionalData interface proposal.md
using DimensionalData
using GeoData
# Functions to be extended
"""
    gridcoordinates(::Gridtype,x)

Returns an iterator over the coordinates of a given grid with the same shape as `x`
"""
gridcoordinates(x) = gridcoordinates(hasgrid(x),x)

"""
    gridcoordinate(::Gridtype,x,i...)

Returns the coordinate associated with the value of `x[i...]`
"""
gridcoordinate(x,i...) = gridcoordinate(hasgrid(x),x,i...)

"""
    gridbounds(::Gridtype,x)

Returns the boundaries of every grid cell of `x`. The returned value will have the size `size(x) .+ 1`  
"""
gridbounds(x) = gridbounds(hasgrid(x),x)

"""
    gridbound(::Gridtype,x,i...)

For the value at index `i` returns the upper and lower bound of the grid cell for each dimension. 
"""
gridbound(x,i...) = gridbound(hasgrid(x),x,i...)

"""
    dimnames(::Gridtype, x)

Returns the names of the dimensions of x. 
"""
dimnames(x) = dimnames(hasgrid(x),x)

"""
    dimname(::Gridtype, x, i)

Returns the name of the ith dimension of x. 
"""
dimname(x,i) = dimname(hasgrid(x),x,i)
dimname
abstract type RegularGrid{T} end

"""
    RegularProductGrid

Trait describing a regular grid along all axes. 
To implement the trait, a dims() function is required, returning a list of 
dimension objects for each axis, which provide a `name` and a `vals` method.
"""
struct RegularProductGrid{T} <: RegularGrid{T} end

"""
    Center

To be used as a type parameter to described grids that are defined by their center coordinates.  
"""
struct Center end

# Implement the trait functions for regular grids where center coordinates are provided
gridcoordinates(::RegularGrid{Center},x) = Iterators.product(map(val,dims(x))...)
gridcoordinate(::RegularGrid{Center},x,i...) = map((d,j)->val(d)[j],dims(x),i)

gridbounds(::RegularGrid{Center},x) = Iterators.product(map(i->range(first(i)-step(i)/2,last(i)+step(i)/2,length=length(i)+1),val.(dims(x)))...)
gridbound(::RegularGrid{Center},x,i...) = map((d,j)->begin v = val(d);(v[j]-step(v)/2,v[j]+step(v)/2) end,dims(x),i);

dimnames(::RegularGrid{Center},x) = map(DimensionalData.name,dims(x))
dimname(::RegularGrid{Center},x,i) = DimensionalData.name(dims(x,i))
dimname (generic function with 2 methods)
# Implemetation for DimensionalData is very simple
hasgrid(::DimensionalArray) = RegularProductGrid{Center}()
hasgrid (generic function with 1 method)
#Fallback for regular arrays
struct UnknownGrid <: RegularGrid{Center} end
#One might think about using Base.axes here to support things like OffsetArrays etc...
hasgrid(::AbstractArray) = UnknownGrid()

struct UnknownDimension{I}
    length::Int
end
DimensionalData.val(x::UnknownDimension) = range(0.5,x.length-0.5,length=x.length)
DimensionalData.name(x::UnknownDimension{I}) where I = string("x",I)
DimensionalData.dims(x::AbstractArray) = ntuple(i->UnknownDimension{i}(size(x,i)),ndims(x))
# Test this with a DimensionalData array
arrayreg = DimensionalArray(rand(20,10),(Lon(20:29),Lat(49:-1:40)));
gridcoordinates(arrayreg) |> collect
20×10 Array{Tuple{Float64,Int64},2}:
 (20.0, 49)     (20.0, 48)     (20.0, 47)     …  (20.0, 41)     (20.0, 40)   
 (20.4737, 49)  (20.4737, 48)  (20.4737, 47)     (20.4737, 41)  (20.4737, 40)
 (20.9474, 49)  (20.9474, 48)  (20.9474, 47)     (20.9474, 41)  (20.9474, 40)
 (21.4211, 49)  (21.4211, 48)  (21.4211, 47)     (21.4211, 41)  (21.4211, 40)
 (21.8947, 49)  (21.8947, 48)  (21.8947, 47)     (21.8947, 41)  (21.8947, 40)
 (22.3684, 49)  (22.3684, 48)  (22.3684, 47)  …  (22.3684, 41)  (22.3684, 40)
 (22.8421, 49)  (22.8421, 48)  (22.8421, 47)     (22.8421, 41)  (22.8421, 40)
 (23.3158, 49)  (23.3158, 48)  (23.3158, 47)     (23.3158, 41)  (23.3158, 40)
 (23.7895, 49)  (23.7895, 48)  (23.7895, 47)     (23.7895, 41)  (23.7895, 40)
 (24.2632, 49)  (24.2632, 48)  (24.2632, 47)     (24.2632, 41)  (24.2632, 40)
 (24.7368, 49)  (24.7368, 48)  (24.7368, 47)  …  (24.7368, 41)  (24.7368, 40)
 (25.2105, 49)  (25.2105, 48)  (25.2105, 47)     (25.2105, 41)  (25.2105, 40)
 (25.6842, 49)  (25.6842, 48)  (25.6842, 47)     (25.6842, 41)  (25.6842, 40)
 (26.1579, 49)  (26.1579, 48)  (26.1579, 47)     (26.1579, 41)  (26.1579, 40)
 (26.6316, 49)  (26.6316, 48)  (26.6316, 47)     (26.6316, 41)  (26.6316, 40)
 (27.1053, 49)  (27.1053, 48)  (27.1053, 47)  …  (27.1053, 41)  (27.1053, 40)
 (27.5789, 49)  (27.5789, 48)  (27.5789, 47)     (27.5789, 41)  (27.5789, 40)
 (28.0526, 49)  (28.0526, 48)  (28.0526, 47)     (28.0526, 41)  (28.0526, 40)
 (28.5263, 49)  (28.5263, 48)  (28.5263, 47)     (28.5263, 41)  (28.5263, 40)
 (29.0, 49)     (29.0, 48)     (29.0, 47)        (29.0, 41)     (29.0, 40)   
dimname(arrayreg,1)
"Longitude"
gridcoordinate(arrayreg,1,1)
(20.0, 49)
# We can access the boundaries of each grid cell as well
gridbounds(arrayreg) |> collect
21×11 Array{Tuple{Float64,Float64},2}:
 (19.7632, 49.5)  (19.7632, 48.5)  …  (19.7632, 40.5)  (19.7632, 39.5)
 (20.2368, 49.5)  (20.2368, 48.5)     (20.2368, 40.5)  (20.2368, 39.5)
 (20.7105, 49.5)  (20.7105, 48.5)     (20.7105, 40.5)  (20.7105, 39.5)
 (21.1842, 49.5)  (21.1842, 48.5)     (21.1842, 40.5)  (21.1842, 39.5)
 (21.6579, 49.5)  (21.6579, 48.5)     (21.6579, 40.5)  (21.6579, 39.5)
 (22.1316, 49.5)  (22.1316, 48.5)  …  (22.1316, 40.5)  (22.1316, 39.5)
 (22.6053, 49.5)  (22.6053, 48.5)     (22.6053, 40.5)  (22.6053, 39.5)
 (23.0789, 49.5)  (23.0789, 48.5)     (23.0789, 40.5)  (23.0789, 39.5)
 (23.5526, 49.5)  (23.5526, 48.5)     (23.5526, 40.5)  (23.5526, 39.5)
 (24.0263, 49.5)  (24.0263, 48.5)     (24.0263, 40.5)  (24.0263, 39.5)
 (24.5, 49.5)     (24.5, 48.5)     …  (24.5, 40.5)     (24.5, 39.5)   
 (24.9737, 49.5)  (24.9737, 48.5)     (24.9737, 40.5)  (24.9737, 39.5)
 (25.4474, 49.5)  (25.4474, 48.5)     (25.4474, 40.5)  (25.4474, 39.5)
 (25.9211, 49.5)  (25.9211, 48.5)     (25.9211, 40.5)  (25.9211, 39.5)
 (26.3947, 49.5)  (26.3947, 48.5)     (26.3947, 40.5)  (26.3947, 39.5)
 (26.8684, 49.5)  (26.8684, 48.5)  …  (26.8684, 40.5)  (26.8684, 39.5)
 (27.3421, 49.5)  (27.3421, 48.5)     (27.3421, 40.5)  (27.3421, 39.5)
 (27.8158, 49.5)  (27.8158, 48.5)     (27.8158, 40.5)  (27.8158, 39.5)
 (28.2895, 49.5)  (28.2895, 48.5)     (28.2895, 40.5)  (28.2895, 39.5)
 (28.7632, 49.5)  (28.7632, 48.5)     (28.7632, 40.5)  (28.7632, 39.5)
 (29.2368, 49.5)  (29.2368, 48.5)  …  (29.2368, 40.5)  (29.2368, 39.5)
gridbound(arrayreg,1,1)
((19.763157894736842, 20.236842105263158), (49.5, 48.5))
# Adn test the fallback for a regular array
gridcoordinates(rand(10,10)) |> collect
10×10 Array{Tuple{Float64,Float64},2}:
 (0.5, 0.5)  (0.5, 1.5)  (0.5, 2.5)  …  (0.5, 7.5)  (0.5, 8.5)  (0.5, 9.5)
 (1.5, 0.5)  (1.5, 1.5)  (1.5, 2.5)     (1.5, 7.5)  (1.5, 8.5)  (1.5, 9.5)
 (2.5, 0.5)  (2.5, 1.5)  (2.5, 2.5)     (2.5, 7.5)  (2.5, 8.5)  (2.5, 9.5)
 (3.5, 0.5)  (3.5, 1.5)  (3.5, 2.5)     (3.5, 7.5)  (3.5, 8.5)  (3.5, 9.5)
 (4.5, 0.5)  (4.5, 1.5)  (4.5, 2.5)     (4.5, 7.5)  (4.5, 8.5)  (4.5, 9.5)
 (5.5, 0.5)  (5.5, 1.5)  (5.5, 2.5)  …  (5.5, 7.5)  (5.5, 8.5)  (5.5, 9.5)
 (6.5, 0.5)  (6.5, 1.5)  (6.5, 2.5)     (6.5, 7.5)  (6.5, 8.5)  (6.5, 9.5)
 (7.5, 0.5)  (7.5, 1.5)  (7.5, 2.5)     (7.5, 7.5)  (7.5, 8.5)  (7.5, 9.5)
 (8.5, 0.5)  (8.5, 1.5)  (8.5, 2.5)     (8.5, 7.5)  (8.5, 8.5)  (8.5, 9.5)
 (9.5, 0.5)  (9.5, 1.5)  (9.5, 2.5)     (9.5, 7.5)  (9.5, 8.5)  (9.5, 9.5)
# Now an example for moving grids or grids that are not a product of some axes

"""
    IrregularGrid


Traits describing a grid where the coordinates of one or more axes change along another axis. 
Subtypes of this should directly implement the functions `gridcoordinates` as well as `gridbounds`
"""
abstract type IrregularGrid{T} end
IrregularGrid
#Example implementation of the trait for NCDatasets
using NCDatasets

# This should actually look into the dataset and find out what is defined there and in particular it might
# respect lon_bnds or time_bnds to rather define a grid through its bounds if necessary
struct GenericNCDatasetGrid <: IrregularGrid{Center} end
hasgrid(::NCDatasets.CFVariable) = GenericNCDatasetGrid()
gridcoordinates(::GenericNCDatasetGrid,x) = zip(NCDatasets.coord(v,"longitude"),NCDatasets.coord(v,"latitude"))
WARNING: using NCDatasets.dimnames in module Main conflicts with an existing identifier.





gridcoordinates (generic function with 3 methods)
ds = Dataset("/home/fgans/Downloads/ocean_his.nc")
v = ds["sustr"]
gridcoordinates(v) |> collect
5×6 Array{Tuple{Union{Missing, Dates.DateTime, AbstractCFDateTime, Number},Union{Missing, Dates.DateTime, AbstractCFDateTime, Number}},2}:
 (-73.1577, 38.5028)  (-73.2269, 38.5619)  …  (-73.5038, 38.7977)
 (-73.0789, 38.5592)  (-73.1482, 38.6183)     (-73.4251, 38.8539)
 (-73.0002, 38.6157)  (-73.0694, 38.6747)     (-73.3463, 38.9101)
 (-72.9214, 38.6721)  (-72.9907, 38.731)      (-73.2676, 38.9663)
 (-72.8427, 38.7284)  (-72.9119, 38.7873)     (-73.1888, 39.0224)
close(ds)
@yeesian
Copy link

yeesian commented Oct 6, 2019

That's cool, thanks for writing it up!

I've just gone through the announcement of GeoData.jl and some of its commits: I don't have the bandwidth (and relevant knowledge) to participate in the design of an interface for rasters, but I'll be happy to implement it for ArchGDAL (to remove the dependency of the interface on GDAL) once you all have figured it out!

@meggart
Copy link
Author

meggart commented Oct 7, 2019

Thanks @rafaqz for replying. Yes, I think plottingwould be one target area, I would be nice to have plotting functions for different raster types. For me personally the motivation would be to make the processing tools in https://github.com/esa-esdl/ESDL.jl usable to a wider set of data, which is mainly a quite efficient mapslices and broadcast_slices on chunked out-of-core data.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment