Skip to content

Instantly share code, notes, and snippets.

@meggart
Last active October 7, 2019 10:09
Show Gist options
  • Save meggart/e29e6381d9400ff789eefbccc109d6f9 to your computer and use it in GitHub Desktop.
Save meggart/e29e6381d9400ff789eefbccc109d6f9 to your computer and use it in GitHub Desktop.
DimensionalData interface proposal.md
using DimensionalData
using GeoData
# Functions to be extended
"""
    gridcoordinates(::Gridtype,x)

Returns an iterator over the coordinates of a given grid with the same shape as `x`
"""
gridcoordinates(x) = gridcoordinates(hasgrid(x),x)

"""
    gridcoordinate(::Gridtype,x,i...)

Returns the coordinate associated with the value of `x[i...]`
"""
gridcoordinate(x,i...) = gridcoordinate(hasgrid(x),x,i...)

"""
    gridbounds(::Gridtype,x)

Returns the boundaries of every grid cell of `x`. The returned value will have the size `size(x) .+ 1`  
"""
gridbounds(x) = gridbounds(hasgrid(x),x)

"""
    gridbound(::Gridtype,x,i...)

For the value at index `i` returns the upper and lower bound of the grid cell for each dimension. 
"""
gridbound(x,i...) = gridbound(hasgrid(x),x,i...)

"""
    dimnames(::Gridtype, x)

Returns the names of the dimensions of x. 
"""
dimnames(x) = dimnames(hasgrid(x),x)

"""
    dimname(::Gridtype, x, i)

Returns the name of the ith dimension of x. 
"""
dimname(x,i) = dimname(hasgrid(x),x,i)
dimname
abstract type RegularGrid{T} end

"""
    RegularProductGrid

Trait describing a regular grid along all axes. 
To implement the trait, a dims() function is required, returning a list of 
dimension objects for each axis, which provide a `name` and a `vals` method.
"""
struct RegularProductGrid{T} <: RegularGrid{T} end

"""
    Center

To be used as a type parameter to described grids that are defined by their center coordinates.  
"""
struct Center end

# Implement the trait functions for regular grids where center coordinates are provided
gridcoordinates(::RegularGrid{Center},x) = Iterators.product(map(val,dims(x))...)
gridcoordinate(::RegularGrid{Center},x,i...) = map((d,j)->val(d)[j],dims(x),i)

gridbounds(::RegularGrid{Center},x) = Iterators.product(map(i->range(first(i)-step(i)/2,last(i)+step(i)/2,length=length(i)+1),val.(dims(x)))...)
gridbound(::RegularGrid{Center},x,i...) = map((d,j)->begin v = val(d);(v[j]-step(v)/2,v[j]+step(v)/2) end,dims(x),i);

dimnames(::RegularGrid{Center},x) = map(DimensionalData.name,dims(x))
dimname(::RegularGrid{Center},x,i) = DimensionalData.name(dims(x,i))
dimname (generic function with 2 methods)
# Implemetation for DimensionalData is very simple
hasgrid(::DimensionalArray) = RegularProductGrid{Center}()
hasgrid (generic function with 1 method)
#Fallback for regular arrays
struct UnknownGrid <: RegularGrid{Center} end
#One might think about using Base.axes here to support things like OffsetArrays etc...
hasgrid(::AbstractArray) = UnknownGrid()

struct UnknownDimension{I}
    length::Int
end
DimensionalData.val(x::UnknownDimension) = range(0.5,x.length-0.5,length=x.length)
DimensionalData.name(x::UnknownDimension{I}) where I = string("x",I)
DimensionalData.dims(x::AbstractArray) = ntuple(i->UnknownDimension{i}(size(x,i)),ndims(x))
# Test this with a DimensionalData array
arrayreg = DimensionalArray(rand(20,10),(Lon(20:29),Lat(49:-1:40)));
gridcoordinates(arrayreg) |> collect
20×10 Array{Tuple{Float64,Int64},2}:
 (20.0, 49)     (20.0, 48)     (20.0, 47)     …  (20.0, 41)     (20.0, 40)   
 (20.4737, 49)  (20.4737, 48)  (20.4737, 47)     (20.4737, 41)  (20.4737, 40)
 (20.9474, 49)  (20.9474, 48)  (20.9474, 47)     (20.9474, 41)  (20.9474, 40)
 (21.4211, 49)  (21.4211, 48)  (21.4211, 47)     (21.4211, 41)  (21.4211, 40)
 (21.8947, 49)  (21.8947, 48)  (21.8947, 47)     (21.8947, 41)  (21.8947, 40)
 (22.3684, 49)  (22.3684, 48)  (22.3684, 47)  …  (22.3684, 41)  (22.3684, 40)
 (22.8421, 49)  (22.8421, 48)  (22.8421, 47)     (22.8421, 41)  (22.8421, 40)
 (23.3158, 49)  (23.3158, 48)  (23.3158, 47)     (23.3158, 41)  (23.3158, 40)
 (23.7895, 49)  (23.7895, 48)  (23.7895, 47)     (23.7895, 41)  (23.7895, 40)
 (24.2632, 49)  (24.2632, 48)  (24.2632, 47)     (24.2632, 41)  (24.2632, 40)
 (24.7368, 49)  (24.7368, 48)  (24.7368, 47)  …  (24.7368, 41)  (24.7368, 40)
 (25.2105, 49)  (25.2105, 48)  (25.2105, 47)     (25.2105, 41)  (25.2105, 40)
 (25.6842, 49)  (25.6842, 48)  (25.6842, 47)     (25.6842, 41)  (25.6842, 40)
 (26.1579, 49)  (26.1579, 48)  (26.1579, 47)     (26.1579, 41)  (26.1579, 40)
 (26.6316, 49)  (26.6316, 48)  (26.6316, 47)     (26.6316, 41)  (26.6316, 40)
 (27.1053, 49)  (27.1053, 48)  (27.1053, 47)  …  (27.1053, 41)  (27.1053, 40)
 (27.5789, 49)  (27.5789, 48)  (27.5789, 47)     (27.5789, 41)  (27.5789, 40)
 (28.0526, 49)  (28.0526, 48)  (28.0526, 47)     (28.0526, 41)  (28.0526, 40)
 (28.5263, 49)  (28.5263, 48)  (28.5263, 47)     (28.5263, 41)  (28.5263, 40)
 (29.0, 49)     (29.0, 48)     (29.0, 47)        (29.0, 41)     (29.0, 40)   
dimname(arrayreg,1)
"Longitude"
gridcoordinate(arrayreg,1,1)
(20.0, 49)
# We can access the boundaries of each grid cell as well
gridbounds(arrayreg) |> collect
21×11 Array{Tuple{Float64,Float64},2}:
 (19.7632, 49.5)  (19.7632, 48.5)  …  (19.7632, 40.5)  (19.7632, 39.5)
 (20.2368, 49.5)  (20.2368, 48.5)     (20.2368, 40.5)  (20.2368, 39.5)
 (20.7105, 49.5)  (20.7105, 48.5)     (20.7105, 40.5)  (20.7105, 39.5)
 (21.1842, 49.5)  (21.1842, 48.5)     (21.1842, 40.5)  (21.1842, 39.5)
 (21.6579, 49.5)  (21.6579, 48.5)     (21.6579, 40.5)  (21.6579, 39.5)
 (22.1316, 49.5)  (22.1316, 48.5)  …  (22.1316, 40.5)  (22.1316, 39.5)
 (22.6053, 49.5)  (22.6053, 48.5)     (22.6053, 40.5)  (22.6053, 39.5)
 (23.0789, 49.5)  (23.0789, 48.5)     (23.0789, 40.5)  (23.0789, 39.5)
 (23.5526, 49.5)  (23.5526, 48.5)     (23.5526, 40.5)  (23.5526, 39.5)
 (24.0263, 49.5)  (24.0263, 48.5)     (24.0263, 40.5)  (24.0263, 39.5)
 (24.5, 49.5)     (24.5, 48.5)     …  (24.5, 40.5)     (24.5, 39.5)   
 (24.9737, 49.5)  (24.9737, 48.5)     (24.9737, 40.5)  (24.9737, 39.5)
 (25.4474, 49.5)  (25.4474, 48.5)     (25.4474, 40.5)  (25.4474, 39.5)
 (25.9211, 49.5)  (25.9211, 48.5)     (25.9211, 40.5)  (25.9211, 39.5)
 (26.3947, 49.5)  (26.3947, 48.5)     (26.3947, 40.5)  (26.3947, 39.5)
 (26.8684, 49.5)  (26.8684, 48.5)  …  (26.8684, 40.5)  (26.8684, 39.5)
 (27.3421, 49.5)  (27.3421, 48.5)     (27.3421, 40.5)  (27.3421, 39.5)
 (27.8158, 49.5)  (27.8158, 48.5)     (27.8158, 40.5)  (27.8158, 39.5)
 (28.2895, 49.5)  (28.2895, 48.5)     (28.2895, 40.5)  (28.2895, 39.5)
 (28.7632, 49.5)  (28.7632, 48.5)     (28.7632, 40.5)  (28.7632, 39.5)
 (29.2368, 49.5)  (29.2368, 48.5)  …  (29.2368, 40.5)  (29.2368, 39.5)
gridbound(arrayreg,1,1)
((19.763157894736842, 20.236842105263158), (49.5, 48.5))
# Adn test the fallback for a regular array
gridcoordinates(rand(10,10)) |> collect
10×10 Array{Tuple{Float64,Float64},2}:
 (0.5, 0.5)  (0.5, 1.5)  (0.5, 2.5)  …  (0.5, 7.5)  (0.5, 8.5)  (0.5, 9.5)
 (1.5, 0.5)  (1.5, 1.5)  (1.5, 2.5)     (1.5, 7.5)  (1.5, 8.5)  (1.5, 9.5)
 (2.5, 0.5)  (2.5, 1.5)  (2.5, 2.5)     (2.5, 7.5)  (2.5, 8.5)  (2.5, 9.5)
 (3.5, 0.5)  (3.5, 1.5)  (3.5, 2.5)     (3.5, 7.5)  (3.5, 8.5)  (3.5, 9.5)
 (4.5, 0.5)  (4.5, 1.5)  (4.5, 2.5)     (4.5, 7.5)  (4.5, 8.5)  (4.5, 9.5)
 (5.5, 0.5)  (5.5, 1.5)  (5.5, 2.5)  …  (5.5, 7.5)  (5.5, 8.5)  (5.5, 9.5)
 (6.5, 0.5)  (6.5, 1.5)  (6.5, 2.5)     (6.5, 7.5)  (6.5, 8.5)  (6.5, 9.5)
 (7.5, 0.5)  (7.5, 1.5)  (7.5, 2.5)     (7.5, 7.5)  (7.5, 8.5)  (7.5, 9.5)
 (8.5, 0.5)  (8.5, 1.5)  (8.5, 2.5)     (8.5, 7.5)  (8.5, 8.5)  (8.5, 9.5)
 (9.5, 0.5)  (9.5, 1.5)  (9.5, 2.5)     (9.5, 7.5)  (9.5, 8.5)  (9.5, 9.5)
# Now an example for moving grids or grids that are not a product of some axes

"""
    IrregularGrid


Traits describing a grid where the coordinates of one or more axes change along another axis. 
Subtypes of this should directly implement the functions `gridcoordinates` as well as `gridbounds`
"""
abstract type IrregularGrid{T} end
IrregularGrid
#Example implementation of the trait for NCDatasets
using NCDatasets

# This should actually look into the dataset and find out what is defined there and in particular it might
# respect lon_bnds or time_bnds to rather define a grid through its bounds if necessary
struct GenericNCDatasetGrid <: IrregularGrid{Center} end
hasgrid(::NCDatasets.CFVariable) = GenericNCDatasetGrid()
gridcoordinates(::GenericNCDatasetGrid,x) = zip(NCDatasets.coord(v,"longitude"),NCDatasets.coord(v,"latitude"))
WARNING: using NCDatasets.dimnames in module Main conflicts with an existing identifier.





gridcoordinates (generic function with 3 methods)
ds = Dataset("/home/fgans/Downloads/ocean_his.nc")
v = ds["sustr"]
gridcoordinates(v) |> collect
5×6 Array{Tuple{Union{Missing, Dates.DateTime, AbstractCFDateTime, Number},Union{Missing, Dates.DateTime, AbstractCFDateTime, Number}},2}:
 (-73.1577, 38.5028)  (-73.2269, 38.5619)  …  (-73.5038, 38.7977)
 (-73.0789, 38.5592)  (-73.1482, 38.6183)     (-73.4251, 38.8539)
 (-73.0002, 38.6157)  (-73.0694, 38.6747)     (-73.3463, 38.9101)
 (-72.9214, 38.6721)  (-72.9907, 38.731)      (-73.2676, 38.9663)
 (-72.8427, 38.7284)  (-72.9119, 38.7873)     (-73.1888, 39.0224)
close(ds)
@meggart
Copy link
Author

meggart commented Oct 7, 2019

Thanks @rafaqz for replying. Yes, I think plottingwould be one target area, I would be nice to have plotting functions for different raster types. For me personally the motivation would be to make the processing tools in https://github.com/esa-esdl/ESDL.jl usable to a wider set of data, which is mainly a quite efficient mapslices and broadcast_slices on chunked out-of-core data.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment