Skip to content

Instantly share code, notes, and snippets.

@blaylockbk
Created October 8, 2021 22:47
Show Gist options
  • Save blaylockbk/40211cbbac1f9cb0a9eb399e7770ac52 to your computer and use it in GitHub Desktop.
Save blaylockbk/40211cbbac1f9cb0a9eb399e7770ac52 to your computer and use it in GitHub Desktop.
"""
Brian Blaylock
Most recent function is located here:
https://github.com/blaylockbk/Herbie/blob/master/herbie/tools.py
This function is much faster than my old `pluck_points` function found here:
https://github.com/blaylockbk/Carpenter_Workshop/blob/main/toolbox/gridded_data.py
"""
def nearest_points(ds, points, names=None, verbose=True):
"""
Get the nearest latitude/longitude points from a xarray Dataset.
This is **much** faster than my old "pluck_points" method. For
matchign 1,948 points,
- `nearest_points` completed in 7.5 seconds.
- `pluck_points` completed in 2 minutes.
Info
----
- Stack Overflow: https://stackoverflow.com/questions/58758480/xarray-select-nearest-lat-lon-with-multi-dimension-coordinates
- MetPy Details: https://unidata.github.io/MetPy/latest/tutorials/xarray_tutorial.html?highlight=assign_y_x
Parameters
----------
ds : a friendly xarray Dataset
points : tuple (lon, lat) or list of tuples
The longitude and latitude (lon, lat) coordinate pair (as a tuple)
for the points you want to pluck from the gridded Dataset.
A list of tuples may be given to return the values from multiple points.
names : list
A list of names for each point location (i.e., station name).
None will not append any names. names should be the same
length as points.
"""
# Check if MetPy has already parsed the CF metadata grid projection.
# Do that if it hasn't been done yet.
if 'metpy_crs' not in ds:
ds = ds.metpy.parse_cf()
# Apply the MetPy method `assign_y_x` to the dataset
# https://unidata.github.io/MetPy/latest/api/generated/metpy.xarray.html?highlight=assign_y_x#metpy.xarray.MetPyDataArrayAccessor.assign_y_x
ds = ds.metpy.assign_y_x()
# Convert the requested [(lon,lat), (lon,lat)] points to map projection.
# Accept a list of point tuples, or Shapely Points object.
# We want to index the dataset at a single point.
# We can do this by transforming a lat/lon point to the grid location
crs = ds.metpy_crs.item().to_cartopy()
# lat/lon input must be a numpy array, not a list or polygon
if isinstance(points, tuple):
# If a tuple is give, turn into a one-item list.
points = np.array([points])
if not isinstance(points, np.ndarray):
# Points must be a 2D numpy array
points = np.array(points)
lons = points[:,0]
lats = points[:,1]
transformed_data = crs.transform_points(ccrs.PlateCarree(), lons, lats)
xs = transformed_data[:,0]
ys = transformed_data[:,1]
# Select the nearest points from the projection coordinates.
# TODO: Is there a better way?
# There doesn't seem to be a way to get just the points like this
#ds = ds.sel(x=xs, y=ys, method='nearest')
# because it gives a 2D array, and not a point-by-point index.
# Instead, I have too loop the ds.sel method
new_ds = xr.concat([ds.sel(x=xi, y=yi, method='nearest') for xi, yi in zip(xs, ys)], dim='point')
# Add list of names as a coordinate
if names is not None:
# Assign the point dimension as the names.
assert len(points) == len(names), '`points` and `names` must be same length.'
new_ds['point'] = names
return new_ds
@blaylockbk
Copy link
Author

The nearest_points function scales much better than my old pluck_points function. Use nearest_points when you are matching 10 or more points.

image

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment