@bw4sz
Last active October 26, 2020 19:46
Utility functions to 1) convert projected shapefiles into the DeepForest annotation format and 2) convert DeepForest predictions into geospatial objects
import os

import geopandas as gp
import rasterio
import shapely.geometry


def shapefile_to_annotations(shapefile, rgb, savedir="."):
    """
    Convert a shapefile of annotations into an annotations csv file for DeepForest training and evaluation
    Args:
        shapefile: Path to a shapefile on disk. If a label column is present, it will be used, else all labels are assumed to be "Tree"
        rgb: Path to the RGB image on disk
        savedir: Directory to save csv files
    Returns:
        None: a csv file is written
    """
    # Read shapefile
    gdf = gp.read_file(shapefile)

    # Get the bounding box (minx, miny, maxx, maxy) of each geometry
    df = gdf.geometry.bounds

    # Raster bounds
    with rasterio.open(rgb) as src:
        left, bottom, right, top = src.bounds

    # Transform projected coordinates to image coordinates.
    # No division by pixel size is applied here, so the coordinates are
    # assumed to be scaled so that one map unit equals one pixel.
    df["tile_xmin"] = df.minx - left
    df["tile_xmin"] = df["tile_xmin"].astype(int)

    df["tile_xmax"] = df.maxx - left
    df["tile_xmax"] = df["tile_xmax"].astype(int)

    # UTM northing increases upward, but the image origin is the top left,
    # so y offsets are measured down from the top edge of the raster.
    df["tile_ymax"] = top - df.miny
    df["tile_ymax"] = df["tile_ymax"].astype(int)

    df["tile_ymin"] = top - df.maxy
    df["tile_ymin"] = df["tile_ymin"].astype(int)

    # Add labels if they exist
    if "label" in gdf.columns:
        df["label"] = gdf["label"]
    else:
        df["label"] = "Tree"

    # Add filename
    df["image_path"] = os.path.basename(rgb)

    # Select columns
    result = df[["image_path", "tile_xmin", "tile_ymin", "tile_xmax", "tile_ymax", "label"]]
    result = result.rename(columns={"tile_xmin": "xmin", "tile_ymin": "ymin", "tile_xmax": "xmax", "tile_ymax": "ymax"})

    image_name = os.path.splitext(os.path.basename(rgb))[0]
    csv_filename = os.path.join(savedir, "{}.csv".format(image_name))

    # Ensure no zero-area polygons due to rounding to pixel size
    result = result[~(result.xmin == result.xmax)]
    result = result[~(result.ymin == result.ymax)]

    # Write file
    result.to_csv(csv_filename, index=False)


def project(raster_path, boxes):
    """
    Convert image coordinates into a geospatial object to overlap with input image.
    Args:
        raster_path: path to the raster .tif on disk. Assumed to have a valid spatial projection
        boxes: a prediction pandas dataframe from deepforest.predict_tile()
    Returns:
        a geopandas dataframe with predictions in input projection.
    """
    with rasterio.open(raster_path) as dataset:
        bounds = dataset.bounds
        pixelSizeX, pixelSizeY = dataset.res
        # Read the CRS while the dataset is still open
        crs = dataset.crs

    # Scale by pixel size and add the raster origin. Recall that the numpy origin is top left, not bottom left.
    boxes["xmin"] = (boxes["xmin"] * pixelSizeX) + bounds.left
    boxes["xmax"] = (boxes["xmax"] * pixelSizeX) + bounds.left
    boxes["ymin"] = bounds.top - (boxes["ymin"] * pixelSizeY)
    boxes["ymax"] = bounds.top - (boxes["ymax"] * pixelSizeY)

    # Combine the columns into a shapely box object
    boxes["geometry"] = boxes.apply(lambda x: shapely.geometry.box(x.xmin, x.ymin, x.xmax, x.ymax), axis=1)
    boxes = gp.GeoDataFrame(boxes, geometry="geometry")
    boxes.crs = crs.to_wkt()

    # Shapefiles could be written with geopandas: boxes.to_file(<filename>, driver="ESRI Shapefile")
    return boxes
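
The pixel-to-map arithmetic in `project` can be checked by hand on a single box. The sketch below is a minimal pure-Python illustration and is not part of the gist: `pixel_box_to_map` is a hypothetical helper, and the raster bounds and 0.5-unit pixel size are made-up values. It returns bounds in map order (minx, miny, maxx, maxy), since the image y axis runs downward while the map y axis runs upward.

```python
def pixel_box_to_map(xmin, ymin, xmax, ymax, left, top, x_res, y_res):
    """Convert one pixel-space box to map coordinates (hypothetical helper)."""
    # x grows to the right in both systems: scale by pixel width, then add the left edge.
    map_minx = left + xmin * x_res
    map_maxx = left + xmax * x_res
    # Image rows grow downward while map y grows upward, so subtract from the top edge.
    # The pixel ymin (upper edge of the box) therefore becomes the map maxy.
    map_maxy = top - ymin * y_res
    map_miny = top - ymax * y_res
    return map_minx, map_miny, map_maxx, map_maxy


# Made-up raster: left edge at x=500000, top edge at y=4100100, 0.5 map units per pixel.
geo_box = pixel_box_to_map(10, 50, 30, 100, left=500000.0, top=4100100.0, x_res=0.5, y_res=0.5)
print(geo_box)
```

Note that `project` itself keeps the original column names, so after conversion its `ymin` column holds the larger of the two map y values.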

bw4sz commented Feb 7, 2020

I have tested this with UTM shapefiles from the Northern Hemisphere. It may need to be tweaked for the Southern Hemisphere (negative UTMs)?
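
One cheap way to probe this before trying real data is to run the same offset arithmetic on made-up coordinates. The sketch below assumes one map unit per pixel, as the gist does. `offsets_from_origin` is a hypothetical helper and all coordinates are invented. Because only differences from the raster's left and top edges are used, the three cases give the same pixel box, which suggests (but does not prove for real files) that the magnitude or sign of the northing does not matter.

```python
def offsets_from_origin(minx, miny, maxx, maxy, left, top):
    """Mirror the gist's arithmetic for one box, assuming 1 map unit per pixel."""
    xmin = int(minx - left)
    xmax = int(maxx - left)
    # y is measured downward from the top edge of the raster
    ymin = int(top - maxy)
    ymax = int(top - miny)
    return xmin, ymin, xmax, ymax


# Same box placed relative to three made-up raster origins
northern = offsets_from_origin(500010.0, 4100020.0, 500030.0, 4100050.0, left=500000.0, top=4100100.0)
large_northing = offsets_from_origin(500010.0, 6100020.0, 500030.0, 6100050.0, left=500000.0, top=6100100.0)
negative_northing = offsets_from_origin(500010.0, -180.0, 500030.0, -150.0, left=500000.0, top=-100.0)

print(northern, large_northing, negative_northing)
```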


mjb-oz commented Aug 3, 2020

Hi there,
I'm just having a play with deepforest and the shapefile_to_annotations tool.
I had to add a few lines of code to get it to work properly for me (at least, I think it's now working properly):

After line 23:
x_res, y_res = src.res # extract the pixel sizes for each dimension

Then divide through by the pixel sizes, so lines 26-38 become:

    #Transform project coordinates to image coordinates
    df["tile_xmin"] = df.minx - left
    df['tile_xmin'] = df['tile_xmin'] / x_res
    df["tile_xmin"] = df["tile_xmin"].astype(int)
    
    df["tile_xmax"] = df.maxx - left
    df['tile_xmax'] = df['tile_xmax'] / x_res
    df["tile_xmax"] = df["tile_xmax"].astype(int)
    
    #UTM is given from the top, but origin of an image is top left
    
    df["tile_ymax"] = top - df.miny
    df['tile_ymax'] = df['tile_ymax'] / y_res 
    df["tile_ymax"] = df["tile_ymax"].astype(int)
    
    df["tile_ymin"] = top - df.maxy
    df['tile_ymin'] = df['tile_ymin'] / y_res
    df["tile_ymin"] = df["tile_ymin"].astype(int) 

Hopefully this makes sense and is correct!
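
For a single box, the division by pixel size described above can be sketched in plain Python. `map_box_to_pixels` is a hypothetical helper, and the coordinates and the 0.5-unit resolution are made-up values. With the default resolution of 1.0 it reduces to the arithmetic in the original gist.

```python
def map_box_to_pixels(minx, miny, maxx, maxy, left, top, x_res=1.0, y_res=1.0):
    """Convert one map-space box to integer pixel coordinates (hypothetical helper)."""
    # Offset from the left edge, then divide by pixel width
    xmin = int((minx - left) / x_res)
    xmax = int((maxx - left) / x_res)
    # Offset downward from the top edge, then divide by pixel height
    ymin = int((top - maxy) / y_res)
    ymax = int((top - miny) / y_res)
    return xmin, ymin, xmax, ymax


box = dict(minx=500005.0, miny=4100050.0, maxx=500015.0, maxy=4100075.0)

# One map unit per pixel: same result as the original gist
unit_pixels = map_box_to_pixels(left=500000.0, top=4100100.0, **box)

# 0.5 map units per pixel: every offset doubles in pixel terms
half_unit_pixels = map_box_to_pixels(left=500000.0, top=4100100.0, x_res=0.5, y_res=0.5, **box)

print(unit_pixels, half_unit_pixels)
```

Like `astype(int)` in the gist, `int()` truncates toward zero, so resolutions that are not exactly representable as floats can shift a coordinate by one pixel.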


bw4sz commented Aug 12, 2020

Thanks for this comment, I'm looking now. Yes, that makes sense. I had already done that in a previous step before this gist. I'll leave this comment here, since it will depend on how the data is formatted. Good note. Feel free to drop comments in the issues with any other notes on deepforest. I'm eager to hear from users about how to make it most useful.