@Nageshbansal
Last active April 4, 2023 09:39
Google Summer of Code'22 Final Product

Google Summer of Code 2022 Final Work Product

Proposed Objectives

  • Querying the NEON-DATA-API
  • Retrieval of Vegetation Structure (VST) Data
  • Translation of VST-related functions from neonVegWrangleR
  • Retrieval of Airborne Observation Platform (AOP) Data
  • Refactor AOP Functions from neonVegWrangleR

Objectives Completed

1. Querying the NEON-DATA-API

The first task of this pipeline was to query the NEON data API to retrieve the datasets. The neonVegWrangleR package relies on the neon-utilities package, but no Python wrapper for that package is available yet, so the problem can be solved in two ways:

  • Refactoring the neon-utilities package functions in Python
  • Calling the R functions from the Python package with the help of the rpy2 package

This pipeline proceeds with the first approach, and functions such as loadByProduct, zipsByProduct, stackByTables, and byTileAOP were refactored successfully.

For now, these functions download only the Vegetation Structure data and the Airborne Observation Platform data, but we have been discussing how to generalize them to other data products as well.
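
As a rough illustration of what querying the API involves, the sketch below builds request URLs for the public NEON data API and extracts file URLs from a response body. The helper names (`product_url`, `data_url`, `file_urls`) and the sample response are hypothetical and are not taken from neonwranglerpy. The endpoint layout (`/products/{code}` and `/data/{code}/{site}/{year-month}`) and the `data.files` response shape are my understanding of the public API and should be checked against its documentation.

```python
import json

# Base URL of the public NEON data API (version 0).
NEON_API = "https://data.neonscience.org/api/v0"


def product_url(dpid):
    """URL describing a data product, including site and month availability."""
    return f"{NEON_API}/products/{dpid}"


def data_url(dpid, site, month):
    """URL listing the files for one product, site, and month (YYYY-MM)."""
    return f"{NEON_API}/data/{dpid}/{site}/{month}"


def file_urls(response_text, suffix=".csv"):
    """Pick download URLs with a given suffix out of a /data response body."""
    payload = json.loads(response_text)
    files = payload.get("data", {}).get("files", [])
    return [f["url"] for f in files if f["name"].endswith(suffix)]


# Hypothetical response body, shaped like a /data reply (assumed layout).
sample = json.dumps({"data": {"files": [
    {"name": "vst_mappingandtagging.csv", "url": "https://example.org/a.csv"},
    {"name": "readme.txt", "url": "https://example.org/readme.txt"},
]}})

# DP1.10098.001 is the Vegetation Structure product code.
print(data_url("DP1.10098.001", "HARV", "2018-08"))
print(file_urls(sample))
```

Fetching the URLs is left out on purpose so the sketch runs offline; any HTTP client could be plugged in around these helpers.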

Pull Requests:

Issues:

2. Retrieval of Vegetation Structure (VST) Data

The Vegetation Structure data can be retrieved with the load_by_product() function, or with the zips_by_product() and stack_by_table() functions used together. Tests and docs for these functions were added, and tutorials on using them were created.

Pull Requests:

3. Translation of VST-related functions from neonVegWrangleR

The VST-related functions from the neonVegWrangleR package were refactored in Python: retrieve_VST_Data, retrieve_coords_itc, and retrieve_dist_to_utm. These functions retrieve the VST data using the load_by_product function, add UTM coordinates to the VST entries based on azimuth and distance, and merge these coordinates with the apparent_individual entries.
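
The azimuth-and-distance step can be sketched with basic trigonometry. This is a minimal illustration, not the retrieve_dist_to_utm implementation: the function name `stem_utm` and its parameters are hypothetical, and it assumes the azimuth is measured in degrees clockwise from north and that grid north can stand in for the measured north (no declination or convergence correction).

```python
import math


def stem_utm(ref_easting, ref_northing, azimuth_deg, distance_m):
    """Offset a reference point by a distance along an azimuth.

    Azimuth is taken clockwise from north, so the east component uses
    sine and the north component uses cosine.
    """
    theta = math.radians(azimuth_deg)
    easting = ref_easting + distance_m * math.sin(theta)
    northing = ref_northing + distance_m * math.cos(theta)
    return easting, northing


# A stem 10 m due east (azimuth 90 degrees) of a made-up reference point.
east, north = stem_utm(732000.0, 4713000.0, 90.0, 10.0)
print(round(east, 3), round(north, 3))
```

The resulting easting and northing could then be joined back onto the individual-level records by their identifiers.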

Pull Requests:

4. Retrieval of Airborne Observation Platform (AOP) Data

The Airborne Observation Platform data can be downloaded using the by_tile_aop() function. I refactored the byTileAOP() function, along with the get_tile_urls() function, from the neon-utilities package.
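
Downloading by tile starts with working out which tiles cover the requested coordinates. The sketch below assumes that AOP mosaic tiles are 1 km by 1 km and are identified by the UTM coordinates of their lower-left corner. The helper names `tile_corner` and `unique_tiles` are hypothetical and do not come from neonwranglerpy or neon-utilities.

```python
import math

TILE_SIZE_M = 1000  # assumed tile edge length in metres


def tile_corner(easting, northing, tile_size=TILE_SIZE_M):
    """Round a UTM coordinate down to the lower-left corner of its tile."""
    return (int(math.floor(easting / tile_size)) * tile_size,
            int(math.floor(northing / tile_size)) * tile_size)


def unique_tiles(coords):
    """Collapse many point coordinates into the distinct tiles they fall in."""
    return sorted({tile_corner(e, n) for e, n in coords})


# Made-up stem coordinates; the first two share a tile.
points = [(732183.2, 4713265.9), (732900.0, 4713001.0), (733000.0, 4713999.9)]
print(unique_tiles(points))
```

Deduplicating tiles before requesting files avoids downloading the same tile once per stem.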

Pull Requests:

5. Refactor AOP Functions from neonVegWrangleR

The AOP-related functions from the neonVegWrangleR package were refactored in Python. The functions covered were:

  • retrieve_AOP_Data: This function retrieves the AOP data using the by_tile_aop() function for the given individual vegetation structure data coordinates.
  • crop_plot_data(): This function creates a shapefile from vegetation structure data with lat/lon coordinates, and then applies the clip_plot() function to clip the plots from the AOP data using this shapefile.
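
To make the clipping idea concrete, here is a simplified sketch that swaps the shapefile for a square bounding box around a plot centre and clips a plain 2D grid. The names `plot_bounds` and `clip_grid`, the buffer value, and the grid layout (rows ordered north to south, origin at the upper-left corner) are assumptions for illustration only. The real functions work on raster, lidar, and HDF5 files rather than Python lists.

```python
import math


def plot_bounds(easting, northing, buffer_m=20.0):
    """Square bounding box (xmin, ymin, xmax, ymax) around a plot centre."""
    return (easting - buffer_m, northing - buffer_m,
            easting + buffer_m, northing + buffer_m)


def _clamp(value, low, high):
    """Keep an index inside [low, high]."""
    return max(low, min(high, value))


def clip_grid(grid, origin_x, origin_y, pixel_size, bounds):
    """Return the cells of `grid` that overlap `bounds`.

    (origin_x, origin_y) is the upper-left corner of the grid, so row
    indices grow as the northing decreases.
    """
    xmin, ymin, xmax, ymax = bounds
    n_rows, n_cols = len(grid), len(grid[0])
    col_start = _clamp(math.floor((xmin - origin_x) / pixel_size), 0, n_cols)
    col_stop = _clamp(math.ceil((xmax - origin_x) / pixel_size), 0, n_cols)
    row_start = _clamp(math.floor((origin_y - ymax) / pixel_size), 0, n_rows)
    row_stop = _clamp(math.ceil((origin_y - ymin) / pixel_size), 0, n_rows)
    return [row[col_start:col_stop] for row in grid[row_start:row_stop]]


# A 4 x 4 grid with 1 m pixels whose upper-left corner is at (0, 4).
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
window = clip_grid(grid, 0.0, 4.0, 1.0, plot_bounds(2.0, 2.0, buffer_m=1.0))
print(window)  # [[5, 6], [9, 10]]
```

A box that falls entirely outside the grid yields an empty list, which gives the caller a simple way to skip plots with no coverage.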

Pull Requests:

Objectives in Progress

  • clip_plot function: In neonVegWrangleR, the clip_plot function clips plots around NEON VST data. It works on the following types of data: (1) raster (.tif), (2) lidar point cloud (.laz), and (3) hierarchical data (.h5). It includes the following functions:
    • Clip_raster()
    • Clip_lidar()
    • Clip-hdf

Pull requests: weecology/neonwranglerpy#34

  • In the crop_plot_data function, parallel processing needs to be implemented, since applying the clip_plot function over the entire dataframe synchronously is not efficient.
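
One possible shape for that parallel step, sketched with the standard library: each plot record is handed to a worker pool instead of being processed in a single loop. The `clip_one` function is a hypothetical placeholder for the real per-plot clipping, and the record fields are made up. A thread pool suits work dominated by file reads and downloads; CPU-heavy clipping may be better served by `ProcessPoolExecutor`.

```python
from concurrent.futures import ThreadPoolExecutor


def clip_one(plot):
    """Placeholder for the per-plot work; returns the name of a clipped file."""
    return f"{plot['plot_id']}_clipped.tif"


def clip_all(plots, max_workers=4):
    """Run clip_one over every plot record using a pool of worker threads.

    Executor.map returns results in the same order as the input records.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(clip_one, plots))


# Made-up plot records standing in for rows of the VST dataframe.
plots = [{"plot_id": "HARV_001"}, {"plot_id": "HARV_002"}]
print(clip_all(plots))
```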

Other Objectives

1. Unit Testing of the Package

Unit tests were created for functions such as loadByProduct, zipsByProduct, stackByTables, byTileAOP, retrieve_aop_data, and retrieve_vst_data.

Pull Requests:

2. Documentation and Setting Up CI/CD

Pull Requests:

Future Work

The goal of the project was to implement a Python version of the neonVegWrangleR package, which is used for integrating the NEON Vegetation Structure (VST) and Airborne Observation Platform (AOP) data. Only these two data products have been integrated into the package so far, but we have been discussing how to generalize it to other data products as well. I plan to do the following work in the future:

  • Implementation for the CFC dataset (DP1.10026.001): We can also add support for CFC data, which is generally equivalent to VST data. To handle CFC data, we need to set up a pipeline as we did for VST.

  • Refactoring of the function get_lat_lon.R: This function will help us calculate latitude and longitude values for each stem in the VST data. For now, we are not calculating latitude and longitude separately.

  • Asynchronous downloading and processing of data can be implemented to make the pipeline faster.

I plan to continue contributing to Data Retriever and neonwranglerpy after GSoC'22 and to become an active contributor to the repository.

Tutorials and Blogs

During the GSoC period, my mentors Henry Senyondo and Sergio Marconi motivated me to write blogs and tutorials explaining my work on this project and new tech stacks such as Apache web servers and packaging in C++.

| Description | Blog Link |
|---|---|
| GSoC'22: Community Bonding Period | Blog |
| GSoC'22: Setting Up Project | Link |
| GSoC'22: Querying the NEON-DATA-API | Link |
| GSoC'22: Retrieval of Vegetation Structure (VST) Data | Link |
| GSoC'22: Translation of VST-related functions from neonVegWrangleR | Link |
| Configuration of Apache Web Server | Link |
| Creating a C++ Package | Link |

For me, the last three months have been an incredible learning experience, and I am grateful for everything I've learned. I learnt CI/CD using Docker and GitHub Actions, interfacing between R and Python, and using REST APIs. The entire experience has really aided my overall growth as a developer, and I can confidently state that this has been the most fruitful summer of my life!
