@himanshu-02
Last active October 23, 2023 16:45
GSoCReport'23

Final submission report for GSoC 2023

Details

Name Himanshu Chougule
Organization INCF
Mentors Bradly Alicea, Jiahang Li, Mayukh Deb, Jessie Parent
Project Title D-GNNs: developing DevoGraph for computational developmental biology
Project page https://github.com/devoworm/GSoC-2023

Project Description

D-GNNs: developing DevoGraph for computational developmental biology

This project uses topological data analysis (TDA) over Graph Neural Networks (GNNs) to discover the underlying connectivity of a growing network that undergoes both shape and size transformations in C. elegans. The project has two main parts, both of which aim to integrate previous work on embryo networks, developmental connectomes and embryo differentiation.

  1. An instance segmentation pipeline
  2. Topological data analysis over stage-2 of DevoGraph to capture the underlying persistence features.

The instance segmentation task was completed by applying the watershed algorithm to the semantic segmentation results produced by DevoLearn. A Mask R-CNN fine-tuning approach is proposed as well. Topological data analysis was carried out on different microscopy datasets of the nematode Caenorhabditis elegans. The persistence diagrams inferred from the C. elegans data are analyzed in their respective notebooks.

Instance segmentation

So what is instance segmentation? How is it different from standard segmentation techniques, and why is it better for microscopy images? Instance segmentation is a computer vision task that goes beyond standard (semantic) segmentation. It not only assigns each pixel to an object class but also distinguishes and delineates the individual instances of each class. For microscopy images this is more useful than semantic segmentation alone, because identifying each cell separately is what makes cell counting and tracking possible.

Instance segmentation can be achieved with a variety of techniques. The two methods I researched were Mask R-CNN and watershed segmentation. Mask R-CNN (Mask Region-based Convolutional Neural Network) is a deep learning technique for object instance segmentation that combines object detection with pixel-level image segmentation. It extends the popular Faster R-CNN architecture with an additional branch that predicts a pixel-level mask for each detected object.

We find the bounding boxes of the cells so that they can be incorporated into our Mask R-CNN pipeline (code given).

bounding boxes
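As a minimal sketch of the bounding-box step (not the project's actual code), the snippet below derives one box per cell from an instance label image. The function name `boxes_from_labels`, the convention that label 0 is background, and the toy `labels` array are all assumptions made for illustration. Boxes are returned as `(x_min, y_min, x_max, y_max)` with an exclusive upper corner, so that every box has positive width and height.

```python
import numpy as np


def boxes_from_labels(labels):
    """Return {instance_id: (x_min, y_min, x_max, y_max)} for each nonzero label.

    Hypothetical helper: assumes `labels` is a 2D integer array in which
    0 is background and each cell has its own positive id.
    """
    boxes = {}
    for instance_id in np.unique(labels):
        if instance_id == 0:  # skip the assumed background label
            continue
        rows, cols = np.nonzero(labels == instance_id)
        # +1 makes the upper corner exclusive, so a one-pixel cell still has area.
        boxes[int(instance_id)] = (
            int(cols.min()), int(rows.min()),
            int(cols.max()) + 1, int(rows.max()) + 1,
        )
    return boxes


# Toy label image with two rectangular "cells".
labels = np.zeros((6, 8), dtype=int)
labels[1:3, 1:4] = 1
labels[3:6, 5:8] = 2

boxes = boxes_from_labels(labels)
print(boxes)
```

Each entry can then be paired with the binary mask `labels == instance_id` to form one training target per cell.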

Watershed segmentation is an image processing technique that identifies object boundaries by treating the image as a topographic map. It works by considering pixels as elevation values and flooding the "landscape" from multiple markers (seed points). Regions associated with the markers are separated by watershed lines, providing a segmentation that can be useful for separating touching objects or delineating structures in an image.
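The flooding idea can be sketched in a few lines. This is a simplified, illustrative priority-flood variant written from scratch, not the routine used in the project's notebooks. The function name `watershed_flood`, the 4-connectivity, and the toy `elevation` array are assumptions. Unlike the classical formulation, it assigns every pixel to a basin and draws no explicit watershed lines, so the boundaries are simply where the labels change.

```python
import heapq

import numpy as np


def watershed_flood(elevation, markers):
    """Flood `elevation` from the nonzero seeds in `markers` (4-connectivity).

    Illustrative sketch: pixels are visited in order of increasing elevation,
    and each unlabeled neighbor inherits the label of the pixel that reached it.
    """
    labels = markers.copy()
    rows, cols = elevation.shape
    heap = []
    counter = 0  # tie-breaker: equal elevations are processed in insertion order
    for r, c in zip(*np.nonzero(markers)):
        heapq.heappush(heap, (float(elevation[r, c]), counter, int(r), int(c)))
        counter += 1
    while heap:
        _, _, r, c = heapq.heappop(heap)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr, nc] == 0:
                labels[nr, nc] = labels[r, c]
                heapq.heappush(heap, (float(elevation[nr, nc]), counter, nr, nc))
                counter += 1
    return labels


# Two basins (low values) separated by a high ridge in the middle column.
elevation = np.array([
    [3, 2, 3, 9, 3, 2, 3],
    [2, 1, 2, 9, 2, 1, 2],
    [3, 2, 3, 9, 3, 2, 3],
], dtype=float)

markers = np.zeros(elevation.shape, dtype=int)
markers[1, 1] = 1  # seed in the left basin
markers[1, 5] = 2  # seed in the right basin

segmented = watershed_flood(elevation, markers)
print(segmented)
```

For touching cells, a common choice of elevation is the negated distance transform of the binary mask, with one marker placed near the center of each cell.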

Here are some more outputs!

  1. The semantic segmentation image

devolearn-seg

  2. Instance segmentation using the watershed algorithm


instance-red-edge

instance-seg-diff-color

Topological data analysis

Topological data analysis (TDA) is a branch of mathematics and data analysis that aims to uncover the underlying topological properties and features in complex data sets. It is particularly useful for understanding and characterizing the shape, structure, and connectivity of data, especially in high-dimensional spaces.

I applied TDA concepts, mainly persistent homology and lower star filtration, to various C. elegans datasets.

Persistent homology is a technique used to study and quantify the topological features of data. The main idea is to build a nested sequence of simplicial complexes (a filtration) that represents the data at increasing scales. The homology groups of each complex are then computed (informally, these capture the connected components and "holes" in the data). Finally, we track the appearance and disappearance of these features across the filtration and visualize them using a persistence diagram.
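To make the birth/death bookkeeping concrete, here is a minimal sketch that computes only the 0-dimensional part (connected components) of a Vietoris-Rips filtration on a small point cloud. It is an illustration written for this report, not the library routine used in the notebooks. The function name `h0_persistence` and the sample `points` are assumptions, and an edge is taken to enter the filtration at the full distance between its endpoints.

```python
import itertools
import math


def h0_persistence(points):
    """Return (birth, death) pairs for connected components of a point cloud.

    Every point is born at scale 0. Edges are added in order of length, and
    each edge that joins two different components kills one of them.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )
    pairs = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # this edge merges two components
            parent[root_i] = root_j
            pairs.append((0.0, length))
    pairs.append((0.0, math.inf))  # one component survives at every scale
    return pairs


# Two tight pairs of points that are far from each other.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
diagram = h0_persistence(points)
print(diagram)
```

The two short-lived pairs correspond to points merging inside each cluster, while the pair that dies at distance 4 records the gap between the clusters. Higher-dimensional features ("holes") need the full simplicial machinery provided by a TDA library.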

Here are some key outputs!

This is the graph generated using stage 2 of DevoGraph.

stage2-output

This is the corresponding persistence diagram. For a particular instance of the temporal graph dataset, the nodes do not change any edge features (the VietorisRipsPersistence diagram tends to infinity, so we get a unique graph!).

pd-stage2-graph
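One common way to feed a graph into a Vietoris-Rips computation is to first turn it into a matrix of shortest-path distances between nodes. The sketch below shows that preprocessing step only. It is an assumption about how such a pipeline could look, not a description of what the DevoGraph notebooks do, and the function name `shortest_path_matrix` and the example edge list are hypothetical.

```python
import numpy as np


def shortest_path_matrix(n_nodes, weighted_edges):
    """All-pairs shortest-path distances for an undirected weighted graph.

    Floyd-Warshall, vectorized over rows and columns. Unreachable pairs
    keep the distance `inf`.
    """
    dist = np.full((n_nodes, n_nodes), np.inf)
    np.fill_diagonal(dist, 0.0)
    for u, v, weight in weighted_edges:
        # Keep the lightest edge if the same pair appears more than once.
        dist[u, v] = dist[v, u] = min(dist[u, v], weight)
    for k in range(n_nodes):
        # Allow paths that pass through node k.
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist


# Hypothetical 4-node path graph: 0 - 1 - 2 - 3.
edges = [(0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.5)]
dist = shortest_path_matrix(4, edges)
print(dist)
```

A matrix like `dist` can then be passed to a persistence routine that accepts a precomputed metric. Whether that matches the configuration behind the diagram above should be checked against the notebooks.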

This is TDA applied to the raw data provided in the dataset folder. We first map the raw data into a form from which persistence features can be extracted.

3d-rawdata

2d-rawdata

pd-rawdata

Similarly, lower star filtration is used to analyze topological features. For example, it can be used to identify cells even though they have very different shapes.

This is the persistence diagram of the above image.

pd-lower-star

This is the corresponding output of the applied lower star filtration. As you can see, it correctly identifies the key features of our image.

lower-star-filter
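The lower star filtration used above can be sketched for the 0-dimensional case as follows. Pixels are added in order of increasing intensity, each local minimum starts a new component, and when two components meet the younger one dies (the "elder rule"). This is an illustrative implementation under stated assumptions, not the code from the notebooks: the function name `lower_star_h0`, the 8-connectivity, and the toy `image` are all chosen for the example.

```python
import numpy as np


def lower_star_h0(image):
    """0-dimensional sublevel-set persistence of a 2D array (8-connectivity)."""
    rows, cols = image.shape
    order = sorted(
        (float(image[r, c]), r, c) for r in range(rows) for c in range(cols)
    )
    parent = {}  # union-find forest over the pixels added so far
    birth = {}   # birth value of each pixel; only root entries are read

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    pairs = []
    for value, r, c in order:
        p = (r, c)
        parent[p] = p
        birth[p] = value
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                q = (r + dr, c + dc)
                if q == p or q not in parent:  # outside the image or not added yet
                    continue
                a, b = find(p), find(q)
                if a == b:
                    continue
                # Elder rule: the component with the larger birth value dies.
                young, old = (a, b) if birth[a] > birth[b] else (b, a)
                if birth[young] < value:  # skip zero-persistence pairs
                    pairs.append((birth[young], value))
                parent[young] = old
    pairs.append((order[0][0], float("inf")))  # the global minimum never dies
    return sorted(pairs)


# Two dark blobs (values 1 and 2) separated by a brighter pixel (value 4).
image = np.array([
    [5, 5, 5, 5, 5],
    [5, 1, 4, 2, 5],
    [5, 5, 5, 5, 5],
], dtype=float)

diagram = lower_star_h0(image)
print(diagram)
```

Each dark blob contributes one point to the diagram, and its persistence measures how much the blob stands out from its surroundings regardless of its shape. For bright cells on a dark background, the same function can be applied to the negated image.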

Future Scope

  • Working on the Mask R-CNN pipeline to give better results.
  • Annotating the dataset so that it can be used for more complex, transformer-based models.
  • Extracting more TDA embeddings over the graphs dynamically.

Conclusion and Acknowledgement

During the GSoC program, I learned about image segmentation and topological data analysis. It was like going on an exciting research adventure that spanned different fields. Overcoming the challenges was tough but incredibly rewarding. I want to express my sincere thanks to my mentors who were there to guide me every step of the way.
