## [GSoC 2020] [caMicroscope] Pathology Algorithm Development Workbench - Final Report.md

      
Google Summer of Code 2020 - Work Product Submission

Student: Akhil Rana (@akhil-rana)  

Organisation: caMicroscope

Project: Pathology Algorithm Development Workbench 

Mentors: Ryan Birmingham, Nan Li and Pradeeban Kathiravelu

Overview

Built an interactive UI that lets users create and train their own machine learning models (CNNs) for use in caMicroscope's prediction tool.
Models are trained either in the web browser or on a web server (Node.js) using TensorFlow.js.

The project was implemented across three repositories:


caMicroscope - Pull Requests - Final branch merge PR

Caracal - Pull Requests

SlideLoader - Pull Requests


See the PRs for detailed descriptions of the individual features.


The project consists of two main parts:

1. Dataset Generation

To keep datasets in one consistent format, the dataset is generated first as a spritesheet, a format that works well for image data in TensorFlow.js.

caMicroscope has a tool for labelling regions of slide images and saving them. Those labelled files can be used to create a dataset. Alternatively, a custom dataset can be used, provided it follows the required format.

Dataset generation runs on the server side in SlideLoader (Flask), using Pillow and NumPy.
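To make the spritesheet idea concrete, here is a minimal sketch of one possible layout. It assumes the convention used by the TensorFlow.js MNIST examples, where each image is flattened into a single row of the sheet and the labels are stored separately as one-hot bytes. The exact layout SlideLoader produces is described in the PRs and may differ. The function names `build_spritesheet` and `one_hot_labels` are hypothetical, and the sketch uses plain Python lists rather than Pillow and NumPy so that it stays dependency-free.

```python
def build_spritesheet(images):
    """Pack equally sized grayscale images into one spritesheet.

    Each image is a list of pixel rows (ints in 0..255). In the returned
    sheet, every image occupies exactly one row of width * height values.
    This layout is an assumption for illustration, not SlideLoader's code.
    """
    if not images:
        raise ValueError("need at least one image")
    height, width = len(images[0]), len(images[0][0])
    sheet = []
    for img in images:
        # A single sheet only works if every patch has the same resolution.
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all images must share one resolution")
        # Flatten the 2-D patch row by row into one sheet row.
        sheet.append([px for row in img for px in row])
    return sheet


def one_hot_labels(class_ids, num_classes):
    """Encode class ids as consecutive one-hot byte groups."""
    out = bytearray()
    for class_id in class_ids:
        if not 0 <= class_id < num_classes:
            raise ValueError(f"class id {class_id} out of range")
        row = [0] * num_classes
        row[class_id] = 1
        out.extend(row)
    return bytes(out)
```

With this layout, image `i` and its label can be recovered by index alone: row `i` of the sheet and bytes `i * num_classes` to `(i + 1) * num_classes` of the label buffer.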

Corresponding PR(s) can be found here.

2. Model Creation and Training

After selecting or creating a usable dataset, the user can proceed to creating and editing their own CNN model.

Before training starts, the user can change basic model settings such as the classes, the image resolution, and the input/output layer features.

Once the basic settings are in place, the whole CNN model can be personalized, for example by adding or removing layers with custom functions and function parameters. TFjs-vis is used to visualize the training process within the web browser.
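When layers are added or removed, each layer's output shape has to match what the next layer expects. The sketch below traces shapes through a small CNN description so that kind of mismatch is easy to spot. It assumes "valid" padding (no padding) and a hypothetical dict-based layer format invented for this example. It is not the workbench's actual model format, and the names `conv_output_size` and `trace_shapes` are illustrative only.

```python
def conv_output_size(size, kernel, stride=1):
    """Output length along one axis for a window with 'valid' padding."""
    if kernel > size:
        raise ValueError("window is larger than the input")
    return (size - kernel) // stride + 1


def trace_shapes(input_shape, layers):
    """Return the output shape after each layer.

    input_shape is (height, width, channels). Each layer is a dict in a
    hypothetical format, e.g. {"type": "conv2d", "kernel": 5, "filters": 8}.
    """
    shape = tuple(input_shape)
    shapes = []
    for layer in layers:
        kind = layer["type"]
        if kind == "conv2d":
            h, w, _ = shape
            k, s = layer["kernel"], layer.get("stride", 1)
            shape = (conv_output_size(h, k, s),
                     conv_output_size(w, k, s),
                     layer["filters"])
        elif kind == "maxpool2d":
            h, w, c = shape
            p = layer["pool"]
            s = layer.get("stride", p)  # stride defaults to the pool size
            shape = (conv_output_size(h, p, s),
                     conv_output_size(w, p, s),
                     c)
        elif kind == "flatten":
            n = 1
            for dim in shape:
                n *= dim
            shape = (n,)
        elif kind == "dense":
            if len(shape) != 1:
                raise ValueError("dense layer expects a flattened input")
            shape = (layer["units"],)
        else:
            raise ValueError(f"unknown layer type: {kind}")
        shapes.append(shape)
    return shapes
```

For a 28x28 single-channel input, a 5x5 convolution with 8 filters gives 24x24x8, a 2x2 max-pool halves that to 12x12x8, and flattening yields 1152 features for the final dense layer, whose unit count should equal the number of classes.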

PR(s) for UI and browser training can be found here.

Server-side model training can also be turned on. It can be significantly faster than training in the browser and supports GPU training as well. It is performed on Caracal.

There is also an Advanced Mode, which can be turned on from the options. It is aimed at more advanced users who want to customize their models in greater detail.

PR(s) for server-side training can be found here.

After training is complete, the trained model files can be saved to the local system and used directly in caMicroscope's prediction tool.

User Guide

The complete user guide for the development workbench can be found here.

Blogs

An in-depth explanation and the project architecture can be found in these blogs:

Phase 1 update
Phase 2 update

A demo video is also available here.

Future Scope (To-do)


Training Visualisation for server-side training
Memory Optimisation if possible
Quick model testing after training completes (if needed)


Thank You :)