[GSoC 2020] [caMicroscope] Pathology Algorithm Development Workbench - Final Report

Google Summer of Code 2020 - Work Product Submission

Student: Akhil Rana (@akhil-rana)
Organisation: caMicroscope
Project: Pathology Algorithm Development Workbench
Mentors: Ryan Birmingham, Nan Li and Pradeeban Kathiravelu

Overview

Built an interactive UI that lets users create and train their own machine learning models (CNNs) for use in caMicroscope's prediction tool.
Models are trained either in the web browser or on a web server (Node.js) using TensorFlow.js.
The project is implemented in three places:

See the PRs for detailed descriptions of the individual features.


The project consists of two main parts:

1. Dataset Generation

To keep datasets in one consistent standard, the dataset is first generated in a spritesheet format, which is a convenient format for image data in TensorFlow.js.
caMicroscope has a tool for labelling parts of slide images and saving them. Those labelled files can be used to create a dataset. Alternatively, a custom dataset can be used, provided it follows the required format.
Dataset generation runs on the server side (Flask) in SlideLoader, using Pillow and NumPy.
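
To make the spritesheet idea concrete, here is a minimal sketch of packing labelled tiles into one sheet. It is only an illustration under stated assumptions: the one-flattened-tile-per-row layout is borrowed from common TensorFlow.js examples and may differ from what SlideLoader actually writes, the function name `build_spritesheet` is hypothetical, and plain Python lists stand in for the NumPy arrays and Pillow images used in the real code so that the sketch has no dependencies.

```python
from typing import List, Tuple

Tile = List[List[int]]  # one grayscale tile: rows of pixel values in 0..255


def build_spritesheet(tiles: List[Tile], labels: List[int],
                      num_classes: int) -> Tuple[List[List[int]], List[List[int]]]:
    """Pack equally sized tiles into a sheet with one flattened tile per row.

    Returns (sheet, one_hot_labels). The layout is an assumption made for
    this sketch, not necessarily the format the workbench produces.
    """
    if len(tiles) != len(labels):
        raise ValueError("need exactly one label per tile")
    if not tiles:
        raise ValueError("need at least one tile")

    height, width = len(tiles[0]), len(tiles[0][0])
    sheet: List[List[int]] = []
    one_hot: List[List[int]] = []
    for tile, label in zip(tiles, labels):
        # Every tile must have the same resolution, otherwise rows misalign.
        if len(tile) != height or any(len(row) != width for row in tile):
            raise ValueError("all tiles must share one resolution")
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} outside 0..{num_classes - 1}")
        # Flatten the tile row by row so it occupies a single sheet row.
        sheet.append([pixel for row in tile for pixel in row])
        # Store the class as a one-hot vector alongside the pixels.
        one_hot.append([1 if i == label else 0 for i in range(num_classes)])
    return sheet, one_hot


# Two tiny 2x2 tiles belonging to classes 0 and 1.
tiles = [
    [[0, 10], [20, 30]],
    [[255, 245], [235, 225]],
]
sheet, one_hot = build_spritesheet(tiles, labels=[0, 1], num_classes=2)
print(sheet)    # [[0, 10, 20, 30], [255, 245, 235, 225]]
print(one_hot)  # [[1, 0], [0, 1]]
```

Keeping every tile at the same resolution is what lets a loader slice the sheet back into individual images using nothing more than the row index.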

Corresponding PR(s) can be found here.

2. Model Creation and Training

After selecting or creating a usable dataset, the user can proceed to creating and editing their own CNN model.
The user can change the basic settings for the model (such as the classes, the image resolution, and the input/output layer features) before starting the training.
Once the basic settings are in place, the whole CNN model can be customized, for example by adding or removing layers and by choosing the function and function parameters for each layer. TFjs-vis is used to visualize the training process within the web browser.
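
When layers are added or removed, it helps to see how each one changes the tensor shape. The sketch below traces output shapes through a small editable layer list. It is an illustration only: the dictionary keys and the `trace_shapes` helper are hypothetical and not taken from the workbench, it assumes 'valid' (no) padding, and it is written in Python rather than TensorFlow.js purely to keep it dependency-free.

```python
from typing import Any, Dict, List, Tuple

Layer = Dict[str, Any]  # hypothetical layer description, e.g. {"type": "dense", "units": 3}


def valid_out(size: int, window: int, stride: int) -> int:
    """Output length of a sliding window with 'valid' (no) padding."""
    if window > size:
        raise ValueError(f"window {window} larger than input {size}")
    return (size - window) // stride + 1


def trace_shapes(input_shape: Tuple[int, int, int],
                 layers: List[Layer]) -> List[Tuple[int, ...]]:
    """Return the output shape after each layer for a (height, width, channels) input."""
    shape: Tuple[int, ...] = tuple(input_shape)
    shapes: List[Tuple[int, ...]] = []
    for layer in layers:
        kind = layer["type"]
        if kind == "conv2d":
            h, w, _ = shape
            k, s = layer["kernel_size"], layer.get("strides", 1)
            # The number of filters becomes the new channel count.
            shape = (valid_out(h, k, s), valid_out(w, k, s), layer["filters"])
        elif kind == "max_pooling2d":
            h, w, c = shape
            p = layer["pool_size"]
            s = layer.get("strides", p)  # stride defaults to the pool size
            shape = (valid_out(h, p, s), valid_out(w, p, s), c)
        elif kind == "flatten":
            n = 1
            for dim in shape:
                n *= dim
            shape = (n,)
        elif kind == "dense":
            # This sketch only accepts dense layers after a flatten layer.
            if len(shape) != 1:
                raise ValueError("dense layer expects a flattened input")
            shape = (layer["units"],)
        else:
            raise ValueError(f"unknown layer type: {kind}")
        shapes.append(shape)
    return shapes


# A small CNN for 28x28 single-channel tiles and 3 classes.
model = [
    {"type": "conv2d", "kernel_size": 5, "filters": 8},
    {"type": "max_pooling2d", "pool_size": 2},
    {"type": "conv2d", "kernel_size": 5, "filters": 16},
    {"type": "max_pooling2d", "pool_size": 2},
    {"type": "flatten"},
    {"type": "dense", "units": 3},
]
shapes = trace_shapes((28, 28, 1), model)
for layer, shape in zip(model, shapes):
    print(layer["type"], shape)
```

For this example the final shape is `(3,)`, matching the three classes. The same check catches the case where removing a layer leaves a convolution window larger than its input.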

PR(s) for UI and browser training can be found here.

Server-side model training can be turned on; it can be significantly faster and also supports GPU training. It is performed on Caracal.
There is also an Advanced Mode, which can be turned on from the options. It is aimed at more advanced users who want to customize their models in greater detail.

PR(s) for server-side training can be found here.

After training is complete, the trained model files can be saved to the local system and used directly in caMicroscope's prediction tool.

User Guide

The complete user guide for the development workbench can be found here.

Blogs

An in-depth explanation and the project architecture can be found in these blogs:

A demo video is also available here.

Future Scope (To-do)

  • Training Visualisation for server-side training
  • Memory Optimisation if possible
  • Quick model testing after training completes (if needed)

Thank You :)
