@lb-97
Last active December 11, 2023 12:34
gsoc fury

Google Summer of Code Final Work Product

Proposed Objectives

  • Human Brain MRI preprocessing function
  • MRI reconstruction using VQVAE
  • Implement & train Diffusion Model on VQVAE latents
  • Implement conditional Diffusion Model
  • Generate conditional synthetic MRI
  • Evaluate synthetic generations in DIPY

Modified Objectives (Additional)

  • 2D VQVAE on MNIST data
  • 2D unconditional DDPM based LDM on MNIST data
  • 3D VQVAE based on MONAI's PyTorch implementation
  • 3D unconditional LDM based on MONAI's PyTorch implementation

Objectives Completed

  • Conducted a literature review of the limited existing diffusion-modelling work in medical imaging
    • Current literature [1, 2] utilizes VQGAN & DDPM models on the MRNet, ADNI, Breast Cancer MRI & lung CT datasets
    • MONAI is the latest open-source platform with repositories on deep-learning applications for BRATS & other medical-imaging datasets, implemented in PyTorch
    • Our project serves as an easy, understandable & accessible implementation of anatomical MRI generation using unconditional diffusion modelling in TensorFlow
  • Implemented a 2D VQVAE & a 2D DDPM-based Latent Diffusion Model (LDM) on the MNIST dataset & achieved high-quality generations
  • Worked on the CC359 & NFBS datasets, both consisting of T1-weighted human-brain MRI with 359 & 125 samples respectively. Preprocessed each input volume with the four steps below:
    • Skull-strip the volume, if required, using the masks provided with the dataset
    • Apply the transform_img helper function, which performs voxel resizing & an affine transformation to obtain a final (128, 128, 128, 1) volume shape & a (1, 1, 1) voxel size
    • Neutralize background voxels to 0 using the respective masks
    • MinMax-normalize intensities to the (0, 1) range
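The four steps above can be sketched in NumPy as follows. This is a simplified stand-in, not the project's code: the function and parameter names are illustrative, and nearest-neighbour indexing stands in for the voxel resizing & affine transformation that the real transform_img helper performs.

```python
import numpy as np

def preprocess_volume(vol, mask, target_shape=(128, 128, 128)):
    """Illustrative sketch of the MRI preprocessing steps (names are ours)."""
    # 1. Skull-strip / neutralize background voxels to 0 using the mask
    vol = vol * (mask > 0)
    # 2. Resize to the target shape (nearest-neighbour stand-in for the
    #    voxel resizing & affine transformation in transform_img)
    idx = [np.round(np.linspace(0, s - 1, t)).astype(int)
           for s, t in zip(vol.shape, target_shape)]
    vol = vol[np.ix_(*idx)]
    # 3. MinMax-normalize intensities to (0, 1)
    vol = (vol - vol.min()) / (vol.max() - vol.min() + 1e-8)
    # 4. Add a trailing channel axis -> (..., 1)
    return vol[..., np.newaxis]
```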
  • Implemented 3D versions of the above models from scratch
    • VQVAE3D

      • The encoder & decoder of the 3D VQVAE are symmetrical, with 3 convolutional & 3 transpose-convolutional layers respectively, each followed by non-linear ReLU units
      • The vector quantizer trains a learnable embedding matrix and maps each encoder output to its closest embedding under an L2 distance
      • The VQVAE gives superior results over a VAE, as shown in the original VQVAE paper, because the quantizer addresses the 'posterior collapse' seen in traditional VAEs
      • Trained the model for approximately 100 epochs using the Adam optimizer (lr=1e-4), minimizing the reconstruction & quantizer losses together
      • Test-dataset reconstructions:

      VQVAE reconstructions on NFBS test dataset
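The nearest-embedding lookup at the core of the vector quantizer can be sketched as below. This is a NumPy illustration under our own naming, assuming flattened latents; the trained layer is a TensorFlow layer whose codebook is the learnable embedding matrix.

```python
import numpy as np

def quantize(latents, codebook):
    """Map each latent vector to its closest codebook embedding (L2)."""
    # latents: (N, D) flattened encoder outputs; codebook: (K, D) embeddings
    d = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    codes = d.argmin(axis=1)        # index of the closest embedding
    return codebook[codes], codes   # quantized latents, discrete codes
```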

    • 3D LDM

      • Built an unconditional Latent Diffusion Model (LDM) combining the DDPM & Stable Diffusion implementations
      • The U-Net of the reverse process consists of 3 downsampling & 3 upsampling layers, each consisting of 2 residual layers and an optional attention layer
      • Trained the model using a linear (forward) variance schedule & various diffusion-step counts: 200 & 300
      • Adopted algorithm 4 for sampling synthetic generations at 200 & 300 diffusion steps

      ![3D LDM synthetic generations](https://github.com/dipy/dipy/blob/master/doc/_static/dm3d-reconst-D200-D300.png)
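The linear forward-variance schedule and the closed-form forward process it induces can be sketched as below. The endpoint values are the common DDPM defaults, used here as an illustrative assumption rather than the project's exact settings.

```python
import numpy as np

def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Linear forward-variance schedule over T diffusion steps."""
    return np.linspace(beta_start, beta_end, T)

def q_sample(x0, t, betas, noise):
    """Closed-form sample from the forward process q(x_t | x_0):
    x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * noise."""
    a_bar = np.cumprod(1.0 - betas)[t]  # cumulative product of (1 - beta)
    return np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise
```

During training, a step t is drawn at random, x_t is produced this way, and the U-Net is trained to predict the added noise.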

  • Adopted MONAI's implementation
    • Replaced the VQVAE encoder & decoder with a slightly more complex architecture that alternates residual connections with convolutions
    • Carried out experiments with the same training parameters but varying batch sizes, & also ran one experiment using both datasets together

      VQVAE-MONAI training plots

    • The training curves clearly show that the larger the batch size & dataset, the more stable the training metric at a learning rate of 1e-4
    • Plotted reconstructions for the top two experiments: (batch size=12, both datasets) & (batch size=5, NFBS dataset)

      VQVAE-MONAI reconstructions on best performing models

    • Trained the existing diffusion model on these new latents to check their efficacy for synthetic image generation
    • The training curves converged quickly, but the sampled generations are still pure noise

      3D LDM training curve for various batch sizes & diffusion steps

    • To summarize, we've stretched the capability of our VQVAE model despite its modest complexity (num_res_channels=(32, 64)), achieving improved reconstructions with every experiment. Our latest experiments use a weighted loss function that assigns lower weight to background voxels, owing to their higher count. This captures not just the outer structure of the human brain but also volumetric detail resembling microstructural information inside it, a major improvement over all previous trainings.
    • Future work should address two things: debugging the diffusion model & scaling the VQVAE model.
      • As a first priority, analyze the reason for the pure-noise output of the 3D diffusion generations; this would rule out implementation errors in the sampling process.
      • As a second step, scale up both the VQVAE & the diffusion model, e.g. by increasing intermediate channel dimensions from 64 to 128 or 256. This may help us achieve state of the art on the NFBS & CC359 datasets.
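The background-downweighted reconstruction loss mentioned above might look like the following sketch. The bg_weight value is illustrative, not the project's actual setting; brain voxels (mask > 0) get full weight, the far more numerous background voxels get a small one.

```python
import numpy as np

def weighted_recon_loss(x, x_hat, mask, bg_weight=0.1):
    """MSE with background voxels downweighted by bg_weight (illustrative)."""
    w = np.where(mask > 0, 1.0, bg_weight)          # per-voxel weights
    return float(np.sum(w * (x - x_hat) ** 2) / np.sum(w))
```

The effect is that an error of the same magnitude costs far more inside the brain than in the background, pushing the model to spend capacity on anatomical detail.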

Objectives in Progress

  • The unconditional LDM hasn't produced meaningful generations yet. Increasing model complexity with more intermediate channels & raising the diffusion steps to 1000 are directions for improvement
  • Implemented a cross-attention module as part of the U-Net, to accommodate conditional training on attributes such as tumor type, tumor location & brain age
  • Implementing evaluation metrics such as FID (Fréchet Inception Distance) & IS (Inception Score) will help estimate the generative capability of our models
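As a starting point for that metric work, the FID between two Gaussian feature distributions can be sketched for the simplified diagonal-covariance case. The full metric uses a matrix square root of Sigma1 @ Sigma2 over full covariance matrices; with diagonal covariances that square root reduces to an elementwise one.

```python
import numpy as np

def fid_diagonal(mu1, var1, mu2, var2):
    """FID for Gaussians with diagonal covariances (simplified sketch):
    ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 * (S1 S2)^(1/2))."""
    mean_term = np.sum((mu1 - mu2) ** 2)
    cov_term = np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2))
    return float(mean_term + cov_term)
```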

Timeline

| Date | Description | Blog Post Link |
| --- | --- | --- |
| Week 0 (19-05-2023) | Journey of GSOC application & acceptance | DIPY |
| Week 1 (29-05-2023) | Community bonding and Project kickstart | DIPY |
| Week 2 (05-06-2023) | Deep Dive into VQVAE | DIPY |
| Week 3 (12-06-2023) | VQVAE results and study on Diffusion models | DIPY |
| Week 4 (19-06-2023) | Carbonate HPC Account Setup, Experiment, Debug and Repeat | DIPY |
| Week 5 (26-06-2023) | Diffusion research continues | DIPY |
| Week 6 & 7 (10-07-2023) | Diffusion Model results on pre-trained VQVAE latents of NFBS MRI Dataset | DIPY |
| Week 8 & 9 (24-07-2023) | VQVAE MONAI models & checkerboard artifacts | DIPY |
| Week 10 & 11 (07-08-2023) | HPC issues, GPU availability, Tensorflow errors | DIPY |
| Week 12 & 13 (21-08-2023) | Finalized experiments using both datasets | DIPY |
@pjsjongsung

Great! Just a few comments.

  1. 'Current literature ~' is duplicated

  2. 'Skull stripping using STAPLE ~' sounds like you were the one doing STAPLE. We should say the dataset provided it. Also, was CC359 STAPLE? I thought NFBS was STAPLE. Can you double check?

  3. If you say 'Carbonate' it does not ring a bell for anyone outside of IU. It should say something like HPC(High performance computing systems) or just GPU

  4. It says on the timeline only NFBS dataset. When did you work on CC359?

  5. What happened to the link on week8&9 and 12&13 on the timeline?

  6. Some say VQ-VAE and some VQVAE

  7. Proposed objective should include implementing it in DIPY.

Overall, I think it's super, but be careful using terms that we are used to but others might not be (e.g. what is the transform_img function?). Some minor errors need cleaning up (e.g. the date in the timeline is the same for weeks 6-13, why is relu gray-boxed, typos). Try checking for typos in tools such as Word, as shillipi suggested.

Also I think you have some place holders (e.g. --epochs). Don't forget to fill them out!
