Explainable Deep Learning

Overview of Explainable Deep Learning

Three major research directions in explainable deep learning: understanding, debugging, and refinement/steering

Model understanding

aims to explain the rationale behind model predictions and the inner workings of deep learning models, attempting to make these complex models at least partly understandable

  • Perturbation experiments (CVPR2014): Large Convolutional Network models have recently demonstrated impressive classification performance on the ImageNet benchmark. However there is no clear understanding of why they perform so well, or how they might be improved. In this paper we address both issues. We introduce a novel visualization technique that gives insight into the function of intermediate feature layers and the operation of the classifier. We also perform an ablation study to discover the performance contribution from different model layers. This enables us to find model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark. We show our ImageNet model generalizes well to other datasets: when the softmax classifier is retrained, it convincingly beats the current state-of-the-art results on Caltech-101 and Caltech-256 datasets.

  • Saliency map-based methods (ICLR2014): This paper addresses the visualisation of image classification models, learnt using deep Convolutional Networks (ConvNets). We consider two visualisation techniques, based on computing the gradient of the class score with respect to the input image. The first one generates an image, which maximises the class score [Erhan et al., 2009], thus visualising the notion of the class, captured by a ConvNet. The second technique computes a class saliency map, specific to a given image and class. We show that such maps can be employed for weakly supervised object segmentation using classification ConvNets. Finally, we establish the connection between the gradient-based ConvNet visualisation methods and deconvolutional networks [Zeiler et al., 2013]. (A minimal gradient-saliency sketch follows this list.)

  • LIME (ACM SIGKDD2016): Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). We show the utility of explanations via novel experiments, both simulated and with human subjects, on various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and identifying why a classifier should not be trusted. (A minimal local-surrogate sketch follows this list.)

  • Influence functions(ICML2017): How can we explain the predictions of a black-box model? In this paper, we use influence functions -- a classic technique from robust statistics -- to trace a model's prediction through the learning algorithm and back to its training data, thereby identifying training points most responsible for a given prediction. To scale up influence functions to modern machine learning settings, we develop a simple, efficient implementation that requires only oracle access to gradients and Hessian-vector products. We show that even on non-convex and non-differentiable models where the theory breaks down, approximations to influence functions can still provide valuable information. On linear models and convolutional neural networks, we demonstrate that influence functions are useful for multiple purposes: understanding model behavior, debugging models, detecting dataset errors, and even creating visually-indistinguishable training-set attacks.
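The gradient-based saliency idea above can be sketched in a few lines. The following is a minimal, illustrative PyTorch example, not the paper's exact pipeline: it backpropagates the top class score to the input pixels and keeps the per-pixel maximum absolute gradient. It assumes a recent torchvision (the `weights` keyword) and uses untrained weights plus a random input so it runs offline.

```python
# Hedged sketch of a gradient-based saliency map in PyTorch. Untrained
# ResNet-18 weights are used so the snippet runs offline; in practice you
# would load pretrained ImageNet weights and a real preprocessed image.
import torch
from torchvision import models

model = models.resnet18(weights=None).eval()   # assumes torchvision >= 0.13

x = torch.rand(1, 3, 224, 224)                 # stand-in for a preprocessed image
x.requires_grad_(True)

scores = model(x)                              # class scores, shape (1, 1000)
top = scores.argmax().item()                   # predicted class index
scores[0, top].backward()                      # d(top class score) / d(input pixels)

# Saliency map: maximum absolute gradient across the colour channels.
saliency = x.grad.abs().max(dim=1).values.squeeze(0)   # shape (224, 224)
print(saliency.shape)
```

LIME's core recipe can likewise be sketched from scratch. This is not the `lime` package; `black_box`, `x0`, and the kernel width below are hypothetical placeholders. The idea: sample perturbations around one instance, weight them by proximity, and fit a weighted linear surrogate whose coefficients serve as the local explanation.

```python
# From-scratch sketch of the local-surrogate idea behind LIME (not the `lime`
# package). The black-box model, instance, and kernel width are hypothetical.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

def black_box(X):
    """Stand-in for any classifier's probability of class 1."""
    return 1.0 / (1.0 + np.exp(-(X[:, 0] - 2.0 * X[:, 1])))

x0 = np.array([0.5, -1.0])                          # instance to explain
Z = x0 + 0.3 * rng.standard_normal((500, 2))        # perturbations around x0
weights = np.exp(-np.sum((Z - x0) ** 2, axis=1) / 0.25)   # proximity kernel

surrogate = Ridge(alpha=1.0)                        # interpretable local model
surrogate.fit(Z, black_box(Z), sample_weight=weights)
print("local feature effects:", surrogate.coef_)
```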

Model debugging through visualisation toolkits

the process of identifying and addressing defects in a deep learning model that fails to converge or does not achieve acceptable performance

  • TensorBoard visualizes the structure of a given computational graph that a user creates and provides basic line graphs and histograms of user-selected statistics (a minimal logging sketch follows this list).

  • Visdom is a web-based interactive visualization toolkit that is easy to use with deep learning libraries for PyTorch.

  • DL4J UI allows users to monitor the training process with several basic visualization components.

  • DIGITS simplifies common deep learning tasks such as managing data, designing and training neural networks on multi-GPU systems, monitoring performance in real time with advanced visualizations, and selecting the best performing model from the results browser for deployment.
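As a concrete reference point for the TensorBoard entry above, here is a minimal, hedged logging sketch using the SummaryWriter bundled with PyTorch (it assumes the `tensorboard` package is installed; the loop and loss values are placeholders). Run `tensorboard --logdir runs` to inspect the curves and histograms.

```python
# Minimal TensorBoard logging sketch with PyTorch's bundled SummaryWriter.
# The training loop and loss values are stand-ins; only the logging calls
# matter here.
import torch
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/debug-example")
model = torch.nn.Linear(10, 1)

for step in range(100):
    loss = torch.rand(1).item()                                 # placeholder loss
    writer.add_scalar("train/loss", loss, step)                 # line chart
    writer.add_histogram("weights/linear", model.weight, step)  # histogram

writer.close()
```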

Model refinement using visual analytics

a method to interactively incorporate expert knowledge into the improvement and refinement of a deep learning model through a set of rich user interactions, often in combination with semi-supervised or active learning (a minimal active-learning sketch follows the tool list below)

  • CNNVis is a visual analytics system that helps experts understand and diagnose deep convolutional neural networks.

  • ActiVis provides a visual exploratory analysis of a given deep learning model via multiple coordinated views, such as a matrix view and an embedding view. [slideshare]

  • LSTMVis allows a user to select a hypothesis input range to focus on local state changes, to match these state changes to similar patterns in a large data set, and to align these results with structural annotations from their domain. [github]

  • DGMTracker is a visual analytics tool that helps experts understand and diagnose the training processes of deep generative models.

  • GANViz aims to help experts understand the adversarial process of GANs in depth. Specifically, GANViz evaluates the model performance of two subnetworks of GANs, provides evidence and interpretations of the models’ performance, and empowers comparative analysis with the evidence.

  • Ensemble Viz
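To make the active-learning ingredient mentioned above concrete, here is a hedged, self-contained sketch of uncertainty sampling: the model repeatedly asks a (simulated) expert to label the instances it is least sure about and is then refit on the enlarged labelled set. The data and the oracle are synthetic placeholders, not part of any of the tools listed here.

```python
# Hedged sketch of uncertainty-sampling active learning on synthetic data.
# The "expert" is simulated by the hidden ground truth y_true.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
y_true = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)     # hidden ground truth

labelled = list(rng.choice(len(X), size=20, replace=False))
labelled += [int(np.argmax(y_true)), int(np.argmin(y_true))]  # ensure both classes
pool = [i for i in range(len(X)) if i not in labelled]

model = LogisticRegression()
for rnd in range(5):
    model.fit(X[labelled], y_true[labelled])
    print(f"round {rnd}: labelled={len(labelled)} accuracy={model.score(X, y_true):.3f}")
    proba = model.predict_proba(X[pool])[:, 1]
    uncertainty = np.abs(proba - 0.5)                  # near 0.5 = model is unsure
    ask = [pool[i] for i in np.argsort(uncertainty)[:10]]   # query the "expert"
    labelled += ask                                    # expert supplies these labels
    pool = [i for i in pool if i not in ask]
```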


Challenges

  • Sponsors: FICO, Google, UC Berkeley, Univ. of Oxford, MIT, and UC Irvine
  • Date: Apr 2018 ~
  • Dataset: Home Equity Line of Credit (HELOC) Dataset
    • Goal: use the information about the applicant in their credit report to predict whether they will repay their HELOC account within 2 years
    • A HELOC is a line of credit typically offered by a bank as a percentage of home equity (the difference between the current market value of a home and its purchase price). The customers in this dataset have requested a credit line in the range of $5,000 - $150,000. The resulting prediction is used to decide whether the homeowner qualifies for a line of credit and, if so, how much credit should be extended. (A minimal modelling sketch follows below.)
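As a hedged sketch of how the task might be framed, the example below treats it as binary classification with a transparent linear baseline, so each credit-report feature gets one inspectable coefficient. The file name and the RiskPerformance target column follow the commonly distributed CSV version of the dataset and are assumptions; adjust them to your copy.

```python
# Hedged sketch of the HELOC task as binary classification with a transparent
# linear baseline. File name and the "RiskPerformance" target column are
# assumptions based on the commonly distributed CSV; adjust to your copy.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("heloc_dataset_v1.csv")                  # assumed file name
y = (df["RiskPerformance"] == "Good").astype(int)         # 1 = expected to repay
X = df.drop(columns=["RiskPerformance"])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))

# One signed coefficient per credit-report feature keeps the rule inspectable.
print(pd.Series(clf[-1].coef_[0], index=X.columns).sort_values())
```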