@herbps10
Last active May 6, 2021 14:59
Annotated Slides

This is a webpage that shows a set of slides from a PDF along with annotations provided in a separate Markdown file. You can see this page live at: herbsusmann.com/paa2021.

To use this code for your own slides, you need to update the following code in index.html:

First, update the paths to the PDF and Markdown files:

var pdfUrl = './paa_2021_tmmps.pdf';
var textUrl = "./paa_2021.md";

Second, update the page title:

<title>PAA 2021: Temporal Models for Multiple Populations</title>

Third, update the page heading and PDF download link:

<div id='container' role="main" aria-live="polite">
	<h2><abbr title="Population Association of America" aria-label="Population Association of America" style="speak:spell-out">PAA</abbr> 2021: Temporal Models for Multiple Populations</h2>
	<a href="./paa_2021_tmmps.pdf">Download Slides as PDF</a>

</div>

For the Markdown file, each slide annotation should be demarcated with a header of the form # Slide N: alt text. For example:

# Slide 1: Introduction
Annotation for Slide 1

# Slide 2: Background
Annotation for Slide 2

The resulting HTML will use "Introduction" and "Background" as the alternative text for slides 1 and 2, respectively.
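
To make the format concrete, here is a minimal sketch of how a file in this format breaks down into slide numbers, alternative text, and annotation bodies. The `parseAnnotations` helper is hypothetical and is not part of index.html, which uses a simpler split on the header text; it is only meant to show what each header contributes.

```javascript
// Hypothetical helper (not part of index.html): split an annotation file
// into one record per "# Slide N: alt text" header.
function parseAnnotations(markdown) {
  var slides = [];
  // Everything before the first header is discarded by slice(1).
  var chunks = markdown.split(/^# Slide /m).slice(1);
  chunks.forEach(function (chunk) {
    var newline = chunk.indexOf("\n");
    var header = (newline === -1 ? chunk : chunk.slice(0, newline)).trim();
    var body = newline === -1 ? "" : chunk.slice(newline + 1).trim();
    // The header line has the form "N: alt text".
    var match = header.match(/^(\d+): (.+)$/);
    if (match) {
      slides.push({ number: Number(match[1]), alt: match[2], body: body });
    }
  });
  return slides;
}

var example = "# Slide 1: Introduction\nAnnotation for Slide 1\n\n" +
  "# Slide 2: Background\nAnnotation for Slide 2\n";
var parsed = parseAnnotations(example);
console.log(parsed[0].alt);  // alternative text for slide 1
console.log(parsed[1].body); // annotation text for slide 2
```

Headers that do not match the `N: alt text` pattern are skipped in this sketch, so a typo in a header shows up as a missing slide rather than an error.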

<!doctype html>
<html class="no-js" lang="en">
<head>
<meta charset="utf-8">
<title>PAA 2021: Temporal Models for Multiple Populations</title>
<meta name="description" content="">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta property="og:title" content="">
<meta property="og:type" content="">
<meta property="og:url" content="">
<meta property="og:image" content="">
<style type='text/css'>
body {
font-family: sans-serif;
font-style: normal;
line-height: 1.4;
}
#container {
width: 725px;
margin: auto;
padding-top: 25px;
}
canvas {
border: 1px solid rgba(0, 0, 0, 0.1);
margin-top: 20px;
--tw-shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.1), 0 2px 4px -1px rgba(0, 0, 0, 0.06);
box-shadow: var(--tw-ring-offset-shadow, 0 0 #0000), var(--tw-ring-shadow, 0 0 #0000), var(--tw-shadow);
}
abbr {
text-decoration: none;
}
</style>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/katex@0.13.3/dist/katex.min.css" integrity="sha384-ThssJ7YtjywV52Gj4JE/1SQEDoMEckXyhkFVwaf4nDSm5OBlXeedVYjuuUd0Yua+" crossorigin="anonymous">
<script src="https://cdn.jsdelivr.net/npm/katex@0.13.3/dist/katex.min.js" integrity="sha384-Bi8OWqMXO1ta+a4EPkZv7bYGIes7C3krGSZoTGNTAnAn5eYQc7IIXrJ/7ck1drAi" crossorigin="anonymous"></script>
<script src="https://cdn.jsdelivr.net/npm/katex@0.13.3/dist/contrib/auto-render.min.js" integrity="sha384-vZTG03m+2yp6N6BNi5iM4rW4oIwk5DfcNdFfxkk9ZWpDriOkXX8voJBFrAO7MpVl" crossorigin="anonymous"></script>
</head>
<body>
<div id='container' role="main" aria-live="polite">
<h2><abbr title="Population Association of America" aria-label="Population Association of America" style="speak:spell-out">PAA</abbr> 2021: Temporal Models for Multiple Populations</h2>
<a href="./paa_2021_tmmps.pdf">Download Slides as PDF</a>
</div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/pdf.js/2.8.335/pdf.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
<script>
// If absolute URL from the remote server is provided, configure the CORS
// header on that server.

// Paths to the slides PDF and the Markdown file with annotations
var pdfUrl = './paa_2021_tmmps.pdf';
var textUrl = "./paa_2021.md";

// Loaded via <script> tag, create shortcut to access PDF.js exports.
var pdfjsLib = window['pdfjs-dist/build/pdf'];

// The workerSrc property shall be specified.
pdfjsLib.GlobalWorkerOptions.workerSrc = 'https://cdnjs.cloudflare.com/ajax/libs/pdf.js/2.8.335/pdf.worker.min.js';

var container = document.getElementById("container");

// Asynchronous download of PDF
var loadingTask = pdfjsLib.getDocument(pdfUrl);
loadingTask.promise.then(function(pdf) {
  var numPages = pdf.numPages;

  // Create a canvas and an annotation container for every slide, in page order
  for(var pageNumber = 1; pageNumber <= numPages; pageNumber++) {
    var canvas = document.createElement("canvas");
    canvas.setAttribute("id", "slide-" + pageNumber);
    canvas.setAttribute("height", 544);
    canvas.setAttribute("width", 725);
    canvas.setAttribute("role", "img");

    var text = document.createElement("div");
    text.setAttribute("id", "text-" + pageNumber);

    container.append(canvas);
    container.append(text);
  }

  // Download the annotations and attach them to the matching slides
  fetch(textUrl)
    .then(function(data) { return data.text(); })
    .then(function(res) {
      // slides[0] is any text before the first header; slides[N] holds slide N
      var slides = res.split(/# Slide/);

      for(var pageNumber = 1; pageNumber <= numPages; pageNumber++) {
        if(slides[pageNumber] != undefined) {
          var source = slides[pageNumber].trim();

          // The alt text is whatever follows "N: " on the header line.
          // Fall back to a generic label if the header line cannot be parsed.
          var match = source.match(/: (.+?)\n/);
          var alt = match ? match[1] : "Slide " + pageNumber;

          var slide = document.getElementById("slide-" + pageNumber);
          slide.setAttribute("aria-label", alt);
          slide.innerHTML = alt;

          // Drop the "N: alt text" header line, then render the Markdown.
          // marked.parse() is used rather than calling marked() directly: the
          // script tag above loads an unpinned version, and recent releases
          // of marked expose the parser as marked.parse().
          var html = marked.parse(source.replaceAll('’', "'").replace(/[0-9]+?: (.+?)\n/, ""));
          var textElement = document.getElementById("text-" + pageNumber);
          textElement.innerHTML = html;

          // Typeset inline math written between $ delimiters
          renderMathInElement(textElement, { delimiters: [ { left: "$", right: "$", display: false }] });
        }
      }
    });

  // Render each PDF page into its canvas
  for(var pageNumber = 1; pageNumber <= numPages; pageNumber++) {
    pdf.getPage(pageNumber).then(function(page) {
      var scale = 2;
      var viewport = page.getViewport({ scale: scale });

      // Prepare canvas using PDF page dimensions
      var canvas = document.getElementById('slide-' + page.pageNumber);
      var context = canvas.getContext('2d');
      canvas.height = viewport.height;
      canvas.width = viewport.width;

      // Render PDF page into canvas context
      var renderContext = {
        canvasContext: context,
        viewport: viewport
      };
      page.render(renderContext);
    });
  }
}, function (reason) {
  // PDF loading error
  console.error(reason);
});
</script>
</body>
</html>

Slide 1: Title

Presentation given at the Population Association of America (PAA) Annual Meeting 2021. Preprint available on arXiv: https://arxiv.org/abs/2102.10020. Hello, my name is Herb Susmann from the University of Massachusetts Amherst, and I’ll be presenting joint work with Monica Alexander and Leontine Alkema. I’m going to describe our work developing a model class that unifies many existing models of demographic and health indicators in a common framework and makes it easier to document models and compare across them.

Slide 2: Resources

An annotated version of these slides is available online, and a preprint is available on arXiv.

Slide 3: Background

There is growing interest in modeling demographic and health indicators, like the under-five mortality rate or maternal mortality rate, in order to track progress towards meeting international goals.

This is a tricky problem because data are not always available in every country and data are of varying quality. Statistical models are needed to combine all the existing data into estimates and projections with uncertainty.

Many different models have been created and published for various indicators, and sometimes multiple models exist for the same indicator. But comparing across models is difficult because they each use different notation and conventions.

To help document models and the assumptions they make, as well as facilitate comparing across models, we propose an overarching model class that encompasses many existing approaches under one framework and notation, which we are calling Temporal Models for Multiple Populations (TMMPs).

Slide 4: Under-Five Mortality Rate Models

As a case study, we’re going to look at two existing models of the under-five mortality rate (U5MR). On the left we have estimates of U5MR from a model created by the Institute for Health Metrics and Evaluation (Dicker 2018), and on the right are estimates from the UN-IGME model (Alkema and New 2014), both showing estimates of the U5MR in Senegal from 1950 to 2019. The models give differing estimates, so a natural question is how the models differ. Comparing them is difficult, though, because the published descriptions of the models use different notations.

Now I’m going to describe our model class, before coming back to these two models later to show how they can be written using our framework.

Slide 5: Temporal Models for Multiple Populations (TMMPs)

We call our model class “Temporal Models for Multiple Populations.” It provides a set of building blocks that can be used to construct models of demographic and health indicators.

Slide 6: Setup

Now I’m going to describe the setup for the framework, which is organized around a distinction between the observed data and the true, unobserved values of the indicator over time.

First, we define the true values of the indicator in each country and time point, which we call $\eta_{c,t}$.

The process model describes how we think these true values evolve over time, perhaps drawing on values of covariates or on temporal trends we expect to see in the indicator.

Then we have some observed data $y_i$, where each data point has some properties associated with it, like a country $c[i]$, time point $t[i]$, or data source $s[i]$.

The data model describes the relationship between the observed data and the truth; this is where we can model things like sampling errors and systematic biases.
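
As an illustration only, one simple data model of this kind (a hypothetical example, not one of the published models) adds a source-specific bias and a sampling error to the truth:

$y_i = \eta_{c[i],t[i]} + b_{s[i]} + e_i$

Here $b_{s[i]}$ is an assumed bias term for data source $s[i]$ and $e_i$ is an assumed sampling error for observation $i$. Real data models are usually richer than this, but the structure is the same: the observation is tied to the true value for its own country and time point.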

Slide 7: Framework Structure

This diagram summarizes the high-level structure of the framework: we have the latent, true values which evolve over time according to the process model, and then the data model describes how the noisy observed data are generated from the process model.

In our work so far we’ve focused primarily on understanding the structure of the process model, which we will go into next.

Slide 8: Process Model Structure

In our framework, we break the process model into multiple building blocks.

On the left hand side, we have the true value of the indicator $\eta_{c,t}$, possibly transformed.

On the right hand side, we have four components:

  • The covariate component is a regression function that allows introduction of covariates.
  • The systematic component allows us to model systematic trends that we expect the indicator to follow.
  • The offset can be used to bring in additional outside information, for example from a separate modeling step.
  • Finally the smoothing component is used to allow data-driven deviations from the expected trend given by the other components.
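
Written out as a sketch, the four components add up as follows. The symbols $g_1$, $g_2$, $g_3$, $X$, $\beta$, $\alpha$, and $a$ are notation assumed here for illustration and may not match the paper exactly:

$g_1(\eta_{c,t}) = g_2(X_{c,t}, \beta_c) + g_3(t, \alpha_c) + a_{c,t} + \epsilon_{c,t}$

Here $g_1$ is the transformation of the true value, $g_2$ is the covariate component with covariates $X$ and coefficients $\beta$, $g_3$ is the systematic component with parameters $\alpha$, $a$ is the offset, and $\epsilon$ is the smoothing component. A model that does not use a component simply sets that term to zero.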

Let’s go through each one of these components one by one.

Slide 9: Covariate Component

The covariate component is typically some kind of regression function. For example, the IHME U5MR model uses a non-linear regression function of lag-distributed income per capita, educational achievement, and child HIV mortality rate.

Slide 10: Systematic Component

The systematic component is useful if we expect the indicator to follow some parametric function over time.

For example, Cahill 2018 uses the systematic component to model the rate of change of family planning adoption as following country-specific logistic growth curves.

Slide 11: Offset Term

The offset term allows for the use of external information. As we’ll see later, the IHME U5MR model uses the offset term to bring in smoothed residuals from a separate modeling step.

Slide 12: Smoothing Component

The last term, $\epsilon_{c,t}$, models trends not captured in the other components, while still enforcing some degree of smoothness on the resulting estimates.

Existing models use a lot of different approaches for the smoothing component, like B-splines, Gaussian processes, autoregressive processes and random walks, and spatio-temporal smoothing methods.

We added some additional structure to our model class that captures many of these approaches under a common notation.

Slide 13: Smoothing Component Details

First, we define a vector $\boldsymbol{\epsilon}_c$ of all the deviations for a country. This vector can be decomposed into the product of a full rank matrix $\boldsymbol{B}_c$ and coefficients $\boldsymbol{\delta}_c$. $\boldsymbol{B}_c$ allows for dimensionality reduction, for example through splines. We then require that the coefficients $\boldsymbol{\delta}_c$ are multivariate normally distributed with mean zero after $r$ levels of differencing, where the covariance matrix is given by some autocovariance function $s$.
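
In symbols, and as a sketch of the structure just described (the differencing operator $\Delta^r$ and the name of the covariance matrix are notation assumed here), the decomposition is:

$\boldsymbol{\epsilon}_c = \boldsymbol{B}_c \boldsymbol{\delta}_c$

and the requirement on the coefficients is:

$\Delta^r \boldsymbol{\delta}_c \sim N(\boldsymbol{0}, \boldsymbol{\Sigma})$

where the $(i, j)$ entry of $\boldsymbol{\Sigma}$ is $s(i, j)$. With $r = 0$ there is no differencing, and with $r = 1$ the requirement applies to the first differences of the coefficients.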

Slide 14: Smoothing Component Examples

This structure encompasses many popular smoothing models. For example, we can recover an AR(1) process, a second order random walk, and a Gaussian process with Matérn covariance kernel by choosing the right values for the level of differencing $r$ and autocovariance function $s$.
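
To make those choices concrete, here is how they look under the standard definitions of these processes. The parameter names $\rho$, $\sigma$, and $\ell$ are assumptions for illustration:

  • AR(1): $r = 0$ and $s(t_1, t_2) = \frac{\sigma^2}{1 - \rho^2} \rho^{|t_1 - t_2|}$.
  • Second order random walk: $r = 2$ and $s(t_1, t_2) = \sigma^2$ when $t_1 = t_2$, and zero otherwise.
  • Gaussian process with Matérn kernel: $r = 0$ and, taking smoothness 3/2 as an example, $s(t_1, t_2) = \sigma^2 (1 + \sqrt{3} d / \ell) \exp(-\sqrt{3} d / \ell)$ with $d = |t_1 - t_2|$.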

Slide 15: Conditional behavior of smoothing components in projections

A useful way to analyze these smoothing models is to look at their conditional behavior in projections. These plots show how various smoothing models behave in projections when conditioned on two prior values: we can see, for example, how a first-order random walk simply extends forward the last data point, while the second-order random walk extends a linear trend.
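
The two random walk cases can be sketched in a few lines. This is a minimal illustration of the conditional mean only, assuming equally spaced time points; the `projectMean` helper is hypothetical and the uncertainty around the projection is left out.

```javascript
// Hypothetical helper: conditional mean of future values for a random walk
// of order 1 or 2, given the two most recent values (prev, then last).
function projectMean(order, prev, last, steps) {
  if (order !== 1 && order !== 2) {
    throw new Error("order must be 1 or 2");
  }
  // A first-order walk has no expected change; a second-order walk
  // carries the most recent change forward at every step.
  var slope = order === 2 ? last - prev : 0;
  var out = [];
  for (var h = 1; h <= steps; h++) {
    out.push(last + slope * h);
  }
  return out;
}

var rw1 = projectMean(1, 10, 12, 3); // [12, 12, 12]: flat at the last value
var rw2 = projectMean(2, 10, 12, 3); // [14, 16, 18]: continues the linear trend
console.log(rw1, rw2);
```

Which behavior is appropriate depends on the indicator: carrying a trend forward can be reasonable over a short horizon but may extrapolate too aggressively further out.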

Slide 16: Parameter estimation and hierarchical modeling

Each of these components introduces parameters that need to be estimated, with a typical model having hundreds or thousands of parameters.

Hierarchical modeling is a powerful method used in many models to share information about parameters between units, which is helpful especially in data-sparse settings. Because this introduces extra assumptions, it’s important to clearly document it, which can be done within the TMMP framework. Here is a table from our paper that compares how the IHME model (left) and UN-IGME model (right) estimate their parameters, including their use of hierarchical modeling.

Slide 17: UN-IGME model of U5MR

Now let’s return to our case study of the two U5MR models, and see how each one of them fits into our model class.

The UN-IGME model has a process model with systematic and smoothing components.

The systematic component models an intercept and a linear trend.

The smoothing component uses cubic B-splines with knots placed evenly over each country’s observation period. This can be written using the $\boldsymbol{B}_c$ matrix. The spline coefficients are modeled as following a second order random walk, which means the $\boldsymbol{\delta}_c$ are normally distributed with mean zero after two levels of differencing.
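
One consequence of using two levels of differencing is worth spelling out: the second differences of any straight line are zero, so a second order random walk on the coefficients does not penalize a linear trend in them. Here is a minimal sketch; the `difference` helper is hypothetical and is not taken from the UN-IGME code.

```javascript
// Hypothetical helper: apply r levels of differencing to a sequence
// of coefficients. Each level shortens the sequence by one.
function difference(values, r) {
  var current = values.slice();
  for (var level = 0; level < r; level++) {
    var next = [];
    for (var k = 1; k < current.length; k++) {
      next.push(current[k] - current[k - 1]);
    }
    current = next;
  }
  return current;
}

var linear = [1, 3, 5, 7, 9];
var curved = [1, 2, 4, 8, 16];
var linearSecondDiff = difference(linear, 2); // [0, 0, 0]
var curvedSecondDiff = difference(curved, 2); // [1, 2, 4]
console.log(linearSecondDiff, curvedSecondDiff);
```

Only the departures from a straight line show up in the differenced coefficients, and those departures are what the smoothing component controls.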

Slide 18: IHME model of U5MR

The IHME U5MR process model has a covariate component, offset, and smoothing component.

The covariate component is a non-linear regression function of income per capita, educational achievement, and child HIV mortality rate.

The offset term adjusts the covariate component using smoothed residuals from a separately estimated mixed-effects model.

The smoothing component is a Gaussian process with a Matérn covariance function.

Slide 19: Table comparing IHME and UN-IGME models

Casting each one of the models in our notation provides a foundation for comparing the two models. In our paper, we document each model in TMMP notation and present a side by side comparison of each TMMP component.

Slide 20: Additional examples

Our paper includes several other examples of existing models rewritten to use the TMMP notation. We also provide a template table listing each element of the framework that can be filled in to document models using the TMMP notation.

Slide 21: Summary

To summarize, in this presentation we introduced Temporal Models for Multiple Populations, a model class that encompasses many existing models of demographic and health indicators.

A key feature of the model class is its distinction between the process model and the data model, which allows for models to separate how the true values evolve from how the data are generated.

We then focused on the process model, separating it into four components: covariates, systematic trends, offsets, and smoothing.

The model class can be used to document models under a consistent notation. This is in the spirit of transparency -- we’re not saying one model is better than the other, rather that it’s useful to have a shared language to understand different approaches. Eventually a standardized documentation step could be considered for inclusion in the GATHER model reporting guidelines.

Once models are documented in the same notation, we also have a consistent starting point for comparing across models, like the two U5MR models we looked at today.

Finally, we have found the model class useful as a tool for guiding the development of new models. The model class helps us understand the universe of possible models and organize how we want to explore them by constructing different combinations of process model components.
