Create a gist now

Instantly share code, notes, and snippets.

What would you like to do?
creating an animated (gif) world map of life expectancy using ggplot2

This gist shows how to create an animated world map of life expectancy using R. The data comes the UN World Population Prospects, 2015 Revision, and it brings life expectancy data from 1950 untill 2015 and projeceted data up to 2100. Thanks Topi Tjukanov, who reminded me of the UN DESA data portal, where you can find this dataset and many others

The idea is to use open data to create a GIF, much like the ones created by Aron Strandberg but his maps look much nicer. The output of this script is a map like this one:

enter image description here

Now diving into the code. First, let's load the necessary libraries and get the data. As of this writing, the current version of gganimate has a bug that messes the aesthetics of the .gif file. As a temporary solution, I've intalled an older version of the package, as recommended by the author of the gganimate, David Robinson.

# load libraries
  library(curl)
  library(readxl)
  library(data.table)
  library(rworldmap)
  library(ggplot2)
  library(dplyr)
  library(tweenr)
  library(ggthemes)
  library(viridis)
  library(rgeos)
  library(countrycode)
  library(devtools)
  install_github("dgrtwo/gganimate", ref = "26ec501")
  library(gganimate)

# DATA: Download data
  URL <- "https://esa.un.org/unpd/wpp/DVD/Files/1_Indicators%20(Standard)/EXCEL_FILES/3_Mortality/WPP2015_MORT_F07_1_LIFE_EXPECTANCY_0_BOTH_SEXES.XLS"
  curl_download(url=URL, destfile="LIFE_EXPECTANCY_0_BOTH_SEXES.xls", quiet=FALSE, mode="wb")

1. Read and organize life expectancy data

The dataset comes in an Excel file. One spreadsheet brings life expectancy data for years before 2015 and another spreadsheet brings projected life expectancy up to 2100. Now we need to read and merge these two datasets. We are also going to use country codes (iso3) instead of country names names to amke it easier latter to join the life expect. info with spatial data. This can be done using the countrycode library, recommended by Noam Ross

# Read historic data
  df1 <- read_excel( path= "LIFE_EXPECTANCY_0_BOTH_SEXES.xls", sheet="ESTIMATES", skip = 16)
# Read projected data
  df2 <- read_excel( path= "LIFE_EXPECTANCY_0_BOTH_SEXES.xls", sheet="MEDIUM VARIANT", skip = 16)
  
# Merge historic and projected data
  setDT(df2)[, Notes := NULL][, Variant := NULL]
  df <- left_join(df1, df2, by=c("Index", "Major area, region, country or area *", "Country code"))
  setDT(df)

# change name of column
  colnames(df)[3] <- c("country")

# drop first 14 rows with information aggregated for continents etc
  df <- df[-c(1:14),]
  
# get standard country codes for latter merge with spatial data
  df[, country_iso3c := countrycode(country, 'country.name', 'iso3c')]
  
  # quick note: these UN data don't bring life expectancy data for Greenland, Kosovo, San Marino, Somaliland and Taiwan.

Note that the data is organized in wide format, where each year is a separete column. In order to make our animated map, we are going to reshape the data to long format, piling up all observations. melt{data.table} makes this task really simple and fast.

# data in wide format
head(df)
#>    Index   Variant  country Notes Country code 1950-1955 1955-1960 1960-1965 ...
#> 1:    15 Estimates  Burundi    NA          108    39.031    40.529    42.036 ...
#> 2:    16 Estimates  Comoros    NA          174    38.720    40.465    42.472 ...
#> 3:    17 Estimates Djibouti    NA          262    41.038    42.949    45.176 ...
#> 4:    18 Estimates  Eritrea    NA          232    35.840    36.846    37.983 ...
#> 5:    19 Estimates Ethiopia    NA          231    34.080    36.675    40.080 ...

# Reshape to long format
   # get name of columns with years of reference
    year_cols <- colnames(df)[6:35]
  
  # Reshape data
    dt <- melt(df, id.vars = c("country", "country_iso3c"),
               measure.vars = year_cols,
               variable.name= "year",
               value.name= "life_expect")
               
# data in long format
head(dt)
#>     country country_iso3c      year life_expect
#> 1:  Burundi           BDI 1950-1955      39.031
#> 2:  Comoros           COM 1950-1955      38.720
#> 3: Djibouti           DJI 1950-1955      41.038
#> 4:  Eritrea           ERI 1950-1955      35.840
#> 5: Ethiopia           ETH 1950-1955      34.080
#> 6:      ...           ...       ...         ...

# get Min and Max values of life expectancy
  vmax <- max(dt$life_expect, na.rm=T)
  vmin <- min(dt$life_expect, na.rm=T)

2. Get Spatial data

# get world map
  wmap <- getMap(resolution="low")
  
  # small edits
  wmap <- spTransform(wmap, CRS("+proj=robin")) # reproject
  wmap <-   subset(wmap, !(NAME %like% "Antar")) # Remove Antarctica

We also need to get a pair of coordinates for each country, which will indicate on the map where we will place the tex with life expectancy values. I adopted a simple solution using the centroids of each country. This is far from perfct because some strings will look a bit misaligned. See for example Norway, Canada, Peru or Chile. If you know a simple way to fix this, I would be glad to hear from you!

# get centroids of countries
  centroids <- gCentroid( wmap , byid=TRUE, id = wmap@data$ISO3)
  centroids <- data.frame(centroids)
  setDT(centroids, keep.rownames = TRUE)[]
  setnames(centroids, "rn", "country_iso3c")

Now we just need to join the life expectancy data with the world map, and...

# join data to map
  wmap_df <- fortify(wmap, region = "ISO3")
  wmap_df <- left_join(wmap_df, dt, by = c('id'='country_iso3c'))        # data
  wmap_df <- left_join(wmap_df, centroids, by = c('id'='country_iso3c')) # centroids

3. Plot and save gif

# plot
  o <- ggplot(data=wmap_df) +
        geom_polygon(aes(x = long, y = lat, group = group, fill=life_expect, frame = year), color="gray90") +
        geom_text(aes(x = x, y = y, label = round(life_expect), frame = year), hjust=0, vjust=0, size = 4.5) +
        scale_fill_viridis(name="Life Expectancy", begin = 0, end = 1, limits = c(vmin,vmax), na.value="gray99") +
        theme_void() +
        guides(fill = guide_colorbar(title.position = "top")) +
        labs(title = "Life Expectancy, ") +
        labs(caption = "Map by Rafael H M Pereira, @UrbanDemog\nsource: UN World Population Prospects 2015 Revision") +
        theme(plot.title = element_text(hjust = 0.5, vjust = 0.05, size=25)) +
        theme(plot.caption = element_text(hjust = 0, color="gray40", size=15)) +
        coord_cartesian(xlim = c(-11807982, 14807978)) +
        theme( legend.position = c(.5, .08), 
               legend.direction = "horizontal", 
               legend.title.align = 0,
               legend.key.size = unit(1.3, "cm"),
               legend.title=element_text(size=17), 
               legend.text=element_text(size=13) )

# save gif
  gg_animate(o, "output4020_old.gif", title_frame =T, 
             ani.width=1600, ani.height=820, dpi=800, interval = .4)
             

As you can see, the plot is still a bit clunky with too many numbers, poor resolution, and legend too small. I will be updating this code in the future as I improve the map. If you have any suggestion to improve the map or the code, please feel free to get in touch!

enter image description here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment