@juliasilge
Last active March 23, 2022 23:34
Area chart for #TidyTuesday baby name proportions
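library(tidyverse)
# silgelib is the author's personal package (GitHub: juliasilge/silgelib), not on CRAN;
# it provides theme_plex()
library(silgelib)
theme_set(theme_plex())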

babynames <- read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-03-22/babynames.csv')
#> Rows: 1924665 Columns: 5
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> chr (2): sex, name
#> dbl (3): year, n, prop
#> 
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

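babynames %>%
    group_by(year, sex) %>%
    # flag the first 100 rows per year and sex; this relies on rows being
    # ordered from most to least common name within each group
    mutate(top_100 = row_number() <= 100) %>%
    ungroup() %>%
    # total babies (sum of n) per year, sex, and top-100 flag
    count(year, sex, top_100, wt = n) %>%
    mutate(top_100 = if_else(top_100, "Top 100 names", "All other names"),
           sex = if_else(sex == "F", "Baby girls", "Baby boys")) %>%
    group_by(year, sex) %>%
    # share of each year/sex total, so the two areas stack to 100%
    mutate(prop = n / sum(n)) %>%
    ungroup() %>%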
    ggplot(aes(year, prop, fill = top_100)) +
    geom_area(alpha = 0.8) +
    facet_wrap(vars(sex)) +
    scale_fill_brewer(palette = "Paired") +
    scale_y_continuous(labels = scales::percent) +
    labs(x = NULL, y = NULL, fill = NULL,
         title = "What proportion of babies are given the most common names?",
         subtitle = "Proportion of babies given one of the top 100 names has been dropping, especially since about 1990") +
    theme(legend.position = "bottom") +
    guides(fill = guide_legend(reverse = TRUE))
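
The top-100 flag above depends on row order within each year and sex. As a minimal sketch of the same flag-then-aggregate steps with the sort made explicit, here is a toy example; the `toy` data, the `top_k` cutoff, and the `toy_flagged` / `toy_props` names are all made up for illustration and are not part of the original gist.

```r
library(dplyr)

# Hypothetical toy data, deliberately NOT sorted by count
toy <- tibble(
  year = c(2000, 2000, 2000, 2000),
  sex  = c("F", "F", "F", "F"),
  name = c("Cara", "Ava", "Bea", "Dee"),
  n    = c(10, 40, 30, 20)
)

top_k <- 2  # stands in for the 100 used in the gist

toy_flagged <- toy %>%
  group_by(year, sex) %>%
  # sort within each group so row_number() reflects rank by count
  arrange(desc(n), .by_group = TRUE) %>%
  mutate(top = row_number() <= top_k) %>%
  ungroup()

toy_props <- toy_flagged %>%
  # weighted count: sums n per year, sex, and flag
  count(year, sex, top, wt = n) %>%
  group_by(year, sex) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

print(toy_props)
```

With this toy data the two most common names (Ava and Bea) account for 70 of 100 babies, so the `top == TRUE` row has `prop` 0.7. If the real data is already ordered this way, the extra `arrange()` changes nothing; it only removes the dependence on input order.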

Created on 2022-03-22 by the reprex package (v2.0.1)
