@mcfrank
Created April 30, 2024 16:18
Figure from Frank (2024) "Bridging the Data Gap"
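library(xkcd)       # not referenced below; can be dropped if it is not installed
library(tidyverse)

# Cumulative words of language input by age (years), as two bounds.
# upper: 1e6 * 12 per year, plus 2.5e5 * 52 per year after age 5
# lower: 1e5 * 12 per year
d <- tibble(age = 1:20) |>
  mutate(upper = 1e6 * age * 12 + ifelse(age > 5, 2.5e5 * 52 * (age - 5), 0),
         lower = 1e5 * age * 12) |>
  # long format: one row per age and bound; `vocabulary` holds the word count
  pivot_longer(upper:lower, names_to = "bound", values_to = "vocabulary")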
# Requires the ggthemes package (called via ggthemes:: below).
pdf("~/Projects/AI commentaries/scale.pdf", width = 5, height = 4)
p <- ggplot(d, aes(x = age, y = vocabulary, col = bound)) +
  # direct labels for the two human curves, right-aligned near age 20
  annotate(geom = "text", y = 9e8, x = 19.5, label = "human upper bound",
           hjust = 1, col = "#dc322f") +
  annotate(geom = "text", y = 5e7, x = 19.5, label = "human lower bound",
           hjust = 1, col = "#268bd2") +
  geom_point() +
  geom_line() +
  # model reference lines; the .75 factor presumably converts tokens to words
  geom_hline(yintercept = 5e11 * .75, lty = 2, col = "darkgray") +
  geom_hline(yintercept = 1e12 * .75, lty = 2, col = "black") +
  annotate(geom = "text", y = 1.4e12, x = 1, label = "Chinchilla", hjust = 0) +
  annotate(geom = "text", y = 2e11, x = 1, label = "GPT-3", hjust = 0, col = "darkgray") +
  # dotted markers at age 5 (where the second term of `upper` starts) and age 20
  geom_vline(xintercept = 5, lty = 3) +
  geom_vline(xintercept = 20, lty = 3) +
  scale_y_log10(breaks = c(1e6, 1e9, 1e12),
                minor_breaks = c(1e5, 1e6, 1e7, 1e8, 1e9, 1e10, 1e11, 1e12),
                limits = c(1e5, 1e13)) +
  ggthemes::theme_few() +
  ggthemes::scale_color_solarized(guide = "none") +
  xlab("Age (years)") +
  ylab("Words of language input")
print(p)  # explicit print so the plot is also written when the script is run via source()
dev.off()
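
The gap the figure shows can be checked with a few lines of base R. This is a minimal sketch, not part of the original gist: the helper names `words_upper` and `words_lower` are hypothetical, and the constants are copied from the script above. It evaluates both human bounds at age 20 and divides the two model reference lines by them.

```r
# Hypothetical helpers; the constants mirror the mutate() call in the script.
words_upper <- function(age) {
  1e6 * 12 * age + ifelse(age > 5, 2.5e5 * 52 * (age - 5), 0)
}
words_lower <- function(age) 1e5 * 12 * age

upper_20 <- words_upper(20)      # 2.4e8 + 1.95e8 = 4.35e8 words
lower_20 <- words_lower(20)      # 2.4e7 words

# Reference lines exactly as drawn by the two geom_hline() calls.
gpt3_line       <- 5e11 * 0.75   # 3.75e11
chinchilla_line <- 1e12 * 0.75   # 7.5e11

# Smallest and largest ratio between a model line and a human bound at age 20.
min_gap <- gpt3_line / upper_20        # about 862
max_gap <- chinchilla_line / lower_20  # 31250

print(log10(c(min_gap, max_gap)))      # roughly 2.9 and 4.5 orders of magnitude
```

With these constants, the model lines sit roughly three to four and a half orders of magnitude above the human bounds at age 20, which matches the vertical distance visible on the log-scaled y axis.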