@banditkings
Created October 14, 2022 16:54
A basic example that pulls data from the tidytuesday repo, demonstrates DataFramesMeta syntax, and makes a plot
using CSV, HTTP, DataFramesMeta, Plots, Dates
theme(:ggplot2)
plotlyjs()
# The file is tab-separated; CSV.jl detects the delimiter automatically
file1 = "https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-05-10/nyt_titles.tsv"
df = CSV.read(HTTP.get(file1).body, DataFrame)
# who is the author with the most weeks on the best seller list?
top_authors = @chain df begin
    groupby([:author])
    @combine(:total_weeks = sum(:total_weeks))
    sort(:total_weeks, rev=true)
    first(5)
end
bar(top_authors.author, top_authors.total_weeks,
    title="Top 5 Authors by # Weeks on Bestseller List")
top_authors_list = top_authors.author
# How did this change over time?
top_authors_over_time = @chain df begin
    groupby([:year, :author])
    @combine(:total_weeks = sum(:total_weeks))
    @rsubset(:author ∈ top_authors_list)
    @transform(:year = Date.(:year))
    @orderby(:year)
end
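# Plot the first author to create the figure, then add one line per remaining author
auth = top_authors_list[1]
tempdf = top_authors_over_time[top_authors_over_time.author .== auth, :]
a = plot(tempdf.year, tempdf.total_weeks, label=auth)
for author in top_authors_list[2:end]
    # Use a loop-local name so the loop does not shadow the global tempdf
    author_df = top_authors_over_time[top_authors_over_time.author .== author, :]
    plot!(a, author_df.year, author_df.total_weeks, label=author)
end
display(a)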