View gist:be0a98163192d320271ab8655c1000ad
library(rvest)
library(tidyverse)
library(lubridate)
# https://bringatrailer.com/electric-vehicles/?search=rivian
# from 11 Jan 2023
url <- "~/Downloads/bat.html"
View archy-208-module-7-bmarwick.qmd
---
title: "Untitled"
format: html
editor: visual
---
## Introduction
The aim of this report ...
View gist:da2d99ea4dfef523e08d57f4cfeac16b
# This script takes a folder full of qmd files and tries to render each one.
# If one qmd fails to render, the script continues on to the next one. The
# results of all the render attempts are collected in a data frame so we can
# easily inspect them and find the files that failed to render. We don't have to
# manually render individual files.
# How to use
# 1. Download all submissions from Canvas (go to assignment page, look for the
# download button near the speedgrader button), when it arrives on your
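The render-and-continue approach described in this preview can be sketched as below. This is a minimal sketch and not the gist's own code: the function name `render_each()` and its `render_fun` argument are hypothetical, and the default assumes the `quarto` R package (with its `quarto_render()` function) is installed.

```r
# Try to render every file, never stopping on a failure, and collect
# one row per file describing what happened.
# `render_each` and `render_fun` are hypothetical names for this sketch.
render_each <- function(files, render_fun = quarto::quarto_render) {
  results <- lapply(files, function(f) {
    tryCatch(
      {
        render_fun(f)
        data.frame(file = f, rendered = TRUE, error = NA_character_)
      },
      error = function(e) {
        # Keep the error message so failed files are easy to diagnose later
        data.frame(file = f, rendered = FALSE, error = conditionMessage(e))
      }
    )
  })
  do.call(rbind, results)
}

# Example use, assuming the qmd files sit in a folder called "submissions":
# qmd_files <- list.files("submissions", pattern = "\\.qmd$", full.names = TRUE)
# render_log <- render_each(qmd_files)
# render_log[!render_log$rendered, ]   # the files that failed
```

Passing the renderer in as an argument makes it possible to try the error handling with a stand-in function before pointing it at real submissions.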
@benmarwick
benmarwick / archaepaperswithcodeinsept2021.R
Last active Sep 27, 2022
How many archaeology papers with R as of Sept 2021?
View archaepaperswithcodeinsept2021.R
# How many articles on the list in Sept 2021?
# First, run some lines from archaepaperswithcode.R to create repo,
# then:
## Coerce commits to a data.frame
df <- as.data.frame(repo)
# filter rows of commits from Sept 2021
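The preview stops before the filtering step. One way such a filter could look is sketched below in base R; the `commits` data frame is made up for illustration, and the assumption that the commit time sits in a date-time column called `when` should be checked against the real `df`.

```r
# Hypothetical stand-in for the commits data frame; values are invented
commits <- data.frame(
  sha  = c("a1", "b2", "c3"),
  when = as.POSIXct(
    c("2021-08-30 10:00:00", "2021-09-15 12:00:00", "2021-10-01 09:00:00"),
    tz = "UTC"
  )
)

# Keep only the rows whose commit time falls in September 2021
sept_2021 <- commits[format(commits$when, "%Y-%m") == "2021-09", ]
nrow(sept_2021)
```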
benmarwick / scraping-pnas-titles.R
Last active Oct 6, 2022
Scrape PNAS archaeology articles and run a basic text analysis of the abstracts
View scraping-pnas-titles.R
# JB says she searched for "archaeolog*" and "archeolog*". These return 257 results for me,
# far fewer than the 1002 we get from searching the archaeology 'keyword', e.g.
# https://www.pnas.org/action/doSearch?Concept=500376&Concept=500375&startPage=0&sortBy=Earliest
# In any case, let's start with "archaeolog*" and "archeolog*". I've copied the URL of the search
# results page and edited it to return 500 items on the first page, so we can get all the results
# without having to scrape multiple pages.
library(tidyverse)
benmarwick / gist:57d1c1ba265a2e5ab6c5f33b729b8fdd
Last active May 10, 2022
Analyse text reuse using minhash and locality-sensitive hashing (LSH)
View gist:57d1c1ba265a2e5ab6c5f33b729b8fdd
library(tidyverse)
library(textreuse)
# One row per student. To get the data, go to Canvas -> quiz -> 'quiz stats' -> 'student analysis'
cnvs <- read_csv("quiz-responses-downloaded-from-canvas.csv")
# select only the column with the text we want to compare
cnvs_q5 <-
  cnvs %>%
  select(q5 = contains("peers' reflections"))
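The preview ends before the comparison itself. A minimal sketch of how minhash and LSH can be applied with the textreuse package is shown below. The `answers` vector and its contents are invented for illustration (the real input would be the `q5` column), and the values chosen for the n-gram size, the number of minhashes, the seed, and the number of bands are assumptions, not settings taken from the gist.

```r
library(textreuse)

# Hypothetical answers, one element per student; names act as document ids
answers <- c(
  s1 = "The excavation report describes the stone artefacts found in the lower layers of the site",
  s2 = "The excavation report describes the stone artefacts found in the lower layers of the trench",
  s3 = "My favourite part of the course was learning how to make maps with open source software"
)

# 240 minhashes per document; this must be divisible by the number of bands
minhash <- minhash_generator(n = 240, seed = 3552)

# Tokenize each answer into word trigrams and compute its minhash signature
corpus <- TextReuseCorpus(
  text = answers,
  tokenizer = tokenize_ngrams, n = 3,
  minhash_func = minhash,
  keep_tokens = TRUE,
  progress = FALSE
)

# Hash the signatures into buckets, then list the pairs that share a bucket
buckets <- lsh(corpus, bands = 80, progress = FALSE)
candidates <- lsh_candidates(buckets)

# Score only the candidate pairs with Jaccard similarity, highest first
scores <- lsh_compare(candidates, corpus, jaccard_similarity, progress = FALSE)
scores[order(-scores$score), ]
```

More bands make the search more sensitive to weak matches at the cost of more candidate pairs to score, so the band count is worth tuning against a few known cases of reuse.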
benmarwick / tektite-maps-sea-vietnam.R
Created Oct 16, 2021
Tektite maps for southeast Asia & Vietnam
View tektite-maps-sea-vietnam.R
library(tidyverse)
library(sf)
library(googlesheets4)
# get our data from google sheets, we need to:
# - 'publish to web'
# - adjust sharing settings to share with anyone
# I've done these, now, so it should just work:
my_key <- "1xUqRGnb9kwBi128cERiHkV5w4uREwl-mqAVf0jmQhhg"
View plotting-archaeology-papers-with-R-code.R
# also at https://gist.github.com/benmarwick/f11ae49ab9afde0071b133012ff76cbc
ctv <- "https://raw.githubusercontent.com/benmarwick/ctv-archaeology/master/README.md"
library(tidyverse)
library(glue)
archy_ctv_readme <- readLines(ctv)
# get just the articles
benmarwick / viralarchive.Rmd
Created Jun 1, 2021
Object recognition in images in #viralarchive tweets
View viralarchive.Rmd
I used the Python library GetOldTweets3 to get the tweets because the rtweet package cannot get tweets older than 6-9 days. Details about this Python library are here: https://github.com/Mottl/GetOldTweets3
I used this line in the shell to get tweets using the #viralarchive hashtag:
```{bash, engine.opts="-l", eval = F}
GetOldTweets3 --querysearch 'viralarchive' --maxtweets 10000
```
benmarwick / ggplot-to-jpg-set-dpi.R
Created Dec 19, 2020
How to save a ggplot as a JPG file with specific dimensions and a high dpi
View ggplot-to-jpg-set-dpi.R
# How to save a ggplot as a JPG file with a
# specific dpi and dimensions, for example,
# because a publisher requires it
library(ggplot2)
p <-
  ggplot(mtcars) +
  aes(mpg,
      disp) +