Sharon Howard sharonhoward

## llm-hackathon-submissions.md

      
chriscarrollsmith / llm-hackathon-submissions.md
Last active July 18, 2025 08:29

Writeup of submissions to the Coders' Colaboratory `llm` hackathon in Latham, New York

Projects

Runner-Up: Doctor of Credit


Creator: Evan Mullen
Description: Extract structured data about bank deposit bonuses from the Doctor of Credit website.

Prerequisites

Google Chrome CLI entrypoint


## prog-hist-word-freq.py
#html-to-freq.py

# original lesson: https://programminghistorian.org/en/lessons/counting-frequencies

# We've added json to the list of imports — you don't need to install anything,
# json is part of the Python standard library.
import urllib.request, urllib.error, urllib.parse, json, obo

# Notice that instead of requesting the HTML directly, we're now
# making a request to the backend API — meaning a server that returns

## httr2-bsky-atproto.R
# ------------------------------------------------------------------------------------------
# Basically all translated from the Python example at https://atproto.com/blog/create-post
# ------------------------------------------------------------------------------------------

library(httr2)

# Create a logged-in API session object
session <- request("https://bsky.social/xrpc/com.atproto.server.createSession") |>
  req_method("POST") |>
  req_body_json(list(

## xkcd-oops-all-positives.md

      
andrewheiss / xkcd-oops-all-positives.md
Created August 1, 2023 15:01

Original xkcd
See Georgios Karamanis's example of doing this by converting the map into a raster object
library(tidyverse)
library(sf)
library(rnaturalearth)

# Map data

  
## little_women.R
library(tidyverse)
library(tidytext)
library(gutenbergr)
library(cleanNLP)

little_women_raw <- gutenberg_download(514, meta_fields = "title")

little_women <- little_women_raw %>%
  slice(70:n()) %>%
  mutate(chapter_start = str_detect(text, "^CHAPTER"),

## switching-from-XML-to-xml2.md

      
nuest / switching-from-XML-to-xml2.md
Last active May 10, 2025 09:37

Switching an R package from XML to xml2

Switching from XML to xml2

Rationale

The R package XML, for parsing and manipulating XML documents in R, is no longer actively maintained but is still widely used.
The R package xml2 is an actively maintained, more recent alternative.
This file documents useful resources and steps for moving from XML to xml2.

  
## sparkbar.R
# Takes an ordered vector of numeric values and returns a small bar chart made
# out of Unicode block elements. Works well inside dplyr mutate() or summarise()
# calls on grouped data frames.

sparkbar <- function(values) {
  span <- max(values) - min(values)
  if(span > 0 & !is.na(span)) {
    steps <- round(values / (span /  7))
    blocks <- c('▁', '▂', '▃', '▄', '▅', '▆', '▇', '█')
    paste(sapply(steps - (min(steps) - 1), function(i) blocks[i]), collapse = '')
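
For comparison, here is a minimal Python sketch of the same idea: scale each value onto the eight Unicode block elements and join the result into one string. It is not a port of the R function above, whose preview is cut off. The min-max scaling and the handling of constant input are assumptions made for this sketch.

```python
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkbar(values):
    """Return a one-line bar chart built from Unicode block elements."""
    values = list(values)
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Assumption: with no variation to show, use the lowest block throughout.
        return BLOCKS[0] * len(values)
    top = len(BLOCKS) - 1
    # Map the smallest value to index 0 and the largest to index 7.
    return "".join(BLOCKS[round((v - lo) / span * top)] for v in values)


print(sparkbar([3, 1, 4, 1, 5, 9, 2, 6]))
```

Scaling against the minimum rather than zero uses the full range of blocks even when all values are large, at the cost of hiding their absolute size.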

## paperspast_all.py
import requests

number = 1

# Note, this will take a long time!
while number < 1318493:

    # Replace the ################ placeholder in the URL below with your own API key
    urltext = "http://api.digitalnz.org/v3/records.xml?api_key=################&and[collection][]=Papers+Past&sort=date&text=+&and[category][]=Newspapers&direction=asc&page=" + str(number)
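
A hedged sketch of building the same paged URL with `urllib.parse.urlencode` instead of string concatenation. It keeps the API key as a separate argument and makes no network request. The helper name `page_url` is hypothetical. The endpoint and query parameters are copied from the URL above, and the sketch assumes the server treats percent-encoded brackets the same as raw ones.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.digitalnz.org/v3/records.xml"


def page_url(page, api_key):
    """Build the records URL for one results page (hypothetical helper)."""
    params = [
        ("api_key", api_key),
        ("and[collection][]", "Papers Past"),
        ("sort", "date"),
        ("text", " "),  # a single space, written as "+" in the original URL
        ("and[category][]", "Newspapers"),
        ("direction", "asc"),
        ("page", page),
    ]
    # urlencode escapes spaces as "+" and brackets as %5B / %5D
    return BASE_URL + "?" + urlencode(params)


print(page_url(1, "YOUR_API_KEY"))
```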


## Filtering Nodes

## stratify.r
# Uses a subset of the Iris data set with different proportions of the Species factor
library(dplyr)  # provides %>%, group_by(), mutate(), sample_frac(), ungroup()
set.seed(42)
iris_subset <- iris[c(1:50, 51:80, 101:120), ]

stratified_sample <- iris_subset %>%
  group_by(Species) %>%
  mutate(num_rows=n()) %>%
  sample_frac(0.4, weight=num_rows) %>%
  ungroup
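
The per-group sampling step can also be sketched in plain Python: bucket the rows by label, then draw the same fraction from each bucket so the sample keeps the original class proportions. This is an illustration, not a translation of the R snippet. The `stratified_sample` helper and the tuple rows standing in for the iris subset are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(rows, key, frac, seed=42):
    """Draw the same fraction of rows from every group defined by key(row)."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    sample = []
    for members in groups.values():
        k = round(len(members) * frac)  # rows to draw from this group
        sample.extend(rng.sample(members, k))  # sampling without replacement
    return sample


# Hypothetical stand-in for the iris subset: 50, 30 and 20 rows per species
rows = (
    [("setosa", i) for i in range(50)]
    + [("versicolor", i) for i in range(30)]
    + [("virginica", i) for i in range(20)]
)
picked = stratified_sample(rows, key=lambda r: r[0], frac=0.4)
print(Counter(label for label, _ in picked))
```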