Shawn Graham shawngraham

## Split audio files into chunks
## Daniel Pett 1/5/2020
__author__ = 'portableant'
## Tested on Python 2.7.16 - yes I know I need to upgrade.
import argparse
import os
import speech_recognition as sr
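The preview stops at the imports, so as a rough illustration of the chunking idea named in the header comment, the sketch below computes (offset, duration) windows for a recording of known length. The function name `chunk_windows` and the 30-second default are assumptions for illustration and are not taken from the original script. Windows of this shape could be passed to `speech_recognition`'s `Recognizer.record(source, offset=..., duration=...)`.

```python
def chunk_windows(total_seconds, chunk_seconds=30.0):
    """Return a list of (offset, duration) pairs covering total_seconds.

    Hypothetical helper: the last window is shortened so that the
    windows never run past the end of the recording.
    """
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    windows = []
    offset = 0.0
    while offset < total_seconds:
        duration = min(chunk_seconds, total_seconds - offset)
        windows.append((offset, duration))
        offset += chunk_seconds
    return windows


if __name__ == "__main__":
    # A 75-second file split into 30-second chunks.
    for offset, duration in chunk_windows(75, chunk_seconds=30):
        print(offset, duration)
```

Computing the windows separately from the recognition call keeps the arithmetic easy to check before any audio is read.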
rccordell / renderSite.R
Last active Sep 8, 2020
This script builds on Aleszu Bajak's excellent tutorial on building a course website using R Markdown and GitHub Pages. It automates the rendering of HTML files from RMD and automatically generates the page menu for the site, eliminating much duplicative work.
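The menu-generation step described above can be sketched in a few lines. The example is in Python rather than the gist's R, and it covers only the `_site.yml` half of the job, not the rendering of HTML. The function names, the title rule (file name with hyphens turned into spaces), and the alphabetical page order are assumptions for illustration; the original script may order and label pages differently.

```python
import os


def page_title(filename):
    """Turn 'course-schedule.Rmd' into 'Course Schedule' (hypothetical naming rule)."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    return stem.replace("-", " ").replace("_", " ").title()


def build_site_yml(site_name, rmd_files):
    """Return the text of a minimal _site.yml with one navbar entry per page."""
    lines = [
        'name: "{}"'.format(site_name),
        "navbar:",
        '  title: "{}"'.format(site_name),
        "  left:",
    ]
    for rmd in sorted(rmd_files):
        stem = os.path.splitext(os.path.basename(rmd))[0]
        lines.append('    - text: "{}"'.format(page_title(rmd)))
        lines.append("      href: {}.html".format(stem))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Page names here are made up for the example.
    print(build_site_yml("My Course", ["index.Rmd", "course-schedule.Rmd"]))
```

Regenerating the whole file from the list of pages, rather than editing it in place, means a renamed or deleted page cannot leave a stale menu entry behind.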
# This script builds on Aleszu Bajak's excellent
# tutorial on building a course website using R Markdown and Github pages.
# I was excited about the concept but wanted to automate a few of the production steps: namely generating the HTML files
# for the site from the RMD pages (which Aleszu describes doing one-by-one) and generating the site navigation menu,
# which Aleszu handcodes in the `_site.yml` file. This script should automate both processes, though it may have some quirks
# unique to my setup that you'd want to tweak to fit your own. It's likely more loquacious than necessary as well, so feel free
# to condense as you can. Ideally, each time you make updates to your RMD files you can run this script to generate updated HTML
# pages and a new `_site.yml`. Then commit changes to Github and you're up and running!
# Once you've got everything configured for your own site below, you should be able to run `source('rend
View PoetryBot.rmd
title: "Programming Literary Bots"
author: "Ryan Cordell"
date: "3/12/2017"
output: html_document
## Acknowledgements
This version of my twitterbot assignment was adapted from an original written in Python, which itself adapted code written by Mark Sample. That original bot tweeted (I've since stopped it) at Quoth the Ravbot. The current version owes much to advice and code borrowed from two colleagues at Northeastern University: Jonathan Fitzgerald and Benjamin Schmidt.
from __future__ import absolute_import, division, print_function
This is a modification of the
script in Tensorflow. The original script produces
string labels for input images (e.g. you input a picture
of a cat and the script returns the string "cat"); this
modification reads in a directory of images and
generates a vector representation of the image using
drjwbaker /
Last active Aug 31, 2016
Getting Pastec up and running, 8 August 2016

Pastec is an open source index and search engine for image recognition. This is how I got it working with lots of help from the hard work of Ryan Baumann, Shawn Graham and Matthew Lincoln.


Either install Ubuntu 14.04.5 as an operating system, or get a virtual machine from osboxes and start it in VirtualBox. Ensure the VM is connected to the network (Settings > Network).

Install Pastec by following the documentation. Be sure to download and unzip visualWordsORB.dat into the build subdirectory of Pastec.

karpathy /
Last active Oct 7, 2022
Minimal character-level language model with a Vanilla Recurrent Neural Network, in Python/numpy
Minimal character-level Vanilla RNN model. Written by Andrej Karpathy (@karpathy)
BSD License
import numpy as np
# data I/O
data = open('input.txt', 'r').read() # should be simple plain text file
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)
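A character-level model needs a way to turn characters into integers and back. The sketch below shows one way to build that mapping from the `chars` list computed above, plus a one-hot encoding of a single character. The helper names `build_vocab` and `one_hot` are assumptions for illustration, and the vocabulary is sorted here so the indices are reproducible, whereas `list(set(data))` in the snippet above has no guaranteed order.

```python
def build_vocab(text):
    """Map each distinct character to an integer index and back."""
    # Sorted for reproducibility; this differs from list(set(data)) above.
    chars = sorted(set(text))
    char_to_ix = {ch: i for i, ch in enumerate(chars)}
    ix_to_char = {i: ch for i, ch in enumerate(chars)}
    return char_to_ix, ix_to_char


def one_hot(index, vocab_size):
    """Return a list of length vocab_size with a 1 at position index."""
    vec = [0] * vocab_size
    vec[index] = 1
    return vec


if __name__ == "__main__":
    char_to_ix, ix_to_char = build_vocab("hello")
    encoded = [char_to_ix[ch] for ch in "hello"]
    print(encoded)
    print(one_hot(encoded[0], len(char_to_ix)))
```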
benmarwick / tweet-edits-to-archaeology-articles.R
Last active Jun 14, 2020
Using R with wikipedia for various things
# get recent changes from wikipedia
n_changes <- 5000
recent_changes_url <- paste0("", n_changes , "&days=1")
# connect to website
library(rvest) # provides read_html()
html <- read_html(recent_changes_url)
cdiener /
Created Apr 13, 2014
now with documentation
# This line imports the modules we will need. The first is the sys module used
# to read the command line arguments. Second the Python Imaging Library to read
# the image and third numpy, a linear algebra/vector/matrix module.
import sys; from PIL import Image; import numpy as np
# This is a list of characters from low to high "blackness" in order to map the
# intensities of the image to ascii characters
chars = np.asarray(list(' .,:;irsXA253hMHGS#9B&@'))
# Check whether all necessary command line arguments were given, if not exit and show a
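The mapping from pixel intensities to the character ramp above can be sketched without PIL or numpy. This is a simplified illustration and not the gist's actual implementation: it assumes grey levels in the range 0 to 255 supplied as nested lists, and it applies a plain linear scaling. The names `CHARS` and `to_ascii` are hypothetical.

```python
CHARS = " .,:;irsXA253hMHGS#9B&@"  # same ramp as above, low to high "blackness"


def to_ascii(rows):
    """Convert rows of grey levels (0 = black, 255 = white) to lines of text."""
    lines = []
    for row in rows:
        line = ""
        for grey in row:
            # Dark pixels get a blackness near 1, light pixels near 0.
            blackness = 1.0 - grey / 255.0
            # Scale blackness to a valid index into the ramp.
            index = int(blackness * (len(CHARS) - 1))
            line += CHARS[index]
        lines.append(line)
    return "\n".join(lines)


if __name__ == "__main__":
    # A tiny 2x3 "image": white, mid-grey, black on each row.
    print(to_ascii([[255, 128, 0], [0, 128, 255]]))
```

Multiplying by `len(CHARS) - 1` rather than `len(CHARS)` keeps a fully black pixel on the last character instead of one position past the end of the ramp.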
benmarwick / HTML2DTM.r
Created Feb 22, 2013
Take a folder of HTML files and convert them to a document term matrix for text mining. Includes removal of non-ASCII characters and iterative removal of stopwords
# get data
setwd("C:/Downloads/html") # this folder has only the HTML files
html <- list.files()
# load packages
library(RCurl) # provides getURL(), used below
# get some code from github to convert HTML to text
writeChar(con="htmlToText.R", (getURL(ssl.verifypeer = FALSE, "")))
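To make the end product concrete, here is a minimal sketch of a document term matrix built from plain text, including the non-ASCII removal and stopword filtering mentioned in the description. It is written in Python rather than R, works on strings that have already been extracted from HTML, and filters stopwords in a single pass rather than iteratively. The stopword list and function names are assumptions for illustration only.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "of", "and"}  # tiny illustrative list, not the original's


def tokenize(text):
    """Drop non-ASCII characters, lowercase, and split into alphabetic tokens."""
    ascii_only = text.encode("ascii", "ignore").decode("ascii")
    tokens = re.findall(r"[a-z]+", ascii_only.lower())
    return [t for t in tokens if t not in STOPWORDS]


def document_term_matrix(docs):
    """Return (vocabulary, rows) where rows[i][j] counts vocabulary[j] in docs[i]."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocabulary = sorted(set(term for c in counts for term in c))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


if __name__ == "__main__":
    vocabulary, rows = document_term_matrix(["The cats of Rome", "cats and dogs and cats"])
    print(vocabulary)
    for row in rows:
        print(row)
```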
benmarwick / R2MALLET.r
Last active Apr 12, 2021
R code to operate MALLET entirely from within R. Set variables, send commands to Windows' command console and get MALLET's result back into R for further analysis.
# Set working directory
dir <- "C:\\" # adjust to suit
# configure variables and filenames for MALLET
## here using MALLET's built-in example data and
## variables from
# folder containing txt files for MALLET to work on
importdir <- "C:\\mallet-2.0.7\\sample-data\\web\\en"
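The idea of assembling a MALLET command from variables and sending it to the console can be sketched as follows. The example is in Python rather than R and only builds the argument list for MALLET's `import-dir` step; it does not run anything. The function name, the output file name `tutorial.mallet`, and the use of `bin\mallet.bat` under the install folder are assumptions for illustration.

```python
import ntpath  # Windows-style path joining, whatever platform this runs on


def mallet_import_command(mallet_home, import_dir, output_file):
    """Build the argument list for MALLET's import-dir step (Windows layout assumed)."""
    return [
        ntpath.join(mallet_home, "bin", "mallet.bat"),
        "import-dir",
        "--input", import_dir,
        "--output", output_file,
        "--keep-sequence",
        "--remove-stopwords",
    ]


if __name__ == "__main__":
    cmd = mallet_import_command(
        "C:\\mallet-2.0.7",
        "C:\\mallet-2.0.7\\sample-data\\web\\en",
        "tutorial.mallet",  # hypothetical output name
    )
    print(" ".join(cmd))
```

On a machine with MALLET installed, a list like this could be handed to `subprocess.run`. Keeping the arguments as a list avoids quoting problems with paths that contain spaces.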