Sam Boysel sboysel

@cheerfulstoic
cheerfulstoic / gist:7e8ec61f9104017430af
Last active April 12, 2022 18:36
Examining what is possible for StackOverflow with a graph database

Analyzing StackOverflow with Neo4j and Clojure

Joining multiple disparate data sources, commonly dubbed Master Data Management (MDM), is usually not a fun exercise. I would like to show you how to use a graph database (Neo4j) and an interesting dataset (developer-oriented collaboration sites) to put the fun back into MDM. This approach will allow you to quickly and sensibly merge data from different sources into a consistent picture and query across the data efficiently to answer your most pressing questions.

You can read the associated posts on my blog. They cover the hows and whys of the project, while this and other GraphGists examine how to answer specific questions about the data.

@jennybc
jennybc / 2014-10-12_stop-working-directory-insanity.md
Last active September 23, 2022 04:43
Stop the working directory insanity

There are packages for this now!

2017-08-03: Since I wrote this in 2014, the universe, specifically Kirill Müller (https://github.com/krlmlr), has provided better solutions to this problem. I now recommend that you use one of these two packages:

  • rprojroot: This is the main package with functions to help you express paths in a way that will "just work" when developing interactively in an RStudio Project and when you render your file.
  • here: A lightweight wrapper around rprojroot that anticipates the most likely scenario: you want to write paths relative to the top-level directory, defined as an RStudio project or Git repo. TRY THIS FIRST.
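To make "paths relative to the top-level directory" concrete, here is a minimal base R sketch of the underlying idea: walk upward from a starting directory until a project marker is found, then build paths from that root. This is only an illustration, not how rprojroot or here are implemented; `find_project_root` and `project_path` are hypothetical names, and the choice of markers (an `.Rproj` file or a `.git` entry) is an assumption based on the description above.

```r
# Illustrative sketch only: walk upward from `start` until a directory
# containing a project marker (an .Rproj file or a .git entry) is found.
# `find_project_root` and `project_path` are hypothetical helper names.
find_project_root <- function(start = getwd()) {
  dir <- normalizePath(start, winslash = "/", mustWork = TRUE)
  repeat {
    has_rproj <- length(list.files(dir, pattern = "\\.Rproj$")) > 0
    has_git <- file.exists(file.path(dir, ".git"))
    if (has_rproj || has_git) return(dir)
    parent <- dirname(dir)
    # dirname() of the filesystem root is the root itself, so stop there.
    if (identical(parent, dir)) stop("no project root found above ", start)
    dir <- parent
  }
}

# Build a path relative to the project root rather than the working directory.
project_path <- function(..., start = getwd()) {
  file.path(find_project_root(start), ...)
}
```

With the here package installed, `here::here("data", "raw.csv")` plays the role that `project_path("data", "raw.csv")` plays in this sketch, and it is the better choice in real projects.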

I love these packages so much I wrote an ode to here.

I use these packages now instead of what I describe below. I'll leave this gist up for historical interest. 😆

@hrbrmstr
hrbrmstr / themes.R
Last active April 16, 2018 19:48
various themes
theme_map <- function(base_size=9, base_family="") {
  require(grid)
  theme_bw(base_size=base_size, base_family=base_family) %+replace%
    theme(
      axis.line=element_blank(),
      axis.text=element_blank(),
      axis.ticks=element_blank(),
      axis.title=element_blank(),
      panel.background=element_blank(),
      panel.border=element_blank(),
@Kartones
Kartones / postgres-cheatsheet.md
Last active April 12, 2024 17:44
PostgreSQL command line cheatsheet

PSQL

Magic words:

psql -U postgres

Some interesting flags (to see all, use -h or --help depending on your psql version):

  • -E: will describe the underlying queries of the \ commands (cool for learning!)
  • -l: psql will list all databases and then exit (useful if the user you connect with doesn't have a default database, as on AWS RDS)
@leeper
leeper / gender.R
Last active October 11, 2017 18:37
Gender API Example https://gender-api.com/en/
#' Gets gender by name or email address, optionally by country or IP address.
library("httr")
library("XML")
#' @import httr
#' @import rjson
#' @param name A character string containing a first name, or a character vector containing first names. One must specify name or email.
#' @param email A character string containing an email address with a first name. One must specify name or email.
#' @param country An optional character string containing a two-letter country code, as listed here: https://gender-api.com/en/api-docs
@jbryer
jbryer / parse.codebook.r
Last active March 9, 2023 15:19
Parses a codebook file where lines starting at column zero (far left) represent variable information (e.g. name, description, type) and indented lines (i.e. lines beginning with white space, either tabs or spaces, etc.) represent factor levels and labels.
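The layout described above can be sketched in a few lines of base R. This is not the gist's implementation: `parse_codebook_sketch` is a hypothetical name, and the `;` field separator is an assumption made purely for illustration, since the preview does not show the actual delimiter.

```r
# Sketch under stated assumptions: fields on each line are separated by ";"
# (hypothetical; the real codebook format may differ).
parse_codebook_sketch <- function(lines, sep = ";") {
  lines <- lines[nzchar(trimws(lines))]      # drop blank lines
  is_level <- grepl("^[ \t]", lines)         # indented line => factor level
  if (length(lines) > 0 && is_level[1]) {
    stop("codebook must start with a variable line, not an indented level")
  }
  var_id <- cumsum(!is_level)                # group each line under its variable
  # Split one line into fields, stripping surrounding white space.
  split_fields <- function(x) trimws(strsplit(x, sep, fixed = TRUE)[[1]])
  vars <- lapply(split(seq_along(lines), var_id), function(idx) {
    head_fields <- split_fields(lines[idx[1]])
    level_rows <- lapply(lines[idx[-1]], split_fields)
    list(
      name = head_fields[1],
      description = head_fields[2],
      type = head_fields[3],
      levels = vapply(level_rows, function(f) f[1], character(1)),
      labels = vapply(level_rows, function(f) f[2], character(1))
    )
  })
  names(vars) <- vapply(vars, function(v) v$name, character(1))
  vars
}

# Usage with a small in-memory codebook (made-up example data).
codebook <- c(
  "gender; Respondent gender; factor",
  "  1; Male",
  "\t2; Female",
  "age; Age in years; integer"
)
parsed <- parse_codebook_sketch(codebook)
```

Grouping with `cumsum(!is_level)` keeps the parser to a single pass: every non-indented line starts a new group, and the indented lines that follow inherit that group's index.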
#' Parse a codebook file with variable and level information.
#'
#' Parses a codebook file where lines starting at column zero (far left) represent
#' variable information (e.g. name, description, type) and indented lines
#' (i.e. lines beginning with white space, either tabs or spaces, etc.) represent factor
#' levels and labels.
#'
#' Note that white space at the beginning and end of each line is stripped before
#' processing that line.
#'
@brendano
brendano / gist:39760
Created December 24, 2008 20:11
load the MNIST data set in R
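For orientation, here is a minimal sketch of reading one MNIST image file in base R. It is not the gist's code (which is truncated in this preview); `read_idx_images` is a hypothetical name. It assumes the IDX layout used by the MNIST image files: four big-endian 32-bit integers (magic number 2051, image count, rows, columns) followed by one unsigned byte per pixel.

```r
# Sketch of reading an IDX image file, assuming the layout described above.
# `read_idx_images` is a hypothetical helper name.
read_idx_images <- function(filename) {
  con <- file(filename, "rb")
  on.exit(close(con))
  # Header: magic number, number of images, rows, columns (big-endian int32).
  header <- readBin(con, "integer", n = 4, size = 4, endian = "big")
  if (header[1] != 2051) stop("not an IDX image file: magic = ", header[1])
  n <- header[2]
  pixels <- header[3] * header[4]
  # Pixel data: one unsigned byte per pixel, stored image by image.
  x <- readBin(con, "integer", n = n * pixels, size = 1, signed = FALSE)
  # One row per image, matching the "each row is one digit" layout.
  list(n = n, x = matrix(x, nrow = n, ncol = pixels, byrow = TRUE))
}
```

Filling the matrix with `byrow = TRUE` is what makes each row a single digit, so a row can be reshaped to 28 x 28 for display.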
# Load the MNIST digit recognition dataset into R
# http://yann.lecun.com/exdb/mnist/
# assume you have all 4 files and gunzip'd them
# creates train$n, train$x, train$y and test$n, test$x, test$y
# e.g. train$x is a 60000 x 784 matrix, each row is one digit (28x28)
# call: show_digit(train$x[5,]) to see a digit.
# brendan o'connor - gist.github.com/39760 - anyall.org
load_mnist <- function() {
  load_image_file <- function(filename) {