Sam Boysel sboysel
@Kartones
Kartones / postgres-cheatsheet.md
Last active Dec 3, 2021
PostgreSQL command line cheatsheet

PSQL

Magic words:

psql -U postgres

Some interesting flags (to see all, use --help or -?; note that -h sets the host to connect to):

  • -E: echoes the underlying queries that the \ commands run (cool for learning!)
  • -l: psql will list all databases and then exit (useful if the user you connect with doesn't have a default database, as on AWS RDS)
@DaniSancas
DaniSancas / neo4j_cypher_cheatsheet.md
Created Jun 14, 2016
Neo4j's Cypher queries cheatsheet

Neo4j Tutorial

Fundamentals

Store any kind of data using the following graph concepts:

  • Node: Graph data records
  • Relationship: Connect nodes (has direction and a type)
  • Property: Stores data as key-value pairs on nodes and relationships
  • Label: Groups nodes into sets (optional); relationships are grouped by their type instead
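
The four concepts above can be sketched as plain Python data structures. This is only an illustrative model, not Neo4j's driver API: the `Node` and `Relationship` classes, the `describe` helper, and the sample names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class Node:
    # Labels group nodes; properties hold key-value data.
    labels: Set[str] = field(default_factory=set)
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    # A relationship has a direction (start -> end) and a type,
    # and can carry its own key-value properties.
    start: Node
    end: Node
    type: str
    properties: Dict[str, Any] = field(default_factory=dict)


def describe(rel: Relationship) -> str:
    # Render the relationship in a Cypher-like arrow notation.
    return f"({rel.start.properties['name']})-[:{rel.type}]->({rel.end.properties['name']})"


# Hypothetical sample data.
alice = Node(labels={"Person"}, properties={"name": "Alice"})
bob = Node(labels={"Person"}, properties={"name": "Bob"})
knows = Relationship(start=alice, end=bob, type="KNOWS", properties={"since": 2016})

print(describe(knows))  # (Alice)-[:KNOWS]->(Bob)
```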
@gbaman
gbaman / graphql_example.py
Created Nov 1, 2017
An example on using the Github GraphQL API with Python 3
# An example to get the remaining rate limit using the Github GraphQL API.
import requests
headers = {"Authorization": "Bearer YOUR API KEY"}
def run_query(query):  # A simple function to use requests.post to make the API call. Note the json= section.
    request = requests.post('https://api.github.com/graphql', json={'query': query}, headers=headers)
    if request.status_code == 200:
@brendano
brendano / gist:39760
Created Dec 24, 2008
load the MNIST data set in R
# Load the MNIST digit recognition dataset into R
# http://yann.lecun.com/exdb/mnist/
# assume you have all 4 files and gunzip'd them
# creates train$n, train$x, train$y and test$n, test$x, test$y
# e.g. train$x is a 60000 x 784 matrix, each row is one digit (28x28)
# call: show_digit(train$x[5,]) to see a digit.
# brendan o'connor - gist.github.com/39760 - anyall.org
load_mnist <- function() {
  load_image_file <- function(filename) {
@jexp
jexp / bulk-neo4j-import-original.sh
Last active May 10, 2021
Panama Papers Import Scripts for Neo4j
export NEO4J_HOME=${NEO4J_HOME-~/Downloads/neo4j-community-3.0.1}
if [ ! -f data-csv.zip ]; then
  curl -OL https://cloudfront-files-1.publicintegrity.org/offshoreleaks/data-csv.zip
fi
export DATA="${PWD}/import"
rm -rf "$DATA"
@jennybc
jennybc / 2014-10-12_stop-working-directory-insanity.md
Last active May 8, 2021
Stop the working directory insanity

There are packages for this now!

2017-08-03: Since I wrote this in 2014, the universe, specifically Kirill Müller (https://github.com/krlmlr), has provided better solutions to this problem. I now recommend that you use one of these two packages:

  • rprojroot: This is the main package with functions to help you express paths in a way that will "just work" when developing interactively in an RStudio Project and when you render your file.
  • here: A lightweight wrapper around rprojroot that anticipates the most likely scenario: you want to write paths relative to the top-level directory, defined as an RStudio project or Git repo. TRY THIS FIRST.

I love these packages so much I wrote an ode to here.

I use these packages now instead of what I describe below. I'll leave this gist up for historical interest. 😆
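
The idea behind both packages, resolving paths against a project root instead of the current working directory, can be sketched in Python. This is a rough analogue under stated assumptions, not the API of rprojroot or here: `find_root` is a hypothetical helper, and it only treats a `.git` entry or an `*.Rproj` file as a root marker.

```python
import tempfile
from pathlib import Path


def find_root(start: Path) -> Path:
    """Walk upward from `start` until a directory holding a root marker is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        # Assumed markers: a Git repo (.git) or an RStudio project file (*.Rproj).
        if (candidate / ".git").exists() or any(candidate.glob("*.Rproj")):
            return candidate
    raise FileNotFoundError(f"no project root found above {start}")


# Demo in a throwaway directory: the root is found from a nested folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / ".git").mkdir()
    nested = root / "analysis" / "scripts"
    nested.mkdir(parents=True)

    found = find_root(nested)
    # Build paths relative to the root, whatever the working directory is.
    data_path = found / "data" / "raw.csv"
    print(found == root)
```

Because the path is built from the detected root, the same script works whether it is run from the project top level or from a subdirectory.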

@dannguyen
dannguyen / csvkit-sql-cli-readme.md
Last active May 8, 2020
Using bash, csvkit, and SQLite to analyze San Francisco restaurant health inspection data

How to download, import, and analyze San Francisco restaurant inspection data using SQLite3 and csvkit from the command-line.

A quick example of doing data wrangling from the command-line, as well as getting to know one of San Francisco's data sets: the San Francisco restaurant inspections, courtesy of the SF Department of Public Health. I don't normally do database work from the command-line, but importing bulk data into SQLite is pretty frustrating using the available GUIs or just the shell.

So thank goodness for Christopher Groskopf's csvkit, a suite of Unix-like tools that use Python to robustly handle CSV files. There are a lot of great tools in csvkit, but for this gist, I just use csvsql, which can parse a CSV and turn it into properly-flavored SQL to pass directly into your database app of choice.
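
To make the CSV-to-SQL step concrete, here is a minimal sketch of the same idea using only Python's standard library (`csv` and `sqlite3`). It is not csvkit and skips the column type inference that csvsql performs; the `load_csv` helper, the table name, and the sample rows are made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a downloaded CSV file.
SAMPLE_CSV = """name,score
Cafe A,90
Cafe B,80
Cafe C,100
"""


def load_csv(conn: sqlite3.Connection, table: str, text: str) -> int:
    """Create `table` from the CSV header and insert every data row."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    # Columns are left untyped, so every value is stored as text.
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)
    return len(data)


conn = sqlite3.connect(":memory:")
count = load_csv(conn, "inspections", SAMPLE_CSV)

# Cast explicitly because the values were stored as text.
avg = conn.execute(
    "SELECT AVG(CAST(score AS INTEGER)) FROM inspections"
).fetchone()[0]
print(count, avg)  # 3 90.0
```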

@cheerfulstoic
cheerfulstoic / gist:7e8ec61f9104017430af
Last active Oct 20, 2019
Examining what is possible for StackOverflow with a graph database

Analyzing StackOverflow with Neo4j and Clojure

Joining multiple disparate data sources, commonly dubbed Master Data Management (MDM), is usually not a fun exercise. I would like to show you how to use a graph database (Neo4j) and an interesting dataset (developer-oriented collaboration sites) to put the fun back into MDM. This approach will allow you to quickly and sensibly merge data from different sources into a consistent picture and query across the data efficiently to answer your most pressing questions.

You can read the associated blog posts on my blog. The blog posts cover the hows and whys of the project, while this and other GraphGists will examine how to answer specific questions of the data.

@etcwilde
etcwilde / ght-restore-psql
Last active Sep 3, 2019
Restore the GHTorrent database to postgres instead of mysql (based on mysql-2017-01-19 image)
#!/usr/bin/env bash
# Evan Wilde <etcwilde@uvic.ca>
# July 20, 2017
# defaults
user="postgres"
passwd=""
host="localhost"
db="ghtorrent"
tmpdir='/tmp'