Brian Hendriks bheni

Create and initialize a new Dolt repository for the demo, then start the SQL shell, where all of the remaining steps are run.

~/datasets>mkdir demo
~/datasets>cd demo
~/datasets/demo>dolt init
Successfully initialized dolt data repository.
~/datasets/demo>dolt sql
# Welcome to the DoltSQL shell.
# Statements must be terminated with ';'.
# Init a new repo
dolt init
# Import 2011 data
dolt checkout -b 2011
dolt schema import column_mappings 2011_mapping.csv
dolt table import -r column_mappings 2011_mapping.csv
dolt add column_mappings
dolt commit -m "import 2011 column mappings"
# csv_to_map.py converts CSVs containing states and percentages to a format that can be imported
# into https://mapchart.net/usa.html
import sys
import json
import os.path
# constants
BRANCHES = [str(branch) for branch in range(2011, 2018)]
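The conversion described in the script header can be sketched roughly as follows. This is a minimal illustration, not the gist's actual code: the `state,percentage` column names, the colour buckets, and the `groups`/`paths` output shape are all assumptions made for the example, and the real file format expected by mapchart.net should be checked before relying on it.

```python
import csv
import io
import json

# Hypothetical colour buckets: (upper bound in percent, hex colour).
BUCKETS = [(5.0, "#fee5d9"), (10.0, "#fcae91"), (100.0, "#cb181d")]


def bucket_color(percentage):
    """Return the colour of the first bucket whose upper bound covers the value."""
    for upper_bound, color in BUCKETS:
        if percentage <= upper_bound:
            return color
    raise ValueError(f"percentage out of range: {percentage}")


def csv_to_groups(csv_text):
    """Group state names by colour bucket from 'state,percentage' rows.

    The returned dictionary shape is an assumed, simplified stand-in for a
    map configuration, not a verified mapchart.net format.
    """
    groups = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        color = bucket_color(float(row["percentage"]))
        groups.setdefault(color, {"label": "", "paths": []})["paths"].append(row["state"])
    return {"groups": groups}


# Made-up sample values, used only to show the data flow.
sample = "state,percentage\nTexas,4.2\nOhio,7.5\nUtah,4.9\n"
print(json.dumps(csv_to_groups(sample), indent=2))
```

Bucketing by value keeps the number of colours small, which tends to make a choropleth map easier to read than one colour per state.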
# calc_half_agi_percentages.py is a simple script that uses doltpy to retrieve data from the Dolt irs-soi dataset.
# For each state, it calculates the approximate percentage of tax returns that account for half of the AGI
# for the tax years 2011 through 2017, and outputs one CSV per year.
#
# setup:
# Install Dolt. Dolt documentation and installation instructions are at https://github.com/liquidata-inc/dolt
# Clone the irs-soi dataset from dolt by running "dolt clone Liquidata/irs-soi" from the directory where you want the
# data to be cloned to.
# Install doltpy. doltpy documentation and installation instructions are at https://github.com/liquidata-inc/doltpy
#
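One way to approximate "the percentage of returns that make up half of the AGI" is sketched below. It is an assumption about the method, not the gist's code: it takes income brackets as `(number_of_returns, total_agi)` pairs ordered from the highest-income bracket down, assumes non-negative AGI in every bracket, and interpolates linearly inside the bracket where the halfway point falls. The function name and sample figures are hypothetical.

```python
def percent_of_returns_for_half_agi(brackets):
    """Approximate the percentage of returns that account for half of total AGI.

    brackets: list of (number_of_returns, total_agi) tuples, ordered from the
    highest-income bracket to the lowest. AGI values are assumed non-negative.
    """
    total_returns = sum(n for n, _ in brackets)
    total_agi = sum(agi for _, agi in brackets)
    if total_returns <= 0 or total_agi <= 0:
        raise ValueError("brackets must contain positive returns and AGI")

    target = total_agi / 2.0
    returns_so_far = 0.0
    agi_so_far = 0.0
    for n, agi in brackets:
        if agi_so_far + agi >= target:
            # Take only the share of this bracket needed to reach the target.
            fraction = (target - agi_so_far) / agi
            returns_so_far += n * fraction
            break
        returns_so_far += n
        agi_so_far += agi
    return 100.0 * returns_so_far / total_returns


# Made-up figures: 10 returns hold 400 of AGI, 30 hold 400, 60 hold 200.
print(percent_of_returns_for_half_agi([(10, 400), (30, 400), (60, 200)]))
```

The linear interpolation is why the result is only approximate: it treats every return within a bracket as contributing the same AGI.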
bheni / README.md
Created August 29, 2019 17:50
Dolt Releases README.md

Installation

  • Download the archive that is appropriate for your operating system and computer architecture
    • 64-bit mac -> dolt-darwin-amd64.tar.gz
    • 32-bit mac -> dolt-darwin-386.tar.gz
    • 64-bit linux -> dolt-linux-amd64.tar.gz
    • 32-bit linux -> dolt-linux-386.tar.gz
    • 64-bit windows -> dolt-windows-amd64.tar.gz
    • 32-bit windows -> dolt-windows-386.tar.gz
  • Extract the archive and copy the files in bin/ into a directory that is on your PATH.
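The archive choice in the list above can be expressed as a small lookup. This is only a sketch: the `archive_name` helper and its lookup tables are hypothetical, and the file names are copied from the list rather than checked against an actual release page.

```python
# Map the values returned by platform.system() / platform.machine()
# onto the name fragments used in the archive list above.
OS_NAMES = {"Darwin": "darwin", "Linux": "linux", "Windows": "windows"}
ARCH_NAMES = {"x86_64": "amd64", "AMD64": "amd64", "i386": "386", "i686": "386", "x86": "386"}


def archive_name(system, machine):
    """Return the archive file name for an OS and CPU architecture."""
    try:
        return f"dolt-{OS_NAMES[system]}-{ARCH_NAMES[machine]}.tar.gz"
    except KeyError:
        raise ValueError(f"no archive listed for {system}/{machine}") from None


print(archive_name("Linux", "x86_64"))
```

To pick the archive for the current machine, pass `platform.system()` and `platform.machine()` from the standard library as the two arguments.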
import mysql.connector
import pandas
import os
import time
from subprocess import *
class DoltException(Exception):
    def __init__(self, exec_args, stdout, stderr, exitcode):
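The preview above is cut off, but an exception carrying `exec_args`, `stdout`, `stderr`, and `exitcode` suggests a wrapper that shells out to a command and raises on failure. The sketch below shows that general pattern under that assumption; `CommandError` and `run_command` are hypothetical names, not the gist's code, and the demo runs the Python interpreter so that it works without dolt installed.

```python
import subprocess
import sys


class CommandError(Exception):
    """Hypothetical stand-in for the DoltException shown in the preview."""

    def __init__(self, exec_args, stdout, stderr, exitcode):
        super().__init__(f"{exec_args!r} exited with code {exitcode}")
        self.exec_args = exec_args
        self.stdout = stdout
        self.stderr = stderr
        self.exitcode = exitcode


def run_command(exec_args, cwd=None):
    """Run a command and return its stdout, or raise CommandError on a nonzero exit."""
    result = subprocess.run(exec_args, cwd=cwd, capture_output=True, text=True)
    if result.returncode != 0:
        raise CommandError(exec_args, result.stdout, result.stderr, result.returncode)
    return result.stdout


# A successful command returns its captured output.
print(run_command([sys.executable, "-c", "print('ok')"]).strip())

# A failing command raises, with the exit code attached to the exception.
try:
    run_command([sys.executable, "-c", "import sys; sys.exit(3)"])
except CommandError as err:
    print(err.exitcode)
```

Keeping the arguments and captured output on the exception lets the caller log or inspect exactly what failed, rather than only seeing an exit code.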
bheni / python-dolt-test.py
Created August 9, 2019 19:21
This is a sample script showing how you can access dolt data via python
# This is a sample python script which reads from dolt and prints the output.
#
# requirements: Install the python mysql connector
# python -m pip install mysql-connector-python
#
# usage: python python-dolt-test.py <dolt directory> <query>
#
# <dolt directory> is the directory of an existing dolt repository where dolt sql-server
# will be started. dolt sql-server will use the current branch.
#
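The flow described in the header comments (start `dolt sql-server` in a repository directory, then query it through the MySQL connector) might look roughly like the sketch below. It is not the gist's code: `parse_args` and `run_query` are hypothetical helpers, and the `database`, `port`, `user`, and `startup_wait` values are assumptions that depend on how the server is configured.

```python
import subprocess
import sys
import time


def parse_args(argv):
    """Return (dolt_directory, query) from argv, following the usage line above."""
    if len(argv) != 3:
        raise SystemExit("usage: python python-dolt-test.py <dolt directory> <query>")
    return argv[1], argv[2]


def run_query(dolt_dir, query, database, port=3306, user="root", startup_wait=2.0):
    """Start dolt sql-server in dolt_dir, run one query, and return the rows.

    database, port, user, and startup_wait are assumptions; adjust them to
    match your server configuration.
    """
    import mysql.connector  # imported lazily so parse_args works without it

    server = subprocess.Popen(["dolt", "sql-server"], cwd=dolt_dir)
    try:
        time.sleep(startup_wait)  # crude wait for the server to accept connections
        connection = mysql.connector.connect(
            host="127.0.0.1", port=port, user=user, database=database
        )
        try:
            cursor = connection.cursor()
            cursor.execute(query)
            return cursor.fetchall()
        finally:
            connection.close()
    finally:
        # Always stop the server, even if connecting or querying failed.
        server.terminate()
        server.wait()
```

A caller would combine the two, for example `dolt_dir, query = parse_args(sys.argv)` followed by printing each row returned by `run_query(dolt_dir, query, database=...)`.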