@gpfreitas
gpfreitas / pig_nan_null_inf.sh
Last active August 29, 2015 14:10
Test operations with NULL, NaN and Infinity in Pig
#!/usr/bin/env sh
USAGE='NAME
pig_nan_null_inf.sh -- Test operations with NULL, NaN and Infinity
SYNOPSIS
./pig_nan_null_inf.sh
DESCRIPTION
gpfreitas / hist.awk
Last active November 24, 2015 17:18
Histogram for integer x and y values
# hist.awk - Histogram for integer x and y values
#
# This AWK program takes as input a sequence of x, y integer values, one per
# row, where x is supposed to be the bin, and y is the count of values in that
# bin. In other words, this sequence already encodes the histogram (think of
# the output of uniq -c), so this script only pretty prints that histogram to
# the screen. Furthermore, we assume that the input rows are sorted by the bin
# values (the first column) and that the counts in the second column are always
# nonnegative.
#
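The pretty-printing step described in the comment above can be sketched in Python. This is only an illustration under assumptions, not a port of hist.awk: the names `parse_rows` and `render_histogram`, the bar character, and the scaling to a fixed width are all hypothetical choices. It assumes the same input the comment describes: sorted rows of whitespace-separated integer pairs with nonnegative counts.

```python
def parse_rows(text):
    """Yield (bin, count) integer pairs from lines of the form 'x y'."""
    for line in text.splitlines():
        if line.strip():  # skip blank lines
            x, y = line.split()
            yield int(x), int(y)


def render_histogram(rows, width=40, mark="#"):
    """Return one text line per bin, scaling bars so the largest count spans `width`."""
    rows = list(rows)
    if not rows:
        return []
    max_count = max(count for _, count in rows)
    label_width = max(len(str(b)) for b, _ in rows)
    lines = []
    for b, count in rows:
        # Guard against an all-zero histogram to avoid dividing by zero.
        bar_len = 0 if max_count == 0 else round(count * width / max_count)
        lines.append(f"{b:>{label_width}} | {mark * bar_len} {count}")
    return lines


sample = "1 2\n2 4\n3 1\n"
for line in render_histogram(parse_rows(sample), width=8):
    print(line)
# 1 | #### 2
# 2 | ######## 4
# 3 | ## 1
```

Because the input already encodes the histogram, the sketch never counts anything; it only scales and prints, which mirrors the "only pretty prints" behavior the comment describes.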
gpfreitas / unix_ref.rst
Created November 28, 2015 00:03
Resources for someone who wants to learn to use UNIX-like systems from the command-line, with some focus on data analysis

Basic Tools for Data Analysis in a UNIX Environment

Author

Guilherme Freitas <guilherme@gpfreitas.net>

Contents

Overview and Definitions

"""
This script takes as input a list of Python source files and outputs the
top-level modules that are imported in those source files.
The script does this without executing any code. This is useful when you have
exercise code (that often has syntax errors / missing code) or if you want to
avoid any harmful side-effects of executing untrusted code.
"""
import argparse
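
The preview above stops before showing how the script finds imports. One way to do it without executing code, and without requiring the file to parse cleanly, is a line-based regular expression scan. The sketch below is an assumption about a possible approach, not the script's actual implementation; `IMPORT_RE` and `top_level_modules` are hypothetical names.

```python
import re

# Matches "import a.b, c as d" or "from a.b import c" at the start of a line.
# Relative imports ("from . import x") are deliberately not matched.
IMPORT_RE = re.compile(
    r"^\s*(?:import\s+(?P<names>.+)"
    r"|from\s+(?P<module>[A-Za-z_][\w.]*)\s+import\b)"
)


def top_level_modules(source):
    """Return the set of top-level module names imported in `source`."""
    modules = set()
    for line in source.splitlines():
        match = IMPORT_RE.match(line)
        if not match:
            continue
        if match.group("module"):
            names = [match.group("module")]
        else:
            # Drop a trailing comment, then split "import a, b as c".
            names = match.group("names").split("#")[0].split(",")
        for name in names:
            name = name.strip().split(" as ")[0].strip()
            top = name.split(".")[0]  # "os.path" -> "os"
            if top.isidentifier():
                modules.add(top)
    return modules


sample = (
    "import numpy as np\n"
    "from os.path import join\n"
    "import sys, json\n"
    "from . import sibling\n"
    "def broken(:\n"  # a syntax error does not stop the scan
)
print(sorted(top_level_modules(sample)))
# ['json', 'numpy', 'os', 'sys']
```

A scan like this tolerates syntax errors because each line is examined on its own, at the cost of false positives for import-like text inside strings or docstrings. For files known to parse, the standard library `ast` module would be the stricter choice.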
gpfreitas / pytest.md
Created June 3, 2018 17:42 — forked from kwmiebach/pytest.md
pytest cheat sheet

Usage

(Remember to create a symlink pytest for py.test)

pytest [options] [file_or_dir] [file_or_dir] ...

Help:

import numpy as np
import pandas as pd
import itertools as it
import matplotlib.pyplot as plt
import logging
log_fmt = '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
logging.basicConfig(level=logging.INFO, format=log_fmt)
logging.info('BEGIN')
gpfreitas / Makefile
Last active February 7, 2023 21:57
Makefile to set up for dask tutorial given at realtor.com
# usage: place this file in the folder where you want to do your dask tutorial work and run `make`
#
# if you want to use a root dir other than ROOT_DIR, then copy this file there and run `make all`
ROOT_DIR = $(HOME)/reading_group_dask_tutorial
all: dask-tutorial conda_env
	@echo ""
	@echo "To run the dask tutorial now run"
# Am I in the right branch? Did I push my latest commit?
git log

# Who has worked on this project/folder?
git log -- .  # Then search "author" with the pager

# What have I worked on recently? Or: what branches have I authored?
git log --branches --no-walk --author=Guilherme
gpfreitas / sqltree.py
Created June 23, 2023 20:58 — forked from gocha/sqltree.py
Print parsed SQL statement as a tree (sqlparse with Python 3)
"""Print parsed SQL statement as a tree.
Uses Python 3 and sqlparse package.
"""
from typing import Iterator, Tuple
import argparse
import locale
import sys
import os