Tom McTavish tommct

@tommct
tommct / README.md
Last active January 9, 2022 09:02
Instructions for downloading Jupyter Notebooks from Coursera

From an open Jupyter Notebook homework assignment, select "Coursera" to take you to the home page. Make a new notebook, paste the following into a cell, and execute it:

%%bash
tar cvfz hw.tar.gz .

This may take a little while to run, depending on the packages. Select "Coursera" again to return to the home directory. Check the box next to hw.tar.gz and then click Download. After the download finishes, delete the archive from the notebook directory.
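
If the `%%bash` cell magic is unavailable, roughly the same archive can be built from a regular Python cell. This is a minimal sketch using the standard-library `tarfile` module; the function name `archive_directory` is an assumption for illustration. Unlike `tar cvfz hw.tar.gz .`, it explicitly skips the archive file itself, which is created inside the directory being archived.

```python
import os
import tarfile


def archive_directory(src_dir: str, archive_name: str = "hw.tar.gz") -> str:
    """Write a gzipped tarball of src_dir into src_dir and return its path."""
    archive_path = os.path.join(src_dir, archive_name)
    with tarfile.open(archive_path, "w:gz") as tar:
        for entry in sorted(os.listdir(src_dir)):
            # The archive already exists at this point, so do not add it to itself.
            if entry == archive_name:
                continue
            # Directories are added recursively by default.
            tar.add(os.path.join(src_dir, entry), arcname=entry)
    return archive_path
```

Calling `archive_directory(".")` from a notebook cell in the home directory should leave an `hw.tar.gz` that can be checked and downloaded as described above.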

tommct / columnviamerge.py
Created July 20, 2017 18:23
Add columns to Pandas DataFrame by (left) merging with another.
def columns_via_merge(df: pd.DataFrame, df2: pd.DataFrame, oncols: list, assigning: list):
    """
    Add (or replace) columns to df that map via a merge with df2.
    Examples:
        # Add the ord value to a subset of a DataFrame
        ABC = [chr(x) for x in range(ord('A'), ord('Z') + 1)]
        AABBCC = [chr(x)+chr(x) for x in range(ord('A'), ord('Z') + 1)]
        abc = [chr(x) for x in range(ord('a'), ord('z') + 1)]
tommct / README.md
Last active March 15, 2021 17:39
Tableau Box Plots and Histograms

This is a recipe for making box plots overlaying histograms in Tableau version 9.3. It largely borrows from http://vizpainter.com/some-tableau-tips-options-for-box-and-whisker/ and http://vizdiff.blogspot.com/2015/11/overlaying-histogram-with-box-and.html.

  1. Create a fixed continuous variable for the number of objects per dimension. For example, the number of unique assignments per user:

     [Assignments Per User] = {FIXED [Userid] : COUNTD([Assignmentid])}
    
  2. Set the variable's Default Aggregation to COUNT.

  3. Drag the variable from Measures to the columns shelf.

  4. Set it to "Dimension" instead of CNT().
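
To make the calculation concrete, here is a small Python sketch of what the steps above compute, under the assumption that the data is a list of `(Userid, Assignmentid)` rows. The function names and sample rows are hypothetical. The first function mirrors the `{FIXED [Userid] : COUNTD([Assignmentid])}` expression; the second counts how many users share each value, which is what the histogram bars show once the variable is treated as a dimension and counted.

```python
from collections import Counter, defaultdict


def assignments_per_user(rows):
    """Distinct assignment count per user, like FIXED [Userid] : COUNTD(...)."""
    seen = defaultdict(set)
    for userid, assignmentid in rows:
        seen[userid].add(assignmentid)
    return {userid: len(ids) for userid, ids in seen.items()}


def histogram(per_user):
    """Number of users at each distinct-assignment count, sorted by count."""
    return dict(sorted(Counter(per_user.values()).items()))


# Hypothetical sample rows; duplicates are counted once per user.
rows = [
    ("u1", "a1"), ("u1", "a1"), ("u1", "a2"),
    ("u2", "a1"),
    ("u3", "a1"), ("u3", "a3"),
]
per_user = assignments_per_user(rows)  # {'u1': 2, 'u2': 1, 'u3': 2}
bars = histogram(per_user)             # {1: 1, 2: 2}
```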

tommct / README.md
Last active September 11, 2016 17:55
Agglomerative Filtering Recipe for Python Sklearn using similarity matrix

This is a recipe for using Sklearn to build a cosine similarity matrix and then to build dendrograms from it.

import numpy as np
import matplotlib.pyplot as plt
import scipy.cluster.hierarchy
import scipy.spatial.distance
from scipy.spatial.distance import pdist
from sklearn.metrics.pairwise import cosine_similarity

Make a "feature matrix" of 15 items that will be the binary representation of each index.
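
One way such a matrix could be built is sketched below. It assumes the items are indexed 1 through 15 (so every row has at least one set bit) and that four bits per item are enough; neither detail is stated in the recipe. The similarity is computed with plain NumPy as a stand-in for `cosine_similarity`, which it should match for non-zero rows.

```python
import numpy as np

# Assumption: indices run 1..15, encoded in 4 bits, most significant bit first.
n_items = 15
n_bits = 4
indices = np.arange(1, n_items + 1)
shifts = np.arange(n_bits - 1, -1, -1)

# Row i holds the binary digits of index i, e.g. 5 -> [0, 1, 0, 1].
features = (indices[:, None] >> shifts) & 1

# Cosine similarity: dot products of the rows after scaling each to unit length.
norms = np.linalg.norm(features, axis=1, keepdims=True)
unit_rows = features / norms
similarity = unit_rows @ unit_rows.T
```

Rows that share no set bits, such as 1 (`0001`) and 2 (`0010`), get a similarity of 0, while each row has a similarity of 1 with itself.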

tommct / README.md
Created November 6, 2015 00:17
Change Modification Date via Python

This is Python code for updating the modification date of a file on Mac OS X or Linux. In this example, I had copied .dv files from my camcorder. The filenames encoded the recording date, but each file's modification date was the time I transferred it from the camcorder.

import os
import time
fpath = '/path/to/dv/files'
for root, dirs, files in os.walk(fpath):
    for name in files:
        if name.endswith('.dv'):
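
A minimal sketch of how the rest of such a loop might look, using `os.utime` to set the modification time. The filename pattern (`clip-2015-11-05_18-30-00.dv`) and the helper names are assumptions for illustration only; the actual camcorder naming scheme is not shown here, so the regular expression and format string would need adjusting.

```python
import os
import re
import time

# Hypothetical timestamp pattern embedded in the filename.
STAMP = re.compile(r"(\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})")
STAMP_FORMAT = "%Y-%m-%d_%H-%M-%S"


def set_mtime_from_name(path: str) -> bool:
    """Set the file's modification time from the date in its name."""
    match = STAMP.search(os.path.basename(path))
    if match is None:
        return False
    # Interpret the timestamp as local time.
    mtime = time.mktime(time.strptime(match.group(1), STAMP_FORMAT))
    # Leave the access time unchanged.
    atime = os.stat(path).st_atime
    os.utime(path, (atime, mtime))
    return True


def fix_tree(fpath: str) -> int:
    """Walk fpath and update every .dv file; return how many were changed."""
    changed = 0
    for root, dirs, files in os.walk(fpath):
        for name in files:
            if name.endswith(".dv") and set_mtime_from_name(os.path.join(root, name)):
                changed += 1
    return changed
```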

tommct / README.md
Last active August 29, 2015 14:11
D3 Stacked Brush Plots

Implements multiple, stacked plots with brushing. This extends the example at http://bl.ocks.org/mbostock/1667367 and allows for multiple panels where each subsequent panel zooms from the previous. Data points are also smoothed, permitting data with over 100,000 points to have an overview with subsequent telescoping while maintaining context.

tommct / README.md
Last active January 1, 2016 19:38
D3 Hierarchical Ordinal Ticks

This D3 example demonstrates constrained zooming, much like http://bl.ocks.org/tommct/5671250, but also illustrates the use of hierarchical ordinal tick marks. It does this by using the normalized values that one gets when using a hierarchical partition layout.

tommct / README.md
Last active January 1, 2016 08:09
D3 Canvas ImageData w/ Constrained Zooming and Marginal Distributions
tommct / README.md
Last active December 31, 2015 21:49
D3 Zoomable ImageData
tommct / README.md
Last active June 11, 2018 07:24
D3 Constrained Zoom Canvas Image

Implements constrained zooming of an image put onto an HTML5 Canvas.
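
The arithmetic behind this kind of constraint can be sketched independently of D3. The Python function below is an illustration, not the gist's code, and it assumes that "constrained" means the scaled image must always cover the canvas, that the image fills the canvas at scale 1, and that the scale never drops below 1. The name `clamp_translate` is hypothetical.

```python
def clamp_translate(tx, ty, scale, width, height):
    """Clamp a pan offset so the scaled image still covers the canvas.

    The image is drawn at (tx, ty) with size (scale * width, scale * height).
    Assumes scale >= 1.
    """
    # The most negative offset that keeps the right/bottom edge on the canvas.
    min_tx = width * (1 - scale)
    min_ty = height * (1 - scale)
    # Offsets above 0 would expose blank space at the left/top edge.
    clamped_tx = min(0.0, max(min_tx, tx))
    clamped_ty = min(0.0, max(min_ty, ty))
    return clamped_tx, clamped_ty
```

In a zoom handler, the requested translation would be passed through a function like this before the image is redrawn, so panning stops at the image edges.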