Tom McTavish tommct

@tommct
tommct / README.md
Created Aug 28, 2018
MongoDB from Tableau
View README.md

To use MongoDB from Tableau, start a mongosqld instance:

mongosqld --mongo-uri "mongodb://<host>:<port>/?connect=direct"

Then, in Tableau, select Servers->MongoDB BI Connector and enter 127.0.0.1 as the server and 3307 (the default mongosqld port) as the port.

@tommct
tommct / README.md
Last active Jan 10, 2018
Matplotlib normalized histograms
View README.md

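This creates a normalized histogram in matplotlib whose bar heights sum to 1 (a probability mass per bin).

import numpy as np
import matplotlib.pyplot as plt

# df is assumed to be an existing pandas DataFrame.
bins = np.linspace(-1, 1, 101)
# density=True scales the bars so their areas sum to 1; multiplying by the
# bin widths turns them into probability masses whose heights sum to 1.
hist, bins = np.histogram(df['some_column'], bins=bins, density=True)
hist *= np.diff(bins)
width = bins[1] - bins[0]
fig = plt.figure(figsize=(8, 4))
ax = fig.add_axes([.15, .15, .75, .75])
# The `left=` keyword was renamed to `x` in Matplotlib 2.0; align='edge' starts each bar at its bin's left edge.
ax.bar(bins[:-1], hist, width=width, align='edge')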
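
The normalization can be checked without plotting. The following is a minimal sketch on synthetic data (the `samples` array and the random seed are assumptions for illustration, not part of the original gist): with `density=True`, `np.histogram` returns densities whose bar areas sum to 1, so multiplying by the bin widths should give per-bin masses that sum to 1.

```python
import numpy as np

# Synthetic data standing in for df['some_column'] (an assumption for this sketch).
rng = np.random.default_rng(0)
samples = rng.uniform(-1, 1, size=1000)

bins = np.linspace(-1, 1, 101)
density, edges = np.histogram(samples, bins=bins, density=True)

# Area of each bar = density * bin width; these areas are the probability masses.
mass = density * np.diff(edges)
total_mass = mass.sum()
```

If `total_mass` is not close to 1, the heights passed to the bar plot are not a probability mass function over the bins.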
@tommct
tommct / README.md
Last active Jun 1, 2019
Instructions for downloading Jupyter Notebooks from Coursera
View README.md

From an open Jupyter Notebook homework assignment, select "Coursera" to go to the home page. Make a new notebook, paste the following into a cell, and execute it:

%%bash
tar cvfz hw.tar.gz .

This may take a little while to run, depending on the packages. Select "Coursera" again to return to the Home directory. Select the checkbox next to the hw.tar.gz file and then click Download. After the file is downloaded, delete it.

@tommct
tommct / columnviamerge.py
Created Jul 20, 2017
Add columns to Pandas DataFrame by (left) merging with another.
View columnviamerge.py
def columns_via_merge(df: pd.DataFrame, df2: pd.DataFrame, oncols: list, assigning: list):
    """
    Add (or replace) columns to df that map via a merge with df2.
    Examples:
    # Add the ord value to a subset of a DataFrame
    ABC = [chr(x) for x in range(ord('A'), ord('Z') + 1)]
    AABBCC = [chr(x)+chr(x) for x in range(ord('A'), ord('Z') + 1)]
    abc = [chr(x) for x in range(ord('a'), ord('z') + 1)]
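
The preview is truncated, so the original function body is not shown here. As a rough illustration of the idea in the docstring, the following is a minimal sketch, not the author's implementation. The name `columns_via_merge_sketch` and the sample frames are hypothetical; only the parameter names come from the signature above.

```python
import pandas as pd

def columns_via_merge_sketch(df, df2, oncols, assigning):
    """Hypothetical sketch: left-merge df2 onto df to add or replace `assigning` columns."""
    # Keep only the key and value columns; drop duplicate keys so the
    # left merge cannot multiply rows of df.
    lookup = df2[oncols + assigning].drop_duplicates(subset=oncols)
    # Drop columns that are being replaced to avoid _x/_y suffixes.
    base = df.drop(columns=[c for c in assigning if c in df.columns])
    merged = base.merge(lookup, on=oncols, how="left")
    # A left merge keeps the row order of the left frame, so the original
    # index can be restored directly.
    merged.index = df.index
    return merged

# Illustrative data (assumed for this sketch).
df = pd.DataFrame({"letter": ["A", "B", "C"], "n": [1, 2, 3]})
df2 = pd.DataFrame({"letter": ["A", "C"], "ord": [65, 67]})
out = columns_via_merge_sketch(df, df2, oncols=["letter"], assigning=["ord"])
```

Rows of `df` with no match in `df2` get NaN in the assigned columns, which is the usual behavior of a left merge.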
@tommct
tommct / README.md
Last active Aug 24, 2016
Tableau Box Plots and Histograms
View README.md

This is a recipe for making box plots overlaying histograms in Tableau version 9.3. It largely borrows from http://vizpainter.com/some-tableau-tips-options-for-box-and-whisker/ and http://vizdiff.blogspot.com/2015/11/overlaying-histogram-with-box-and.html.

  1. Create a fixed continuous variable for number of objects per dimension. For example, the number of unique assignments per user:

     [Assignments Per User] = {FIXED [Userid] : COUNTD([Assignmentid])}
    
  2. Set the variable's Default Aggregation to COUNT.

  3. Drag the variable from Measures to the columns shelf.

  4. Set it to "Dimension" instead of CNT().
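
For readers who want to sanity-check the numbers outside Tableau, the FIXED expression in step 1 is roughly a group-by with a distinct count. The following pandas sketch is an analogy under assumed sample data (the `records` frame is hypothetical), not part of the Tableau recipe itself.

```python
import pandas as pd

# Hypothetical sample data with the same field names as the recipe.
records = pd.DataFrame({
    "Userid": [1, 1, 1, 2, 2, 3],
    "Assignmentid": ["a", "b", "a", "a", "c", "d"],
})

# Analog of {FIXED [Userid] : COUNTD([Assignmentid])}: distinct assignments per user.
per_user = records.groupby("Userid")["Assignmentid"].nunique()

# Analog of the histogram: how many users fall at each assignments-per-user value.
users_per_count = per_user.value_counts().sort_index()
```

Here `per_user` holds one value per user, and `users_per_count` gives the bar heights a histogram of that variable would show.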

@tommct
tommct / README.md
Last active Sep 11, 2016
Agglomerative Filtering Recipe for Python Sklearn using similarity matrix
View README.md

This is a recipe for using Sklearn to build a cosine similarity matrix and then to build dendrograms from it.

import numpy as np
import matplotlib.pyplot as plt
import scipy.cluster.hierarchy
import scipy.spatial.distance
from scipy.spatial.distance import pdist
from sklearn.metrics.pairwise import cosine_similarity

# Make a "feature matrix" of 15 items that will be the binary representation of each index.
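
The preview stops at the feature-matrix comment. One way the recipe might continue is sketched below; it is an assumption about the overall flow (similarity matrix, then distances, then linkage, then dendrogram), not the gist's actual code. The items are taken to be the integers 1 to 15 so that no row is all zeros, and average linkage is an arbitrary choice.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import squareform
from sklearn.metrics.pairwise import cosine_similarity

# 15 items (assumed to be 1..15); each row is the 4-bit binary representation of the item.
items = np.arange(1, 16)
features = ((items[:, None] >> np.arange(4)) & 1).astype(float)

# Pairwise cosine similarity, converted to a distance matrix.
similarity = cosine_similarity(features)
distance = np.clip(1.0 - similarity, 0.0, None)
distance = (distance + distance.T) / 2.0  # remove floating-point asymmetry
np.fill_diagonal(distance, 0.0)

# linkage expects the condensed (upper-triangle) form of the distance matrix.
condensed = squareform(distance, checks=False)
Z = linkage(condensed, method="average")

# no_plot=True returns the tree layout without drawing; drop it to plot with matplotlib.
tree = dendrogram(Z, no_plot=True)
```

Passing the condensed distances to `linkage` matters: a square matrix would be interpreted as raw observations rather than precomputed distances.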
@tommct
tommct / README.md
Created Nov 6, 2015
Change Modification Date via Python
View README.md

This is Python code for updating the modification date of a file on Mac OS X or Linux. In this example, I had copied .dv files from my camcorder. The camcorder encoded the recording date in each filename, but the modification date on each file was the time I transferred it from the camcorder.

import os
import time
fpath = '/path/to/dv/files'
for root, dirs, files in os.walk(fpath):
    for name in files:
        if name[-3:]=='.dv':
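
The preview cuts off inside the loop, so the original date parsing is not shown. A minimal sketch of the core step follows, assuming a hypothetical filename pattern of `YYYY-MM-DD_HH-MM-SS.dv`; the helper name `set_mtime_from_name` and the format string are illustrative and should be adjusted to the camcorder's actual naming scheme.

```python
import os
import tempfile
import time

def set_mtime_from_name(path, fmt="%Y-%m-%d_%H-%M-%S"):
    """Set a file's modification time from a date encoded in its name (hypothetical format)."""
    stem = os.path.splitext(os.path.basename(path))[0]
    # Interpret the parsed date as local time and convert it to epoch seconds.
    mtime = time.mktime(time.strptime(stem, fmt))
    # Preserve the existing access time; only the modification time changes.
    atime = os.stat(path).st_atime
    os.utime(path, (atime, mtime))
    return mtime

# Demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "2015-11-06_12-30-00.dv")
    open(demo, "w").close()
    set_mtime_from_name(demo)
    restored = time.localtime(os.path.getmtime(demo))[:6]
```

`os.utime` takes an `(atime, mtime)` pair, which is why the current access time is read first and passed back unchanged.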
@tommct
tommct / README.md
Last active Aug 29, 2015
D3 Stacked Brush Plots
View README.md

Implements multiple, stacked plots with brushing. This extends the example at http://bl.ocks.org/mbostock/1667367 and allows for multiple panels where each subsequent panel zooms from the previous. Data points are also smoothed, permitting data with over 100,000 points to have an overview with subsequent telescoping while maintaining context.

@tommct
tommct / README.md
Last active Jan 1, 2016
D3 Hierarchical Ordinal Ticks
View README.md

This D3 example demonstrates constrained zooming, much like http://bl.ocks.org/tommct/5671250, but also illustrates the use of hierarchical ordinal tick marks. It does this by using the normalized values that one gets when using a hierarchical partition layout.

@tommct
tommct / README.md
Last active Jan 1, 2016
D3 Canvas ImageData w/ Constrained Zooming and Marginal Distributions
View README.md