I have a dataset, but I would like to append more information to it to make it more useful. In data warehousing, this is similar to creating dimensions. With dimensional (aka dim), you can do this from the command line or in an IPython notebook.

Here's how it works.

pip install dimensional - this installs a CLI called dim
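
As a rough illustration of the idea (not of how dim itself works, which the text above does not show), enriching a dataset amounts to a lookup join against a dimension table. The `STATE_DIM` table, the field names, and the `enrich` helper below are all hypothetical and exist only for this sketch.

```python
# Hypothetical dimension table keyed by state code (illustrative values only).
STATE_DIM = {
    "UT": {"state_name": "Utah", "region": "West"},
    "NY": {"state_name": "New York", "region": "Northeast"},
}


def enrich(rows, dimension, key):
    """Return copies of rows with the matching dimension attributes appended."""
    enriched = []
    for row in rows:
        # Rows whose key is missing from the dimension are passed through as-is.
        extra = dimension.get(row.get(key), {})
        enriched.append({**row, **extra})
    return enriched


if __name__ == "__main__":
    dataset = [{"customer": "a", "state": "UT"}, {"customer": "b", "state": "ZZ"}]
    for row in enrich(dataset, STATE_DIM, "state"):
        print(row)
```

The original rows are left unmodified, so the same dataset can be enriched against several dimensions in turn.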

@jpotts18
jpotts18 / app.py
Last active June 28, 2016 21:23
Nostradamus
from flask import Flask, request, abort, jsonify
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
import numpy as np
app = Flask(__name__)
# Treat the model like a backing resource that can be swapped out with an ENV variable.
MODEL_FILE = 'model/iris_classifier.pkl'
CLASSES = ['setosa',
brew install rbenv ruby-build
echo 'export PATH=$HOME/.rbenv/bin:$PATH' >> .zshrc
echo 'eval "$(rbenv init -)"' >> .zshrc
source .zshrc
rbenv install -l
rbenv install x.x.x
rbenv rehash
class Dataset(object):
    name = None
    source_url = None
    processing_notes = None  # What modifications to the original data set were done. Outliers, Imputation, etc?
    license = None  # Something about how this can be used CC/Apache/etc.
    columns = []

    def __str__(self):
-- MySQL dump 10.9
--
-- Host: localhost Database: world
-- ------------------------------------------------------
-- Server version 4.1.13-log
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES latin1 */;
jpotts18 / create_times.sql
Last active October 13, 2020 10:33
Time Dimension for Postgres
CREATE TABLE "public"."times" (
id int4 NOT NULL,
time time,
hour int2,
military_hour int2,
minute int4,
second int4,
minute_of_day int4,
second_of_day int4,
quarter_hour varchar,
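
The table definition above is truncated, but the columns it does show can be derived from a single time of day. The sketch below is one possible way to compute them in Python. The meanings assumed for `id`, `hour`, `military_hour`, and the `quarter_hour` label are guesses from the column names, not taken from the gist, and `time_dimension_row` is a hypothetical helper.

```python
from datetime import time


def time_dimension_row(t):
    """Build one row for the columns visible in the truncated table above."""
    minute_of_day = t.hour * 60 + t.minute
    second_of_day = minute_of_day * 60 + t.second
    quarter_start = (t.minute // 15) * 15
    return {
        "id": second_of_day,  # assumption: seconds since midnight as the key
        "time": t.isoformat(),
        "hour": t.hour % 12 or 12,  # assumption: 12-hour clock
        "military_hour": t.hour,  # assumption: 24-hour clock
        "minute": t.minute,
        "second": t.second,
        "minute_of_day": minute_of_day,
        "second_of_day": second_of_day,
        # assumption: label each quarter hour by its start, e.g. "13:45"
        "quarter_hour": f"{t.hour:02d}:{quarter_start:02d}",
    }


if __name__ == "__main__":
    print(time_dimension_row(time(13, 47, 5)))
```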
jpotts18 / create_dates.sql
Last active August 29, 2023 19:49
Date Dimension for Postgres
CREATE TABLE public.dates (
id int4 NOT NULL PRIMARY KEY,
date date NOT NULL,
datetime timestamp NOT NULL,
julian_day int4 NOT NULL,
day int4 NOT NULL,
day_name varchar NOT NULL,
day_abbrev varchar NOT NULL,
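
The column list above is cut off, but the visible columns can be sketched from a single calendar date. The following is a minimal Python illustration under stated assumptions: `id` is taken to be a YYYYMMDD integer, `day` the day of the month, and `julian_day` the Julian Day Number. None of these choices is confirmed by the gist, and `date_dimension_row` is a hypothetical helper.

```python
from datetime import date, datetime

# Offset from Python's proleptic Gregorian ordinal to the Julian Day Number.
JULIAN_DAY_OFFSET = 1721425

# Fixed English names so the output does not depend on the system locale.
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


def date_dimension_row(d):
    """Build one row for the columns visible in the truncated table above."""
    day_name = DAY_NAMES[d.weekday()]
    return {
        "id": d.year * 10000 + d.month * 100 + d.day,  # assumption: YYYYMMDD
        "date": d.isoformat(),
        "datetime": datetime(d.year, d.month, d.day).isoformat(sep=" "),
        "julian_day": d.toordinal() + JULIAN_DAY_OFFSET,  # assumption: JDN
        "day": d.day,  # assumption: day of the month
        "day_name": day_name,
        "day_abbrev": day_name[:3],
    }


if __name__ == "__main__":
    print(date_dimension_row(date(2000, 1, 1)))
```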
Thing I want to do                   Command
overwrite file                       >
append to file                       >>
search in file                       grep '??' file.csv
search in file (case insensitive)    grep 'LLC' -i file.csv
search in file (show line numbers)   grep ',llc,' -n file.csv
remove header                        tail -n +2 file.csv > new_file.csv
remove header                        sed 1d file.csv > new_file.csv
add header                           { head -1 with_header.csv; cat headerless.csv;} > new_file.csv
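
For use inside a script or notebook rather than a shell, the two header operations above can be approximated in Python. This is only a sketch: `remove_header` and `add_header` are hypothetical helper names, and they work on lists of lines already read into memory rather than on files.

```python
def remove_header(lines):
    """Drop the first line, like `tail -n +2` or `sed 1d`."""
    return lines[1:]


def add_header(header_source, headerless):
    """Prepend the first line of header_source, like `head -1` followed by `cat`."""
    return header_source[:1] + headerless


if __name__ == "__main__":
    with_header = ["name,type\n", "Acme,LLC\n"]
    body = remove_header(with_header)
    print(body)
    print(add_header(with_header, body))
```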
-- Many-to-many table schema for users and favorites
CREATE TABLE `user_favorites` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`deal_id` int(11) DEFAULT NULL,
`created_date` datetime DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=33 DEFAULT CHARSET=utf8;
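
To show how a join table like this is typically queried, here is a small sketch using Python's built-in `sqlite3` module. It is an approximation, not the author's setup: the schema is adapted to SQLite syntax rather than MySQL, the sample rows are made up, and `build_demo_db` and `favorites_for_user` are hypothetical helpers.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite stand-in for the MySQL table above."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE user_favorites (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id INTEGER,
            deal_id INTEGER,
            created_date TEXT DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    # Made-up sample data: user 1 favors deals 10 and 11, user 2 favors deal 10.
    conn.executemany(
        "INSERT INTO user_favorites (user_id, deal_id) VALUES (?, ?)",
        [(1, 10), (1, 11), (2, 10)],
    )
    return conn


def favorites_for_user(conn, user_id):
    """Return the deal ids a given user has marked as favorites."""
    rows = conn.execute(
        "SELECT deal_id FROM user_favorites WHERE user_id = ? ORDER BY deal_id",
        (user_id,),
    )
    return [deal_id for (deal_id,) in rows]


if __name__ == "__main__":
    print(favorites_for_user(build_demo_db(), 1))
```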
jpotts18 / boston.ipynb
Last active January 15, 2020 05:58
Linear Regression on Boston Housing Data