@tjvananne
tjvananne / ggplot2_heatmap_simple.R
Last active August 6, 2017 18:22
Create a Heat Map in R using ggplot2 with viridis Color Scale
# references:
# https://rud.is/b/2016/02/14/making-faceted-heatmaps-with-ggplot2/
# This is basically the TL;DR and it also uses a built-in dataset to foster reproducibility
library(ggplot2)
library(viridis)
gg <- ggplot(airquality, aes(x=Day, y=Month, fill=Temp))
gg <- gg + geom_tile(color='White', size=0.1)
@tjvananne
tjvananne / build_data_dictionary.R
Last active May 3, 2017 15:53
Generate a generic data dictionary in R. For each column it counts the blanks and the NAs, reports the number of unique values, calculates the percentages for those counts, and lists the top n (default 5) unique values.
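A minimal base-R sketch of the kind of per-column summary described above. The function name `build_dictionary_sketch` and the output column names are assumptions made for illustration; this is not the gist's own implementation.

```r
# build_dictionary_sketch() is a hypothetical name; this is not the gist's code.
build_dictionary_sketch <- function(df, top_n = 5) {
  summarise_col <- function(col) {
    values <- as.character(col)
    n_na <- sum(is.na(values))
    n_blank <- sum(!is.na(values) & trimws(values) == "")
    # non-missing values ordered from most to least frequent
    top <- names(sort(table(values), decreasing = TRUE))
    data.frame(
      n_na = n_na,
      pct_na = n_na / length(values),
      n_blank = n_blank,
      pct_blank = n_blank / length(values),
      n_unique = length(unique(values)),  # NA counts as one unique value here
      top_values = paste(head(top, top_n), collapse = ", "),
      stringsAsFactors = FALSE
    )
  }
  # one summary row per column of the input data frame
  out <- do.call(rbind, lapply(df, summarise_col))
  cbind(column = names(df), out, stringsAsFactors = FALSE)
}

dd <- build_dictionary_sketch(airquality)
```

Printing `dd` shows one row per column of the built-in `airquality` dataset.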
# generic data dictionary creation using base-R
#' a couple notes: this could of course be done much faster using
#' third party packages, but I like to provide base-R solutions before
#' branching out into packages just in case they aren't available
#'
#' Also, this could be done in a much less verbose and modular way,
#' but I did want to also demonstrate the "Functional Programming"
@tjvananne
tjvananne / process GloVe pre-trained word vector.R
Created May 4, 2017 14:45
How to read and process a downloaded pre-trained GloVe word vector (turn it into a data.frame) in base R
#' A pre-trained word vector file is a giant matrix with one row per word; each word maps to a numeric vector that represents the semantic
#' meaning of that word. This lets us discover relationships and analogies between words programmatically.
#' The classic example is "king" minus "man" plus "woman" is most similar to "queen"
# function definition --------------------------------------------------------------------------
# input .txt file, exports list of list of values and character vector of names (words)
proc_pretrained_vec <- function(p_vec) {
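A minimal base-R sketch of the parsing step, assuming each line of the file looks like `<word> <v1> <v2> ...` with single spaces between fields. The helper `parse_glove_lines` and the toy vectors are hypothetical; the body of the gist's `proc_pretrained_vec()` is not shown here and may work differently.

```r
# parse_glove_lines() is a hypothetical helper, not the gist's proc_pretrained_vec().
# Assumes every line is "<word> <v1> <v2> ..." separated by single spaces.
parse_glove_lines <- function(lines) {
  parts <- strsplit(lines, " ", fixed = TRUE)
  # first field of each line is the word
  words <- vapply(parts, function(p) p[1], character(1))
  n_dim <- length(parts[[1]]) - 1
  # remaining fields are the numeric vector; transpose to one row per word
  vals <- t(vapply(parts, function(p) as.numeric(p[-1]), numeric(n_dim)))
  df <- as.data.frame(vals)
  rownames(df) <- words
  df
}

# two made-up 3-dimensional vectors standing in for a real GloVe file
toy <- c("king 0.1 0.2 0.3", "queen 0.2 0.1 0.4")
vecs <- parse_glove_lines(toy)
```

For a downloaded file, the same helper could be fed `readLines(path)`, where `path` points at the .txt file.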
@tjvananne
tjvananne / xgb_feature_importance.R
Created July 25, 2017 19:57
Xgboost Feature Importance
# xgboost feature importance ----------------------------------------------------------------
#' Author: Taylor Van Anne
#' 7/25/2017
#'
#' This script is a simple demonstration of:
#' 1) using cross validation to determine the optimal number of iterations for your xgboost model
#' 2) running xgboost with that number of iterations
@tjvananne
tjvananne / testing_tryCatch.R
Created August 6, 2017 18:20
tryCatch() in R -- the basics 101
#' Most basic form of a tryCatch() call in R
#' tryCatch() takes "expr", optional condition handlers (for example
#' error = function(e) ...), and "finally"
#'
#' "expr" is the expression you want to try. It will run as much
#' as it can until coming across an error. When it hits an error, it
#' will stop processing the code in the expression brackets, run any
#' matching "error" handler, and then run whatever is inside of the
#' "finally" brackets. "finally" also runs when there is no error.
#'
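A minimal sketch of this behavior, using a hypothetical helper `safe_log()` that is not part of the gist. Besides `expr` and `finally`, `tryCatch()` also accepts condition handlers such as `error = function(e) ...`; the sketch uses one so the error is caught rather than re-raised after `finally` runs.

```r
# safe_log() is a hypothetical helper for illustration only.
safe_log <- function(x) {
  tryCatch(
    expr = {
      # log() signals an error for non-numeric input
      log(x)
    },
    error = function(e) {
      # runs only when expr signals an error; its value becomes the result
      message("caught: ", conditionMessage(e))
      NA_real_
    },
    finally = {
      # runs whether or not an error occurred
      message("done")
    }
  )
}

ok  <- safe_log(1)     # no error: the value of log(1)
bad <- safe_log("a")   # error: the handler's NA_real_
```

Without the `error` handler, the `finally` block would still run, but the error would then propagate to the caller.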
@tjvananne
tjvananne / gradient_descent.R
Created August 7, 2017 19:38
Gradient descent implemented simply in R -- this has a source but I forgot where I got this source from (I can't take credit for this)
# set the step size (learning rate)
alpha <- 0.003
# set the number of iterations
iter <- 500
# define the gradient of f(x) = x^4 - 3*x^3 + 2
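A minimal sketch of the loop this setup leads to. The starting point `x = 1` and the names `grad_f` and `x` are assumptions, since the rest of the gist is not shown here. The gradient of f(x) = x^4 - 3*x^3 + 2 is 4*x^3 - 9*x^2, which is zero at x = 9/4, the function's minimum.

```r
alpha <- 0.003   # step size, as in the gist
iter  <- 500     # number of iterations, as in the gist

# gradient of f(x) = x^4 - 3*x^3 + 2
grad_f <- function(x) 4 * x^3 - 9 * x^2

x <- 1           # starting point (an assumption, not from the gist)
for (i in seq_len(iter)) {
  # take a small step against the gradient
  x <- x - alpha * grad_f(x)
}
x                # should end up close to 9/4 = 2.25
```

A larger step size can overshoot on this function because the gradient grows quickly away from the minimum, so a small `alpha` is a sensible choice here.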
@tjvananne
tjvananne / webcam-cv2.py
Created August 16, 2017 22:32 — forked from tedmiston/webcam-cv2.py
Display the webcam in Python using OpenCV (cv2)
'''
Simply display the contents of the webcam with optional mirroring using OpenCV
via the new Pythonic cv2 interface. Press <esc> to quit.
'''
import cv2
def show_webcam(mirror=False):
    cam = cv2.VideoCapture(0)
    while True:
@tjvananne
tjvananne / aaa_target_shuffling.R
Last active May 31, 2022 22:37
Target Shuffling in R - iris data
#' Target Shuffling
#' Author: Taylor Van Anne
#'
#' Note: this is just my interpretation of what target shuffling
#' means. I think there are a few different ways to actually conduct
#' the shuffling, but this is a single approach.
#'
#' A different approach than what I did here would be to shuffle the
@tjvananne
tjvananne / 01_adaBoost_implementation_readme.txt
Last active April 12, 2021 03:53
adaBoost implementation in R - by yl3394
#=========================================================================
#STAT W4400
#Homework 03
# yl3394, Yanjin Li
#Problem 1 AdaBoost
#=========================================================================
# In this problem I will implement the AdaBoost algorithm in R. The algorithm
# requires two auxiliary functions, one to train and one to evaluate the weak learner.
# A third function then implements the resulting
# boosting classifier. Here, we will use decision stumps as our weak
@tjvananne
tjvananne / timestamps_in_R.R
Created January 14, 2018 20:14
R Date Timestamp Epoch Unix Time
library(lubridate) # <3 tidyverse
# this is likely not the most efficient way to generate unix time stamps, but it is intuitive to me
# I like to see how it is done step by step
# to make this more efficient, you could store the "epoch" value outside the function so it
# doesn't have to be calculated every time you call the function
time_since_epoch <- function() {
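A minimal sketch of the efficiency idea in the comments above, where the epoch value is computed once outside the function. It uses base R rather than lubridate so it runs without extra packages, and the name `seconds_since_epoch` and its `now` argument are hypothetical; the body of the gist's `time_since_epoch()` is not shown here.

```r
# the epoch is computed once, outside the function
epoch <- as.POSIXct("1970-01-01 00:00:00", tz = "UTC")

# seconds_since_epoch() is a hypothetical name; base R only, no lubridate
seconds_since_epoch <- function(now = Sys.time()) {
  # difference between `now` and the epoch, expressed in seconds
  as.numeric(difftime(now, epoch, units = "secs"))
}

seconds_since_epoch()
```

In base R, `as.numeric(Sys.time())` returns the same quantity directly; the step-by-step version just makes the calculation visible.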