@mwidjaja1
mwidjaja1 / Pizza.py
Created September 29, 2015 16:18
In this Kaggle.com project, I wanted to predict the probability that a request to Reddit's Random Acts of Pizza group would receive a free pizza, based on the group's history of requests. This project was one of my first introductions to analyzing text for Machine Learning.
"""
On Training, this scored 0.736
Magnitude: 3 default, NO synonyms, 3 lemmas, 4 Correct
When Leaf = 30 & Depth = 20
False: 78% as 1401 and True: 40% as 86
On Training, this scored 0.733
Magnitude: 3 default, 0.5 synonyms, 3 lemmas, 4 Correct
When Leaf = 30 & Depth = 20
mwidjaja1 / Outlier_packaged.py
Created September 29, 2015 15:49
Using NumPy & scikit-learn, I eliminated outliers to produce a more precise fit of net worth vs. age, which is plotted as the blue line.
""" outlier_packaged.py --------------------------------------------------------
Goal: After importing two pickled data sets (one with the 'X' values, one
with the 'Y' values), we'll remove the 10% of points with the largest
error between the predictions made by our regression model and the
actual values. These points are treated as outliers.
Input: No traditional input argument but the user should put the
practice_outliers_ages.pkl & practice_outliers_net_worths.pkl (or
the pickled data set we're using) in the same folder as this file.
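The cleaning step described in this docstring can be sketched as follows. This is only an illustration under stated assumptions: the gist itself uses NumPy and scikit-learn, whereas this sketch uses a plain-Python least-squares fit so it runs without dependencies. The names `fit_line` and `drop_largest_residuals` and the sample data are hypothetical and do not come from the gist or its pickle files.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def drop_largest_residuals(xs, ys, fraction=0.10):
    """Fit a line, then discard the `fraction` of points with the largest squared error."""
    slope, intercept = fit_line(xs, ys)
    ranked = sorted(
        zip(xs, ys),
        key=lambda p: (p[1] - (slope * p[0] + intercept)) ** 2,
    )
    keep = len(ranked) - int(round(len(ranked) * fraction))
    return ranked[:keep]


# Made-up data (not the gist's pickles): a perfectly linear trend
# with one corrupted point injected at age 40.
ages = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
net_worths = [6.25 * a for a in ages]
net_worths[4] = 900.0

cleaned = drop_largest_residuals(ages, net_worths)
slope, intercept = fit_line([a for a, _ in cleaned], [w for _, w in cleaned])
```

Refitting on `cleaned` is what produces the tighter line; the first fit is used only to rank the points by error.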
mwidjaja1 / NYCSubway.py
Last active December 26, 2017 22:45
With NYC Subway Data from 2009-2011, I analyzed each subway line to derive conclusions regarding on-time performance & reliability.
""" Bokeh (NYC Subway) ---------------------------------------------------------
Goal: This script takes the NYC Subway data & parses it for
visualization.
Input: http://web.mta.info/developers/performance.html has performance XML
data. We only cover subway and we only care for these metrics:
1. Subway wait assessment for all lines
Actual interval between trains
2. Mean Distance Between Failure
Miles until a train hits a mechanical failure causing a delay
mwidjaja1 / gist:10609173
Created April 14, 2014 01:10
Function for Sediment Concentration in a River
%% Matthew Widjaja.
% Euler's Method Function
% Instructions: Run the function from OneMATLAB.m
%% Function
% This is the function used for OneMATLAB.m
function f = f(c)
global volumeV areaA baseFlowQ sediFlowM sediVelocityVs
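The preview cuts off before the body of `f(c)`, so the rate equation is not visible. As a rough illustration of how Euler's method could drive such a function, the sketch below assumes a simple well-mixed mass balance (sediment load in, outflow and settling out). That equation, the parameter values, and the Python names are all assumptions for illustration; only the variable roles (volume, area, base flow, sediment flow, settling velocity) echo the globals declared above.

```python
def sediment_rate(c, volume, area, base_flow, sediment_load, settling_velocity):
    """Assumed mass balance: dc/dt = (load - outflow loss - settling loss) / volume."""
    return (sediment_load - base_flow * c - settling_velocity * area * c) / volume


def euler(rate, c0, dt, steps, **params):
    """Forward Euler: c[n+1] = c[n] + dt * rate(c[n])."""
    history = [c0]
    for _ in range(steps):
        history.append(history[-1] + dt * rate(history[-1], **params))
    return history


# Made-up parameter values, chosen only so the numbers are easy to check.
params = dict(
    volume=1000.0,
    area=100.0,
    base_flow=10.0,
    sediment_load=50.0,
    settling_velocity=0.1,
)
history = euler(sediment_rate, c0=0.0, dt=1.0, steps=1000, **params)

# Under the assumed balance, the steady state is load / (flow + velocity * area).
steady_state = params["sediment_load"] / (
    params["base_flow"] + params["settling_velocity"] * params["area"]
)
```

Passing the parameters explicitly, rather than through globals as the MATLAB code does, keeps the rate function testable on its own.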
mwidjaja1 / Sediment.matlab
Last active September 29, 2015 16:08
With MATLAB, I calculated the concentrations of Algae, Zooplankton, Oxygen, & Carbon in a river as they interact and are consumed.
%% Matthew Widjaja
% Environmental Modeling -- Program 1
% Parking Lot & Lake
clear t c
format short
global volumeV areaA baseFlowQ sediFlowM sediVelocityVs
%% Obtain Data
mwidjaja1 / gist:10607578
Created April 14, 2014 00:12
MySQL & Dislin for GNP vs. Life Expectancy
#include <mysql.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
// #include <dislin.h>
#include "/Users/Matthew/dislin/examples/dislin.h"
/* --- PROJECT 5/6: MYSQL & DISLIN INTEGRATION ---
This project was created by Matthew Widjaja. Nov 2013.
mwidjaja1 / pendulum.c
Last active September 29, 2015 15:56
I plotted a 10-meter-long Foucault Pendulum (one that swings in both the x & y directions) with respect to the Earth's rotation in Quebec, Canada, using C and its standard libraries.
/*
Matthew Widjaja
Project 2: Pendulum Program
*/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int main()
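The C program's body is not shown in this preview, so the following is only a back-of-the-envelope sketch of the two time scales involved, not the gist's method. It uses the small-angle pendulum period and the standard result that the swing plane of a Foucault pendulum rotates at Earth's rotation rate multiplied by the sine of the latitude. The latitude of 46.8 degrees north (roughly Quebec City) is an assumption, since the description says only "Quebec", and the function names are hypothetical.

```python
import math

SIDEREAL_DAY_S = 86164.1  # one rotation of Earth relative to the stars, in seconds
G = 9.81                  # gravitational acceleration, m/s^2


def swing_period(length_m):
    """Small-angle period of a simple pendulum: T = 2*pi*sqrt(L/g)."""
    return 2.0 * math.pi * math.sqrt(length_m / G)


def precession_period_hours(latitude_deg):
    """Time for the swing plane to complete one full turn at a given latitude."""
    return SIDEREAL_DAY_S / math.sin(math.radians(latitude_deg)) / 3600.0


swing = swing_period(10.0)                  # the 10 m pendulum from the description
precession = precession_period_hours(46.8)  # assumed latitude, not from the gist

print(f"swing period: {swing:.2f} s")
print(f"precession period: {precession:.1f} h")
```

The large gap between the two periods (seconds versus more than a day) is why a simulation needs many short time steps to show the slow drift of the swing plane.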
mwidjaja1 / NelderMead.c
Last active September 29, 2015 16:04
My first experience with C led me to the Nelder Mead Numerical Optimization algorithm, which searches for a local minimum.
/*
Matthew Widjaja
Project 1: Nelder Mead Numerical Optimization
*/
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int main()
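For readers unfamiliar with the method, here is a minimal sketch of a basic Nelder-Mead simplex search. It is written in Python rather than C and is not the gist's implementation, whose body is not shown here. It uses reflection, expansion, inside contraction, and shrink with the common coefficients; the function name, stopping rule, and test function are assumptions chosen for illustration.

```python
def nelder_mead(f, start, step=1.0, tol=1e-10, max_iter=1000):
    """Minimise f over a list of floats with basic Nelder-Mead simplex rules."""
    n = len(start)
    # Initial simplex: the start point plus one point offset along each axis.
    simplex = [list(start)]
    for i in range(n):
        p = list(start)
        p[i] += step
        simplex.append(p)

    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        # Centroid of every vertex except the worst.
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]

        def along(t):
            # Point on the line through the centroid (t=0) and the worst vertex (t=1).
            return [c + t * (w - c) for c, w in zip(centroid, worst)]

        reflected = along(-1.0)
        if f(reflected) < f(best):
            # Reflection beat the best vertex, so try going further.
            expanded = along(-2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every other vertex halfway toward the best one.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)]
                    for p in simplex[1:]
                ]
    return min(simplex, key=f)


def bowl(p):
    """Hypothetical test function with its minimum at (1, -2)."""
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


minimum = nelder_mead(bowl, [0.0, 0.0])
```

The method needs only function values, no derivatives. Like any local search, it can settle in a local rather than global minimum on functions that have more than one.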