Alrecenk's GitHub Gists
Alrecenk / NaiveBayesConstructor.java
Last active December 22, 2015 13:59
Naive Bayes Classifier Construction
//constructs a naive Bayes binary classifier
public NaiveBayes(double in[][], boolean out[]){
    int inputs = in[0].length;
    //initialize sums and sums of squares for each class
    double[] poss = new double[inputs], poss2 = new double[inputs];
    double[] negs = new double[inputs], negs2 = new double[inputs];
    //calculate amount of each class, sums, and sums of squares
    for(int k = 0; k < in.length; k++){//for each data point
        if(out[k]){
            positives++;//keep track of total positives
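The gist excerpt above cuts off mid-loop. A self-contained sketch of the same training step, accumulating class counts, per-feature sums, and sums of squares, then converting them to means and variances, could look like this; the field names mirror the gist, but everything past the cutoff is my own reconstruction.

```java
// Minimal Gaussian naive Bayes trainer: an illustrative sketch, not the original class.
public class NaiveBayesSketch {
    int positives, negatives;
    double[] posmean, posvariance, negmean, negvariance;

    public NaiveBayesSketch(double in[][], boolean out[]) {
        int inputs = in[0].length;
        double[] poss = new double[inputs], poss2 = new double[inputs];
        double[] negs = new double[inputs], negs2 = new double[inputs];
        for (int k = 0; k < in.length; k++) {       // accumulate per-class sums
            double[] s = out[k] ? poss : negs, s2 = out[k] ? poss2 : negs2;
            if (out[k]) positives++; else negatives++;
            for (int j = 0; j < inputs; j++) {
                s[j] += in[k][j];
                s2[j] += in[k][j] * in[k][j];
            }
        }
        posmean = new double[inputs]; posvariance = new double[inputs];
        negmean = new double[inputs]; negvariance = new double[inputs];
        for (int j = 0; j < inputs; j++) {          // mean = E[x], var = E[x^2] - E[x]^2
            posmean[j] = poss[j] / positives;
            posvariance[j] = poss2[j] / positives - posmean[j] * posmean[j];
            negmean[j] = negs[j] / negatives;
            negvariance[j] = negs2[j] / negatives - negmean[j] * negmean[j];
        }
    }
}
```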
Alrecenk / NaiveBayesExample.java
Created September 8, 2013 07:39
Naive Bayes Example Calculation
public double ProbabilityOfInputIfPositive(double in[]){
    double prob = 1;
    for(int j = 0; j < in.length; j++){
        //normal density for feature j: each factor needs its own 1/sqrt(2*pi*variance)
        prob *= Math.exp(-(in[j]-posmean[j])*(in[j]-posmean[j]) / (2*posvariance[j]))
              / Math.sqrt(2 * Math.PI * posvariance[j]);
    }
    return prob;
}
Alrecenk / NaiveBayesApplication.java
Created September 8, 2013 07:42
Calculate the output for a Naive Bayes classifier.
//Calculate the probability that the given input is in the positive class
public double probability(double in[]){
    double relativepositive = 0, relativenegative = 0;
    for(int j = 0; j < in.length; j++){
        //squared distance to each class mean, scaled by that class's variance
        relativepositive += (in[j]-posmean[j])*(in[j]-posmean[j]) / posvariance[j];
        relativenegative += (in[j]-negmean[j])*(in[j]-negmean[j]) / negvariance[j];
    }
    //note the negative sign: a larger distance must mean a smaller likelihood
    relativepositive = positives * Math.exp(-0.5*relativepositive);
    relativenegative = negatives * Math.exp(-0.5*relativenegative);
    return relativepositive / (relativepositive + relativenegative);
}
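As a quick sanity check on the posterior formula, a standalone one-feature harness with made-up parameters (positive class N(1,1), negative class N(-1,1), equal priors; all values are illustrative) confirms it behaves like Bayes' rule: the midpoint between two equally likely, equal-variance classes scores exactly 0.5.

```java
// Standalone check of the naive Bayes posterior (assumed parameter values).
public class NaiveBayesPosterior {
    // one feature; positive class N(1,1), negative class N(-1,1), equal priors
    static double posmean = 1, negmean = -1, posvar = 1, negvar = 1;
    static double positives = 50, negatives = 50;

    public static double probability(double x){
        double dp = (x - posmean) * (x - posmean) / posvar;
        double dn = (x - negmean) * (x - negmean) / negvar;
        double rp = positives * Math.exp(-0.5 * dp);   // relative positive likelihood
        double rn = negatives * Math.exp(-0.5 * dn);   // relative negative likelihood
        return rp / (rp + rn);
    }
}
```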
Alrecenk / LeastSquaresTrain.java
Last active August 1, 2016 04:16
This code provides all functions necessary to perform and apply a least squares fit of a polynomial from multiple inputs to multiple outputs. The fit is performed using an in-place LDL Cholesky decomposition based on the Cholesky–Banachiewicz algorithm.
//performs a least squares fit of a polynomial function of the given degree
//mapping each input[k] vector to each output[k] vector
//returns the coefficients in a matrix
public static double[][] fitpolynomial(double input[][], double output[][], int degree){
    double[][] X = new double[input.length][];
    //Run the input through the polynomialization and add the bias term
    for (int k = 0; k < input.length; k++){
        X[k] = polynomial(input[k], degree);
    }
    int inputs = X[0].length;//number of inputs after the polynomial
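The `polynomial` helper is referenced but not shown in this excerpt. One plausible reconstruction, a bias term plus per-feature powers up to the given degree, is sketched below; the original may well include cross terms as well.

```java
// Hypothetical reconstruction of the `polynomial` feature map: returns
// [1, x_0, x_0^2, ..., x_0^d, x_1, ..., x_1^d, ...] (bias plus per-feature powers).
public class PolyFeatures {
    public static double[] polynomial(double[] x, int degree){
        double[] out = new double[1 + x.length * degree];
        out[0] = 1;                      // bias term
        int i = 1;
        for (int j = 0; j < x.length; j++){
            double p = 1;
            for (int d = 1; d <= degree; d++){
                p *= x[j];               // builds x_j^d incrementally
                out[i++] = p;
            }
        }
        return out;
    }
}
```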
Alrecenk / basicLDL.java
Last active December 23, 2015 17:09
A basic LDL decomposition of the matrix X^T X (X transpose times X).
double[][] L = new double[inputs][inputs];
double D[] = new double[inputs];
//for each column j
for (int j = 0; j < inputs; j++){
    D[j] = XTX[j][j];//calculate Dj
    for (int k = 0; k < j; k++){
        D[j] -= L[j][k] * L[j][k] * D[k];
    }
    //calculate jth column of L
    L[j][j] = 1; // don't really need to save this but it's a 1
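For reference, a complete LDL^T decomposition in the same Cholesky–Banachiewicz ordering, including the below-diagonal column update this excerpt cuts off before, might look like the following sketch (my own completion, not the original gist):

```java
// LDL^T decomposition sketch for a symmetric positive definite matrix A:
// A = L * diag(D) * L^T with unit-diagonal lower-triangular L.
public class LDL {
    public static void decompose(double[][] A, double[][] L, double[] D){
        int n = A.length;
        for (int j = 0; j < n; j++){
            D[j] = A[j][j];                    // D_j = A_jj - sum L_jk^2 D_k
            for (int k = 0; k < j; k++){
                D[j] -= L[j][k] * L[j][k] * D[k];
            }
            L[j][j] = 1;                       // unit diagonal by construction
            for (int i = j + 1; i < n; i++){   // column j of L below the diagonal
                L[i][j] = A[i][j];
                for (int k = 0; k < j; k++){
                    L[i][j] -= L[i][k] * L[j][k] * D[k];
                }
                L[i][j] /= D[j];
            }
        }
    }
}
```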
Alrecenk / basicLDLsolve.java
Last active December 23, 2015 17:09
Solves for C given an LDL decomposition in the form LDL^T C = X^T Y.
public double[] solvesystem(double L[][], double D[], double XTY[]){
    //forward substitution with L
    double p[] = new double[XTY.length];
    for (int j = 0; j < p.length; j++){
        p[j] = XTY[j];
        for (int i = 0; i < j; i++){
            p[j] -= L[j][i] * p[i];
        }
    }
    //Multiply by inverse of D matrix
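The excerpt stops at the diagonal step. A complete solve, forward substitution with L, division by D, then back substitution with L^T, could be sketched as follows (illustrative completion, not the gist's code):

```java
// Full LDL^T solve sketch: given A = L*diag(D)*L^T, solves A c = b in three passes.
public class LDLSolve {
    public static double[] solve(double[][] L, double[] D, double[] b){
        int n = b.length;
        double[] p = new double[n];
        for (int j = 0; j < n; j++){            // forward substitution: L p = b
            p[j] = b[j];
            for (int i = 0; i < j; i++){
                p[j] -= L[j][i] * p[i];
            }
        }
        for (int j = 0; j < n; j++){            // multiply by inverse of diagonal D
            p[j] /= D[j];
        }
        double[] c = new double[n];
        for (int j = n - 1; j >= 0; j--){       // back substitution: L^T c = p
            c[j] = p[j];
            for (int i = j + 1; i < n; i++){
                c[j] -= L[i][j] * c[i];
            }
        }
        return c;
    }
}
```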
Alrecenk / RotationForestSimple.java
Last active December 25, 2015 17:49
An optimized rotation forest algorithm for binary classification on fixed length feature vectors.
/*A rotation forest algorithm for binary classification with fixed length feature vectors.
*created by Alrecenk for inductivebias.com Oct 2013
*/
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Random;
public class RotationForestSimple{
    double mean[]; //the mean of each axis for normalization
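The core of a rotation forest is growing each tree in a randomly rotated copy of the feature space. One common way to build such a rotation, Gram–Schmidt orthonormalization of random Gaussian vectors, is sketched below; this is a generic illustration, not necessarily how RotationForestSimple constructs its rotations.

```java
import java.util.Random;

// Sketch: random orthonormal basis via Gram-Schmidt on Gaussian vectors.
// Projecting (mean-subtracted) points onto the rows rotates the feature space.
public class RandomRotation {
    public static double[][] randomBasis(int dims, Random rand){
        double[][] b = new double[dims][dims];
        for (int i = 0; i < dims; i++){
            for (int j = 0; j < dims; j++) b[i][j] = rand.nextGaussian();
            for (int k = 0; k < i; k++){          // subtract projections onto earlier axes
                double dot = 0;
                for (int j = 0; j < dims; j++) dot += b[i][j] * b[k][j];
                for (int j = 0; j < dims; j++) b[i][j] -= dot * b[k][j];
            }
            double len = 0;                        // normalize to unit length
            for (int j = 0; j < dims; j++) len += b[i][j] * b[i][j];
            len = Math.sqrt(len);
            for (int j = 0; j < dims; j++) b[i][j] /= len;
        }
        return b;
    }
}
```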
Alrecenk / pseudonaivetreelearn.java
Last active December 26, 2015 01:59
Pseudocode for a naive implementation of a decision tree learning algorithm.
int splitvariable = -1; // split on this variable
double splitvalue; //split at this value
// total positives and negatives used for leaf node probabilities
int totalpositives, totalnegatives;
Datapoint trainingdata[]; //the training data in this node
treenode leftnode, rightnode; //This node's children if it's a branch

//splits this node greedily using approximate information gain
public void split(){
    double bestscore = Double.MAX_VALUE; //lower is better so default is very high number
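The pseudocode truncates before the split search itself. A minimal runnable version of the greedy one-variable search is sketched below; it scores candidates with weighted Gini impurity as a stand-in for the gist's "approximate information gain".

```java
// Greedy 1-D split search sketch: try each candidate threshold and keep the
// one minimizing weighted Gini impurity. Illustrative scoring only.
public class GreedySplit {
    static double gini(int pos, int neg){
        int n = pos + neg;
        if (n == 0) return 0;
        double p = (double) pos / n;
        return 2 * p * (1 - p);                 // impurity of one side
    }

    // returns the threshold with the lowest weighted impurity
    public static double bestSplit(double[] x, boolean[] y){
        double bestscore = Double.MAX_VALUE, bestvalue = 0;
        for (int s = 0; s < x.length; s++){     // each value is a candidate threshold
            int lp = 0, ln = 0, rp = 0, rn = 0;
            for (int k = 0; k < x.length; k++){
                if (x[k] < x[s]) { if (y[k]) lp++; else ln++; }
                else             { if (y[k]) rp++; else rn++; }
            }
            double score = (lp + ln) * gini(lp, ln) + (rp + rn) * gini(rp, rn);
            if (score < bestscore){ bestscore = score; bestvalue = x[s]; }
        }
        return bestvalue;
    }
}
```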
Alrecenk / pseudobootstrap.java
Last active December 26, 2015 04:39
Bootstrap aggregation for a random forest algorithm.
//bootstrap aggregating of training data for a random forest
Random rand = new Random(seed);
treenode tree[] = new treenode[trees];
for(int k = 0; k < trees; k++){
    ArrayList<Datapoint> treedata = new ArrayList<Datapoint>();
    for (int j = 0; j < datapermodel; j++){
        //add a random data point, with replacement, to the training data for this tree
        //nextInt(bound) avoids the Math.abs(Integer.MIN_VALUE) overflow edge case
        int nj = rand.nextInt(alldata.size());
        treedata.add(alldata.get(nj));
    }
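A self-contained version of the same bootstrap loop, using Integer as a stand-in for the gist's Datapoint type, can be tested directly:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Bootstrap sampling sketch: each model gets `datapermodel` points drawn
// with replacement from the full data set.
public class Bootstrap {
    public static List<List<Integer>> sample(List<Integer> alldata, int trees,
                                             int datapermodel, long seed){
        Random rand = new Random(seed);
        List<List<Integer>> samples = new ArrayList<>();
        for (int k = 0; k < trees; k++){
            List<Integer> treedata = new ArrayList<>();
            for (int j = 0; j < datapermodel; j++){
                // uniform index; avoids the Math.abs(Integer.MIN_VALUE) edge case
                treedata.add(alldata.get(rand.nextInt(alldata.size())));
            }
            samples.add(treedata);
        }
        return samples;
    }
}
```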
Alrecenk / normalize.java
Last active December 26, 2015 22:09
Normalizing a vector to length one, normalizing a data point to a distribution of mean zero and standard deviation one, and generating a vector from a normal distribution. Different operations that are named similarly and might be confused.
//makes a vector of length one
public static void normalize(double a[]){
    double scale = 0;
    for(int k = 0; k < a.length; k++){
        scale += a[k]*a[k];
    }
    scale = 1/Math.sqrt(scale);
    for(int k = 0; k < a.length; k++){
        a[k] *= scale;
    }
}
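The description names three operations, but the excerpt shows only the first. Sketches of the other two, standardizing a feature to mean zero and standard deviation one, and drawing a vector from a standard normal distribution, might look like this (method names are mine, not the gist's):

```java
import java.util.Random;

// Sketches of the other two "normalize"-like operations the description mentions.
public class NormalizeSketch {
    // standardize a feature: subtract the mean, divide by the standard deviation
    public static double[] standardize(double[] a, double mean, double stdev){
        double[] z = new double[a.length];
        for (int k = 0; k < a.length; k++){
            z[k] = (a[k] - mean) / stdev;
        }
        return z;
    }

    // generate a vector whose components are drawn from a standard normal
    public static double[] randomGaussianVector(int dims, Random rand){
        double[] v = new double[dims];
        for (int k = 0; k < dims; k++){
            v[k] = rand.nextGaussian();
        }
        return v;
    }
}
```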