Azharuddin Kazi AzharuddinKazi

  • Deloitte & Touche
  • Dubai, United Arab Emirates
@AzharuddinKazi
AzharuddinKazi / MonteCarloSimulation.R
Created February 22, 2017 17:49
Estimate the probability that two randomly chosen individuals were born on the same day of the week, using Monte Carlo simulation.
rm(list=ls(all=TRUE))
fnBirthDay <- function(simulations){
set.seed(1234)
yes = 0
#no =0
x=sample(1:7, simulations,replace = TRUE)
y=sample(1:7, simulations,replace = TRUE)
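
The preview above is cut off before the two samples are compared. As a minimal sketch of the same idea in Python (not the author's code), assuming birth weekdays are uniform and independent, the estimate is the fraction of simulated pairs that match. The function name `same_weekday_probability` and the simulation count are illustrative assumptions.

```python
import random

def same_weekday_probability(simulations, seed=1234):
    """Estimate P(two people share a birth weekday) by simulation."""
    rng = random.Random(seed)  # local generator, leaves global random state alone
    matches = 0
    for _ in range(simulations):
        x = rng.randint(1, 7)  # birth weekday of the first person
        y = rng.randint(1, 7)  # birth weekday of the second person
        if x == y:
            matches += 1
    return matches / simulations

estimate = same_weekday_probability(100_000)
print(f"estimated: {estimate:.4f}, exact: {1/7:.4f}")
```

Under the uniformity assumption the exact answer is 1/7, so the simulated value should approach it as the number of simulations grows.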
AzharuddinKazi / GeneticAlgorithm.R
Created February 22, 2017 17:44
The aim is to pick items that maximize total survival points while staying under the specified weight constraint.
rm(list=ls(all=TRUE))
dataset <- data.frame(item = c("pocketknife", "beans", "potatoes", "onions","phone", "lemons",
"sleeping bag", "rope", "compass","umbrella","sweater","medicines","others"),
survivalpoints = c(15, 16, 13, 14, 20,12,17,18,17,19,10,12,11),
weight = c( 5, 6, 3, 4,11,2,7,8, 10,9,1,12,11))
sum(dataset$survivalpoints)
sum(dataset$weight)
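
The genetic algorithm itself is not visible in this preview. For a problem this small, an exact answer is cheap to compute and gives a reference against which a genetic algorithm's result can be checked. The sketch below is a dynamic-programming 0/1 knapsack in Python, not the gist's method. The weight limit of 20 is an assumption, since the actual limit is not shown above, and `best_selection` is a hypothetical helper name.

```python
items = ["pocketknife", "beans", "potatoes", "onions", "phone", "lemons",
         "sleeping bag", "rope", "compass", "umbrella", "sweater",
         "medicines", "others"]
survivalpoints = [15, 16, 13, 14, 20, 12, 17, 18, 17, 19, 10, 12, 11]
weight = [5, 6, 3, 4, 11, 2, 7, 8, 10, 9, 1, 12, 11]

WEIGHT_LIMIT = 20  # assumption: the real limit is not shown in the preview

def best_selection(points, weights, limit):
    """Exact 0/1 knapsack via dynamic programming over total weight."""
    # best[w] = (max points with total weight <= w, indices of chosen items)
    best = [(0, [])] * (limit + 1)
    for i, (p, wt) in enumerate(zip(points, weights)):
        # Walk weights downwards so each item is used at most once.
        for w in range(limit, wt - 1, -1):
            candidate = best[w - wt][0] + p
            if candidate > best[w][0]:
                best[w] = (candidate, best[w - wt][1] + [i])
    return best[limit]

total_points, chosen = best_selection(survivalpoints, weight, WEIGHT_LIMIT)
total_weight = sum(weight[i] for i in chosen)
print([items[i] for i in chosen], total_points, total_weight)
```

A genetic algorithm run with the same limit should never report more survival points than this baseline. If it reports fewer, it has converged to a suboptimal selection.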
AzharuddinKazi / SIngularValueDecomposition.r
Created February 22, 2017 17:33
Demo song recommendation system using singular value decomposition (SVD) in R
# Rows are users (5 users) and columns are songs (6 songs)
N=matrix(c(5,0,3,4,3,2,
2,1,4,0,0,5,
1,1,1,0,2,1,
1,0,0,0,5,5,
4,3,2,3,4,1),byrow=T,ncol=6)
rownames(N)= c("User1","User2","User3","User4","User5")
colnames(N) = c("Song1","Song2","Song3","Song4","Song5","Song6")
# UserId = c("User1","User2","User3","User4","User5")
AzharuddinKazi / TextClassification.py
Created February 18, 2017 18:03
Text classification using a naive Bayes classifier in Python
import os
import pandas as pd
import re
import numpy as np
from sklearn.metrics import confusion_matrix
import random
import nltk
from sklearn.metrics import recall_score, precision_score, accuracy_score
from sklearn.naive_bayes import MultinomialNB
AzharuddinKazi / SupportVectorMachines.r
Created February 13, 2017 12:15
Running support vector machines in R
#Loading Data into R:
bankdata=read.csv("E:\\UniversalBank.csv", header=TRUE, sep=",")
#Data preparation
#(a) Remove the columns ID and ZIP.Code
bankdata_1 = subset(bankdata, select=-c(ID, ZIP.Code))
#(b) Create dummy variables for the categorical variable "Education" and add them to the original data.
library("dummies")
Educations = dummy(bankdata_1$Education)
AzharuddinKazi / Clustering.r
Created February 12, 2017 16:20
Applying simple clustering algorithms: hierarchical and k-means clustering
rm(list=ls(all=TRUE))
#Consider the mtcars data from R's built-in datasets
data(mtcars)
mydata <- data.frame(mtcars)
mydata <- na.omit(mydata) # listwise deletion of rows with missing values
summary(mydata)
str(mydata)
mydata <- scale(mydata) # standardize variables
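
The standardization step matters because hierarchical and k-means clustering are distance-based, so a variable on a large scale would otherwise dominate. With its default arguments, R's `scale()` centers each column and divides by its sample standard deviation. A minimal Python sketch of that transformation follows. The `standardize` helper and the toy values are illustrative assumptions, not mtcars data.

```python
from statistics import mean, stdev

def standardize(columns):
    """Center each column to mean 0 and scale to sample standard deviation 1."""
    scaled = {}
    for name, values in columns.items():
        m = mean(values)
        s = stdev(values)  # sample standard deviation (n - 1 denominator)
        scaled[name] = [(v - m) / s for v in values]
    return scaled

# Hypothetical toy columns on very different scales.
toy = {"a": [10.0, 20.0, 30.0, 40.0], "b": [1.0, 1.5, 2.5, 3.0]}
scaled = standardize(toy)
for name, values in scaled.items():
    print(name, [round(v, 3) for v in values])
```

After this step every column contributes on a comparable scale to the distance computations used by the clustering algorithms.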
AzharuddinKazi / KNN & PCA.r
Created February 10, 2017 16:14
Implementation of k-nearest neighbours, followed by PCA for dimensionality reduction
#### Problem statement: given data about different customers, classify prospective customers as loan takers or non-loan takers
### reading from a dataset named Universal Bank
data<-read.csv("E:\\UniversalBank.csv",header=T)
data1=subset(data, select=-c(ID,ZIP.Code))
str(data1)
## segregating the categorical and numeric variables
AzharuddinKazi / DecisionTreeIris.r
Last active February 9, 2017 12:56
Uses R's built-in iris dataset and applies the C5.0 and rpart decision tree algorithms to classify the flowers into 3 species.
#Predict flower species(classify)
iris
head(iris)
dim(iris)
names(iris)
str(iris)
table(iris$Species)
#split train and test
set.seed(1234)