Michy mick001

@mick001
mick001 / mice_imp.R
Created October 4, 2015 10:57
Imputing missing data with R; MICE package: Full article at http://datascienceplus.com/imputing-missing-data-with-r-mice-package/
# Using airquality dataset
data <- airquality
data[4:10,3] <- rep(NA,7)
data[1:5,4] <- NA
# Removing categorical variables
data <- data[-c(5,6)]
summary(data)
#-------------------------------------------------------------------------------
mick001 / neuralnetR.R
Last active November 26, 2023 19:12
A neural network example in R. Full article at: http://datascienceplus.com/fitting-neural-network-in-r/
# Set a seed
set.seed(500)
library(MASS)
data <- Boston
# Check that no data is missing
apply(data,2,function(x) sum(is.na(x)))
# Train-test random splitting for linear model
# The following code takes a string of text as input and outputs a bar plot of the
# frequencies of occurrence of the letters in the string.
import pylab as pl
import numpy as np
string1 = """ Example string """
alphabet = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z",".",",",";","-","_","+"]
# The following function takes a list and a string of characters and calculates how often a certain character appears
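The preview above stops before the counting function itself. A minimal sketch of what such a function could look like is given below; the name `char_frequencies`, the case-insensitive matching, and the choice to return relative frequencies are assumptions made for illustration, not the gist's actual code. Plotting is left out so the example stays self-contained.

```python
from collections import Counter


def char_frequencies(alphabet, text):
    """Return the relative frequency of each alphabet character in text.

    Hypothetical helper: matching is case-insensitive, and characters
    outside the alphabet are ignored when computing the total.
    """
    counts = Counter(text.lower())
    total = sum(counts[ch] for ch in alphabet)
    if total == 0:
        # No alphabet character occurs in the text
        return {ch: 0.0 for ch in alphabet}
    return {ch: counts[ch] / total for ch in alphabet}


# Usage with a small alphabet
freqs = char_frequencies(["a", "b", "c"], "Abba")
print(freqs)
```

The resulting dictionary could then be passed to a bar plot call, with the alphabet on the x-axis and the frequencies as bar heights.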
import numpy as np
import matplotlib.pyplot as plt
from copulalib.copulalib import Copula
plt.style.use('ggplot')
def generateData():
    global x,y
    x = np.random.normal(size=250)
    y = 2.5*x + np.random.normal(size=250)
mick001 / logistic_regression.R
Last active October 1, 2023 10:22
Logistic regression tutorial code. Full article available at http://datascienceplus.com/perform-logistic-regression-in-r/
# Load the raw training data and replace missing values with NA
training.data.raw <- read.csv('train.csv',header=TRUE,na.strings=c(""))
# Output the number of missing values for each column
sapply(training.data.raw,function(x) sum(is.na(x)))
# Quick check for how many different values for each feature
sapply(training.data.raw, function(x) length(unique(x)))
# A visual way to check for missing data
mick001 / mtlbl_clf.R
Created August 25, 2016 00:37
Multilabel classification using R and the neuralnet package
################################################################################
# Loading data
rm( list=ls() )
# load libs
require(neuralnet)
require(nnet)
# Load data and set names
mick001 / copulas_example.R
Last active April 15, 2023 09:17
Modelling dependence with copulas. Full article at: http://datascienceplus.com/modelling-dependence-with-copulas/
# Load the MASS library and set the seed
library(MASS)
set.seed(100)
# We are going to use 3 random variables
m <- 3
# Number of samples to be drawn
n <- 2000
mick001 / CopulaClass.py
Created August 29, 2015 11:28
CopulaClass, a Python class for using copulas: a fitting example. Full article at http://www.firsttimeprogrammer.blogspot.com/2015/02/copulaclass-python-class-for-using.html
# Copula class
import numpy as np
import matplotlib.pyplot as plt
from copulalib.copulalib import Copula
from scipy.stats import norm
plt.style.use('ggplot')
class copulaClass(object):
mick001 / dwg_to_pdf_printing_bot.py
Created September 3, 2017 21:22
Dwg to pdf printing bot.
# Imports
import os
import sys
import time
import psutil
import logging
import pyautogui as pgui
from datetime import datetime
# A simple Markov chain model for the weather in Python
import numpy as np
import random as rm
import time
# Let's define the statespace
states = ["Sunny","Cloudy"]
# Possible sequences of events
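The preview ends before the transition rules are defined. As a rough sketch of how a two-state weather chain can be simulated, the example below uses the same `states` list; the transition probabilities, the function name `simulate_weather`, and the use of NumPy's random generator (rather than the `random` module imported above) are all assumptions for illustration and are not taken from the gist.

```python
import numpy as np

states = ["Sunny", "Cloudy"]

# Hypothetical transition probabilities: row i gives the distribution
# of tomorrow's state given that today's state is states[i].
transition_matrix = np.array([
    [0.8, 0.2],  # Sunny  -> Sunny, Cloudy
    [0.4, 0.6],  # Cloudy -> Sunny, Cloudy
])


def simulate_weather(start, days, rng):
    """Return a list with the start state followed by `days` simulated states."""
    current = states.index(start)
    path = [start]
    for _ in range(days):
        # Draw the next state index using the current state's row
        current = rng.choice(len(states), p=transition_matrix[current])
        path.append(states[current])
    return path


rng = np.random.default_rng(42)
forecast = simulate_weather("Sunny", 5, rng)
print(forecast)
```

Each row of the matrix must sum to 1, since it describes a full probability distribution over the next state.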