Nassim Haddad nassimhaddad

@nassimhaddad
nassimhaddad / sampling_example.R
Last active December 11, 2015 03:39
Sampling with probability distribution
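# generate example: 10,000 draws from 1:10, where 1 has twice the weight of each other value
prob <- c(2, rep(1, 9))
# a single vectorised call replaces growing v one element at a time in a loop
v <- sample(1:10, 10000, replace = TRUE, prob = prob)
# print the output:
print(table(v))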
@nassimhaddad
nassimhaddad / matrix_size_test.R
Last active December 11, 2015 03:39
Matrix size test
nrow <- 100000
ncol <- 100
big_matrix <- matrix(sample(c(0,1,2), nrow * ncol, replace = TRUE), ncol = ncol)
print(object.size(big_matrix), units = "Mb")
# 76.3 Mb
@nassimhaddad
nassimhaddad / download_gist.R
Last active December 11, 2015 03:48
downloads a gist to a file
# downloads a gist
# usage: download_gist("4539629", "4539629.R")
#
# see also: source_gist {devtools}
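download_gist <- function (entry, destfile, ...)
{
  # a bare numeric gist id is expanded to its raw URL
  if (is.numeric(entry) || grepl("^[[:digit:]]+$", entry)) {
    entry <- paste("https://raw.github.com/gist/", entry,
                   sep = "")
  }
  # the listing truncates the function at this point; the lines below are an
  # assumed completion, not the original code
  download.file(entry, destfile, ...)
}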
@nassimhaddad
nassimhaddad / debugging_in_R.md
Last active December 11, 2015 03:48
debugging process in R

Routine:

  1. When an error occurs, the first thing that I usually do is look at the stack trace by calling traceback(): that shows you where the error occurred, which is especially useful if you have several nested functions.
  2. Next I will set options(error=recover); this immediately switches into browser mode where the error occurs, so you can browse the workspace from there.
  3. If I still don't have enough information, I usually use the debug() function and step through
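
The routine above can be tried on a small failing call. This is a minimal sketch, assuming two hypothetical functions, `outer_fn` and `inner_fn`, that are not part of the original notes. The interactive steps are left as comments because `traceback()`, `recover` and `debug()` are mainly useful at the console.

```r
# hypothetical functions, used only to produce an error inside a nested call
inner_fn <- function(x) {
  if (!is.numeric(x)) stop("x must be numeric")
  x * 2
}
outer_fn <- function(x) inner_fn(x)

# capture the error message so a non-interactive script keeps running
result <- tryCatch(outer_fn("a"), error = function(e) conditionMessage(e))
print(result)

# interactive steps (run at the console after an uncaught error):
# 1. traceback()               # show the call stack of the last error
# 2. options(error = recover)  # enter the browser at the failing frame
#    options(error = NULL)     # restore the default behaviour afterwards
# 3. debug(inner_fn)           # step through inner_fn on its next call
#    undebug(inner_fn)         # stop stepping
```

Resetting `options(error = NULL)` and calling `undebug()` afterwards keeps later errors from dropping into the browser unexpectedly.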
@nassimhaddad
nassimhaddad / progressbaR.R
Created January 15, 2013 16:44
example code for a progress bar in R
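# files_list and do_computations() are placeholders for your own data and work
pb <- txtProgressBar(style = 3, min = 0, max = length(files_list))
i <- 0
for (file in files_list){
  i <- i+1
  setTxtProgressBar(pb, i)
  do_computations()
}
# close the bar so the console moves to a new line
close(pb)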
# Set up some example data
year <- sample(1970:2008, 1e6, replace = TRUE)
state <- sample(1:50, 1e6, replace = TRUE)
group1 <- sample(1:6, 1e6, replace = TRUE)
group2 <- sample(1:3, 1e6, replace = TRUE)
myFact <- rnorm(100, 15, 1e6)
weights <- rnorm(1e6)
myDF <- data.frame(year, state, group1, group2, myFact, weights)
system.time({
df <- data.frame(f = 1:4, g = letters[1:4])
df$g <- factor(df$g, levels = letters[4:1])
@nassimhaddad
nassimhaddad / read_from_clipboard.R
Last active December 11, 2015 04:29
read data from clipboard, works with excel
# windows
x <- read.delim(file("clipboard","r"),
header=TRUE,
stringsAsFactors = FALSE)
# mac
data <- read.table(pipe("pbpaste"), sep = "\t", header = TRUE)
# read from and write to clipboard with Kmisc (windows + OS X):
library(Kmisc)
@nassimhaddad
nassimhaddad / get_word_count.R
Created January 16, 2013 08:26
function that counts the number of words (= delimited by " ") in a string.
get_word_count <- function(string){
length(unlist(strsplit(as.character(string), " ")))
}
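
A short usage sketch for the function above. It also shows that splitting on a single space counts the empty string between two consecutive spaces as a word. The variant `get_word_count_ws` is a hypothetical alternative, not part of the original gist, that splits on runs of whitespace instead.

```r
get_word_count <- function(string){
  length(unlist(strsplit(as.character(string), " ")))
}

# hypothetical variant: split on runs of whitespace and drop empty pieces
get_word_count_ws <- function(string){
  words <- unlist(strsplit(trimws(as.character(string)), "[[:space:]]+"))
  sum(nzchar(words))
}

single <- get_word_count("the quick brown fox")  # 4
double <- get_word_count("the  quick")           # 3: the empty piece between the two spaces is counted
robust <- get_word_count_ws("the  quick")        # 2
print(c(single = single, double = double, robust = robust))
```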
@nassimhaddad
nassimhaddad / hist_compare.R
Created January 16, 2013 12:06
compare histograms by plotting their density functions in the same chart
plot(density(data1))
lines(density(data2), col = "blue")
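
A self-contained sketch of the comparison above, assuming made-up normal samples for `data1` and `data2` and an arbitrary seed. It also computes shared axis limits, since the first `plot()` call otherwise fixes the axes and can clip the second curve.

```r
set.seed(1)                             # assumed seed, for reproducibility only
data1 <- rnorm(500, mean = 0)           # made-up sample data
data2 <- rnorm(500, mean = 1, sd = 2)   # made-up sample data

d1 <- density(data1)
d2 <- density(data2)

# shared axis limits so that neither curve is clipped
xlim <- range(d1$x, d2$x)
ylim <- range(d1$y, d2$y)

plot(d1, xlim = xlim, ylim = ylim, main = "Density comparison")
lines(d2, col = "blue")
legend("topright", legend = c("data1", "data2"),
       col = c("black", "blue"), lty = 1)
```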