@markdanese
markdanese / feather_test.R
Last active April 22, 2016 11:35
A test of the new feather package in R using Medicare Part D drug reimbursement data
# load libraries --------------------------------------------------------------------
library(data.table)
library(feather)
# US Part D Drug prices 2013: 500 MB zip, 2.9 GB uncompressed -----------------------
pde_link <- "http://download.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Downloads/PartD_Prescriber_PUF_NPI_DRUG_13.zip"
tf <- tempfile()
library(data.table)
library(microbenchmark)
performanceTest <- function (nrows=100000, ncols=400, quote=TRUE, col.type='character') {
  message(paste0('generating test table, col type ', col.type))
  x <- c(1:nrows)
  dt <- data.table(col1=x)
  col_generators = list(
@daroczig
daroczig / get-data.R
Last active April 4, 2024 20:23
Number of R packages submitted to CRAN
## original idea & report by Henrik Bengtsson at
## https://stat.ethz.ch/pipermail/r-devel/2016-February/072388.html
## This script downloads the list of currently published R packages
## from CRAN and also looks at all the archived package versions to
## combine these into a list of all R packages ever published on
## CRAN with the date of first release.
## CRAN mirror to use
CRAN_page <- function(...) {
@jboner
jboner / latency.txt
Last active May 13, 2024 12:48
Latency Numbers Every Programmer Should Know
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference                          0.5 ns
Branch mispredict                           5   ns
L2 cache reference                          7   ns                      14x L1 cache
Mutex lock/unlock                          25   ns
Main memory reference                     100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy            3,000   ns        3 us
Send 1K bytes over 1 Gbps network      10,000   ns       10 us
Read 4K randomly from SSD*            150,000   ns      150 us          ~1GB/sec SSD