We have explored 166 695 645 points in parameter space, and that covers only one ordinal pattern (A, B, BC, C) and 117 participants.
For the whole sample (approx. 3 * 117), the estimate grows threefold, to 500 086 935 points.
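As a quick sanity check on that scaling, the whole-sample figure is exactly three times the single-pattern figure:

```r
## sanity check: whole-sample count is exactly 3x the single-pattern count
single_pattern <- 166695645
whole_sample <- 3 * single_pattern
whole_sample
# [1] 500086935
```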
A single run of EXIT is very fast.
## install microbenchmark for much better accuracy than system.time
devtools::install_github("olafmersmann/microbenchmarkCore")
devtools::install_github("olafmersmann/microbenchmark")
library(microbenchmark)
library(ply207) # a package I quickly made for running simulations
## setup all we need for imacEXIT
i <- unique(dome21$ppt)[1]
trial_order <- data[ppt == i]$abstim
phase_index <- data[ppt == i]$phase
phase_index[phase_index == "training"] <- 0
phase_index[phase_index == "test"] <- 2
phase_index <- as.numeric(phase_index)
trials <- data[ppt == i]$trial
train <- dome21train(trial_order = trial_order, phase_index = phase_index,
trials = trials, ppt = i)
## define expression to evaluate by microbenchmark
## imac stands for inequality matrix constructor
expr <- function() {
  imacEXIT(tr = train, stimuli = c("A", "B", "C", "BC", "AB", "AC"))
}
microbenchmark(expr(), times = 100000, unit = "s")
# Unit: seconds
#    expr   min    lq         mean median    uq      max neval
#  expr() 1e-05 2e-05 2.355473e-05  2e-05 2e-05 0.003967 1e+05
This is really good. The model is super fast. Let's look at the pspEXIT
algorithm itself for 10 iterations. pspEXIT does parameter space partitioning
for all trial orders in the data by running imacEXIT. Here, I will only do a
single trial order.
microbenchmark({
  pspEXIT(data = data[ppt == i], stimuli = c("A", "B", "C", "BC"),
          iteration = 10)
}, times = 1)
# Unit: seconds
# min lq mean median uq max neval
# 2.290074 2.290074 2.290074 2.290074 2.290074 2.290074 1
## rough estimate of how many simulations per second we can run on a single core
1 / (2.29 / 10)
# 4.366812
That is roughly 4.37 simulations per second, including all the setup. Keep in mind that this is probably an underestimate: the number of ordinal patterns probably increases with each iteration, so 4.37 is only accurate if every iteration ran exactly 1 simulation.
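To see how much the underestimate could matter, here is a small sketch. The per-iteration simulation count `k` is a hypothetical assumption, not something measured above: if each of the 10 iterations actually ran `k` simulations, the effective throughput scales linearly with `k`.

```r
## hypothetical: k simulations per iteration instead of 1
k <- 1:5
## effective simulations per second under each assumed k
round(k * (1 / (2.29 / 10)), 2)
# [1]  4.37  8.73 13.10 17.47 21.83
```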
all <- 500086935 ## all sampled points
eps <- 1 / 0.229 ## evaluations per second
eps <- eps * 96 ## on 96 core
## calculate seconds on 96 core CPU
cpu <- all / eps
cpu
# [1] 1192916
## calculate the number of days it would take
day <- 24 * 60 * 60 ## number of seconds in a day (hour * minutes * seconds)
cpu / day
# [1] 13.80689
This is only 1 of the 20 simulations. The only thing that remains is to calculate the cost.
ph <- 0.83 ## per hour cost of the vm in £
runtime <- (cpu / day) * 24 # roughly 331 hours
ph * runtime
# [1] 275.0333
£275 seems like a lot, but my hunch is that it is standard for computationally intensive workloads.
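The back-of-the-envelope above can be wrapped into a small helper so the same estimate is easy to rerun for the other simulations. The function name and its defaults are mine for illustration, not part of ply207 or any package used above:

```r
## hedged sketch: runtime and cost estimate for a grid of points on a multi-core VM
## (estimate_cost is illustrative, not from ply207)
estimate_cost <- function(points, secs_per_eval, cores = 96, price_per_hour = 0.83) {
  eps <- (1 / secs_per_eval) * cores  # evaluations per second across all cores
  hours <- points / eps / 3600        # total runtime in hours
  list(hours = hours, cost = hours * price_per_hour)
}

estimate_cost(points = 500086935, secs_per_eval = 0.229)
# hours ~ 331, cost ~ £275, matching the figures above
```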