This is a simple probabilistic programming example for A/B testing security products. It's merely a simple starter example... with more to come. You can read more about this at https://www.soluble.ai/blog/security_budgets_at_risk
#' ---
#' title: 3 Metrics For Getting Better Security ROI
#' author: Richard Seiersen
#' date: June 4, 2020
#' output:
#'   html_document:
#'     toc: true
#'     highlight: zenburn
#' ---
#' ##Why did I do this?
#' To answer this question in whole, start here: [Soluble Blog: Security Budgets At Risk](www.soluble.ai/blogs/security-budgets-at-risk)
#'
#' Also, a fellow startup founder was having trouble demonstrating value in POCs. He wanted a way to more concretely
#' demonstrate product effectiveness in terms of overall cost impact. This is the result of that experiment.
#'
#' I make no claims about its absolute effectiveness. But I do believe it beats the competing model, which
#' nine times out of ten is your unaided intuition. What I mean is: while this tool will not answer the question
#' "should I buy this solution or not!?", it will move you in the direction of a mathematically unambiguous and
#' consistent approach to comparing products.
#'
#' ##Point Of View for this Demo
#' 1. When figuring out which security solution to buy you need to consider sticker costs and actual cost.
#' 2. Tools that do the right thing may produce cost - but I consider that the cost of doing business.
#' 3. Tools that make errors create waste in the form of extra re-work and risk (in the case of security).
#' 4. We want to measure all of that - but we emphasize **waste measurement** in this exercise.
#' 5. And we want to do that simply and relatively fast with small data.
#' 6. This script allows you to compare two products under test so you can
#' make a more informed buying choice...even if the test size is tiny.
#' 7. [See Soluble blog for more](www.soluble.ai/blogs/security-budgets-at-risk)
#'
#'
#' ###Software You Will Need
#' * [You need R](https://www.r-project.org/)
#' * [R Studio](https://rstudio.com/products/rstudio/download/)
#'
#' ###Libraries You Will Need
#' There are generally two classes of libraries in use: Bayesian and visualization. I am not necessarily endorsing any of these;
#' they just got the job done.
#' <br>
#' * Bayes: ProbBayes, Bolstad, mc2d (sorta), bayestestR.
#' <br>
#' * Visual: ggplot2, ggdistribute, scales, knitr
#' <br>
# Specify packages, and install any that are missing.
my_packages <- c("ProbBayes", "Bolstad", "ggplot2", "ggdistribute",
                 "mc2d", "bayestestR", "scales", "knitr")
not_installed <- my_packages[!(my_packages %in% installed.packages()[, "Package"])]
if (length(not_installed) > 0){
  msgVal <- paste("Install packages:", toString(not_installed))
  msgRes <- askYesNo(msgVal)
  if (!isTRUE(msgRes)){  # covers both "no" and a cancelled dialog (NA)
    stopMsg <- paste("Quitting because you don't want to install:", toString(not_installed))
    stop(stopMsg)
  }
  install.packages(not_installed)
}
library(ProbBayes)
library(Bolstad)
library(ggplot2)
library(ggdistribute)
library(mc2d)
library(bayestestR)
library(scales)
library(knitr)
#'
#' Run this to get an html version: rmarkdown::render("generative_model_blog.R",output_file = "generative_model_blog_report.html")
#'
#' ##Configuration Elements
#' ### These are the only things you need to play with to run an analysis.
#'
#' You edit these for your particular environment and test size.
#' You do not need to understand any of the code toward the bottom of this page. That would likely be torture.
#' Your job is to change these parameters to fit your particular use case.
#'
#' ##Base Configs: product A and product B defined
#' * Events (a_n_events/b_n_events) consist of errors and non-errors. It's the total set.
#' * An error could be a false positive or a false negative, as described in the article.
#' * These break down into a few **implied** sets. I say implied because you are only
#' tracking the total events and the errors (false positives and negatives):
#'     + Shared set of TRUE events
#'     + Shared set of false positives
#'     + Shared set of false negatives (discovered by some other solution)
#'     + False positives unique to one product
#'     + False negatives unique to one product
#' * If the sum of total events from one product is less than the TRUE events of the other
#' product, the script will throw an error. It means you missed one or more false negatives. That is the only
#' error checking that really matters; you could have countless false positives for one product and not
#' the other. See the worked example below.
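#' As a worked example (hypothetical numbers, not part of the script): suppose product A logs
#' 47 total events with 14 errors, and product B logs 40 total events with 7 errors.
#' ```r
#' a_true <- 47 - 14  # 33 TRUE events for product A
#' b_true <- 40 - 7   # 33 TRUE events for product B
#' a_true - b_true    # 0, so nothing to reconcile; a positive difference would be
#'                    # added to the lagging product's error count as false negatives
#' ```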
# These parameters set the basis for the analysis in terms of simulations, events and timeframe
base_configs <- list()
base_configs["n_draws"] <- 100000 # Number of simulations; 100K default - no need to change really
base_configs["a_n_events"] <- 47 # Total events for the test, like 100 vulns, phishing events, etc. - could be 10
base_configs["b_n_events"] <- 40 # Total events for the test, like 100 vulns, phishing events, etc. - could be 10
base_configs["time_multiple"] <- 12 # Time period: if a month then use 12, if a day use 365
# Beliefs about error rates prior to testing
event_priors <- list()
event_priors["median_errors"] <- 0.20 # True rate is just as likely above/below this
event_priors["edge_errors"] <- 0.45 # 90% sure true value is below this
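#' Behind the scenes these two numbers become a beta prior via ProbBayes::beta.select,
#' which solves for the beta shape parameters matching the 50th and 90th percentiles.
#' A quick illustrative sketch (the script does this inside GetBeliefsEvents below):
#' ```r
#' # Find the beta(a, b) whose median is 0.20 and whose 90th percentile is 0.45
#' shapes <- ProbBayes::beta.select(list(p = 0.5, x = 0.20), list(p = 0.9, x = 0.45))
#' qbeta(c(0.025, 0.5, 0.975), shapes[1], shapes[2])  # implied 95% credible range
#' ```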
# What were the real error counts after testing? They're in relation to a_n_events/b_n_events in base_configs above.
# This includes false positives and discoverable false negatives (deltas found between the solutions).
# Note, the solutions will also correct for some of the false negative errors dynamically.
event_counts <- list()
event_counts["product_a_errors"] <- 14 # After conducting the test, the real errors for prod A
event_counts["product_b_errors"] <- 7 # After conducting the test, the real errors for prod B
# Forecasted range of error response hours that can happen per event
eng_hour_range <- list()
eng_hour_range["mode"] <- 3 # Expected hours
eng_hour_range["low"] <- 1 # Low hours
eng_hour_range["high"] <- 5 # Max hours
# Forecasted cost of responding to an error event for ONE HOUR
eng_cost_range <- list()
eng_cost_range["mode"] <- 600 # Expected cost per hour
eng_cost_range["low"] <- 200 # Low end cost per hour
eng_cost_range["high"] <- 2000 # High end cost per hour
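#' Both ranges are fed to mc2d::rpert, a PERT distribution that piles mass around the mode
#' while respecting the min/max. A small illustrative draw (GetEngCost below does the same,
#' multiplying simulated hours by the simulated hourly cost):
#' ```r
#' hours <- rpert(5, min = 1, mode = 3, max = 5, shape = 4)        # response hours per error
#' rate  <- rpert(5, min = 200, mode = 600, max = 2000, shape = 4) # cost per response hour
#' hours * rate  # five simulated per-error response costs
#' ```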
# Year 1 deterministic cost of product A & B license, subscription, etc.
product_costs <- list()
product_costs["prod_a_cost"] <- 72000.00
product_costs["prod_b_cost"] <- 65000.00
product_costs["prod_a_run"] <- 10000 # There could be a material cost to operate product A
product_costs["prod_b_run"] <- 10000 # Perhaps product B is less expensive because it's SaaS
#' ##Simplifying Functions
#' These are tiny functions that do computations, draw graphs, or print data to the screen.
#'
# Check error inputs; reconcile false negatives or quit if the inputs are inconsistent.
ErrorCheck <- function(base_configs, event_counts){
  a_total <- as.numeric(base_configs["a_n_events"])
  a_errors <- as.numeric(event_counts["product_a_errors"])
  b_total <- as.numeric(base_configs["b_n_events"])
  b_errors <- as.numeric(event_counts["product_b_errors"])
  # Remove errors to get the true events
  a_true <- a_total - a_errors
  b_true <- b_total - b_errors
  #cat(paste("A Total:", a_total, "A Errors:", a_errors, "A True:", a_true, "\n"))
  #cat(paste("B Total:", b_total, "B Errors:", b_errors, "B True:", b_true, "\n"))
  # Is product A/B missing errors - false negatives specifically?
  if( a_true < b_true ){
    diff <- b_true - a_true
    cat(paste("Adding ", diff, " FALSE NEGATIVE errors to product A for a new total of: ", diff + a_errors, "\n"))
    return(c("product_a_errors", diff + a_errors))
  }
  if( b_true < a_true ){
    diff <- a_true - b_true
    cat(paste("Adding ", diff, " FALSE NEGATIVE errors to product B for a new total of: ", diff + b_errors, "\n"))
    return(c("product_b_errors", diff + b_errors))
  }
  if( b_total < a_true ){
    diff <- a_true - b_total
    cat(paste("Product B is missing", diff, "false positive errors."))
    stop("Exiting")
  }
  if( a_total < a_errors ){
    stop("Exiting: Your errors for product A exceed your total events for product A")
  }
  if( b_total < b_errors ){
    stop("Exiting: Your errors for product B exceed your total events for product B")
  }
  return(c("No Change", 0))
}
CatTotalCost <- function(total_costs, product_costs){
  # Get cost ratio
  prod_a_total_costs <- total_costs[1]
  prod_b_total_costs <- total_costs[2]
  prod_ratio <- round( prod_a_total_costs / prod_b_total_costs, 3) * 100
  prod_ratio <- paste(prod_ratio, "%", sep = "")
  # Format as dollars
  prod_a_original_cost <- dollar_format()(as.numeric(product_costs["prod_a_cost"]))
  prod_a_total_final <- dollar_format()(round(as.numeric(prod_a_total_costs), 2))
  prod_b_original_cost <- dollar_format()(as.numeric(product_costs["prod_b_cost"]))
  prod_b_total_final <- dollar_format()(round(as.numeric(prod_b_total_costs), 2))
  # Print message to stdout
  cat("--------Which Product Costs Most To Operate Given Forecasted Errors-------\n")
  cat(paste(" Product A is", prod_ratio, "the total cost of product B\n"))
  cat(paste(" Product A (with base price of ", prod_a_original_cost, ") is expected to cost ", prod_a_total_final, " to operate with errors\n"))
  cat(paste(" Product B (with base price of ", prod_b_original_cost, ") is expected to cost ", prod_b_total_final, " to operate with errors\n"))
  cat("\n Still uncertain? Get more data by running more tests!")
  return(c(prod_ratio, prod_a_original_cost, prod_a_total_final, prod_b_original_cost, prod_b_total_final))
}
GraphErrorImpact <- function(posterior, impact){
  newdf_a_impact <- data.frame(posterior = posterior$prod_a_impact, product = "Product A")
  newdf_b_impact <- data.frame(posterior = posterior$prod_b_impact, product = "Product B")
  newdf_ab_impact <- rbind(newdf_a_impact, newdf_b_impact)
  impact_text <- paste(" Product A is", impact[1], "the total cost of product B\n Product A (with base price of ", impact[2], ") is expected to cost ", impact[3], " to operate with errors\n Product B (with base price of ", impact[4], ") is expected to cost ", impact[5], " to operate with errors\n")
  print(ggplot(newdf_ab_impact, aes(x = posterior, y = ..density.., fill = product)) +
          stat_density_ci(alpha = 0.5, n = 1024, geom = "density", position = "identity") +
          scale_fill_manual(values = c("#333399", "#CCCCFF")) +
          labs(title = "FORECASTING APPSEC ERROR IMPACT\nUsing Bayesian A/B Testing\n",
               x = "Using Small Samples: Predicted Difference In Product Error Impacts",
               subtitle = impact_text,
               y = "Product Error Comparisons"))
}
CatErrorDiffProp <- function(posterior){
  error_rate_diff <- sum(posterior$prop_diff_events > 0) / length(posterior$prop_diff_events)
  cat(paste("--------Error Rate Difference For This Simulation--------\n Probability that product A makes more errors than product B is ", round(error_rate_diff * 100), "%\n\n", sep = ""))
}
GraphErrorDiff <- function(posterior){
  xval <- posterior$prop_diff_events
  breaks <- pretty(range(xval), n = nclass.FD(xval), min.n = 1)
  bwidth <- breaks[2] - breaks[1]
  print(ggplot(posterior, aes(x = prop_diff_events)) +
          geom_histogram(aes(y = ..density..), colour = "black", fill = "white", binwidth = bwidth) +
          geom_density(alpha = .2, fill = "#333399") +
          labs(title = "Event Amount Difference: Product A - Product B", x = "Event Rate", y = "Amount") +
          geom_vline(xintercept = 0, color = "red", size = 1, linetype = "dotted"))
}
GraphErrorRates <- function(posterior){
  newdf_a <- data.frame(posterior = posterior$prod_a, product = "Product A")
  newdf_b <- data.frame(posterior = posterior$prod_b, product = "Product B")
  newdf_ab <- rbind(newdf_a, newdf_b)
  print(ggplot(newdf_ab, aes(x = posterior, y = ..density.., fill = product)) +
          stat_density_ci(alpha = 0.5, n = 1024, geom = "density", position = "identity") +
          scale_fill_manual(values = c("#333399", "#CCCCFF")) +
          labs(title = "Product A/B Testing",
               x = "Using Small Samples: Predicted Difference In Product Error Rates",
               y = "Product Comparisons"))
}
CatUpdatedBeliefRates <- function(posterior, event_counts, base_configs){
  out <- "--------Updated Forecasts Given Small Data, Beliefs & Lots of Simulations--------------\n"
  # 89% Highest Density Interval for product A
  prod_a_points <- point_estimate(posterior$prod_a)
  ci_hdi_prod_a <- ci(posterior$prod_a, method = "HDI")
  out <- paste(out, " The previous product A forecast combined with ", base_configs["a_n_events"], " events and ", event_counts["product_a_errors"], " errors is between:",
               round(ci_hdi_prod_a$CI_low, 2) * 100, "% and ",
               round(ci_hdi_prod_a$CI_high, 2) * 100, "% \n And centered near:",
               round(prod_a_points[[1]], 2) * 100, "%\n\n", sep = "")
  # 89% Highest Density Interval for product B
  prod_b_points <- point_estimate(posterior$prod_b)
  ci_hdi_prod_b <- ci(posterior$prod_b, method = "HDI")
  out <- paste(out, " The previous product B forecast combined with ", base_configs["b_n_events"], " events and ", event_counts["product_b_errors"], " errors is between:",
               round(ci_hdi_prod_b$CI_low, 2) * 100, "% and ",
               round(ci_hdi_prod_b$CI_high, 2) * 100, "% \n And centered near:",
               round(prod_b_points[[1]], 2) * 100, "%\n\n", sep = "")
  out <- paste(out, "See the markdown report or the plot tab in RStudio to see forecasts combined with data.\n")
  cat(out)
}
CatPriorBeliefRates <- function(event_priors){
  # Print the beta prior curve
  #print(MakeCurves(event_priors[["median_errors"]], event_priors[["edge_errors"]], color = "Purple", CI = TRUE, CIT = TRUE))
  xval <- "Event Probability"
  yval <- "Event Strength"
  tval <- "Credible Beliefs About Events"
  sval <- "With 95% Credible Belief Interval"
  beta_vals <- GetBeliefsEvents(event_priors[["median_errors"]], event_priors[["edge_errors"]])
  print(ggplotBeta(beta_vals[1], beta_vals[2], xlab = xval, ylab = yval, tlab = tval, sval = sval))
  # Get the beta prior credible intervals
  priorCI <- GetBetaCI(event_priors["median_errors"], event_priors["edge_errors"])
  cat("------Prior Beliefs About Error Rates--------\n")
  cat(paste(" You forecasted an ERROR RATE equally ABOVE/BELOW ", round(event_priors[["median_errors"]] * 100), "%.\n", sep = ""))
  cat(paste(" You're also 90% sure the error rate is LESS THAN ", round(event_priors[["edge_errors"]] * 100), "%.\n", sep = ""))
  cat(paste(" Given that forecast and the model we are using, you believe the true rate likely sits between ", priorCI[1], "% and ", priorCI[3], "%\n\n", sep = ""))
}
GetTotalCosts <- function(impacts, costs, time_multiple, event_counts, prior_events, base_configs){
  product_a_counts <- event_counts[[1]]
  product_b_counts <- event_counts[[2]]
  # Gamma prior for product A, fed into the Poisson model in GetEventFrequency()
  a_number_events <- as.numeric(base_configs$a_n_events)
  a_gamma_median_prior <- prior_events[[1]] * a_number_events
  a_gamma_sd <- prior_events[[2]] * a_number_events
  # Gamma prior for product B, fed into the Poisson model in GetEventFrequency()
  b_number_events <- as.numeric(base_configs$b_n_events)
  b_gamma_median_prior <- prior_events[[1]] * b_number_events
  b_gamma_sd <- prior_events[[2]] * b_number_events
  prod_a_event_results <- GetEventFrequency(product_a_counts, a_gamma_median_prior, a_gamma_sd)
  prod_a_mean_events <- prod_a_event_results$mean
  prod_a_event_totals <- prod_a_mean_events * as.numeric(time_multiple)
  prod_b_event_results <- GetEventFrequency(product_b_counts, b_gamma_median_prior, b_gamma_sd)
  prod_b_mean_events <- prod_b_event_results$mean
  prod_b_event_totals <- prod_b_mean_events * as.numeric(time_multiple)
  prod_a_total_cost <- prod_a_event_totals * impacts$prod_a_impact + costs$prod_a_cost + costs$prod_a_run
  prod_b_total_cost <- prod_b_event_totals * impacts$prod_b_impact + costs$prod_b_cost + costs$prod_b_run
  return(c(prod_a_total_cost, prod_b_total_cost))
}
GetEventFrequency <- function(event_counts, avg_event_count, event_variation, graph = FALSE){
  # event_counts can be a vector of results, perhaps events per day, week, month, etc.
  # The mode and sd are converted into gamma shape and rate parameters.
  event_count_priors <- GetRatePriors(avg_event_count, event_variation, graph)
  result <- poisgamp(event_counts, event_count_priors[1], event_count_priors[2], plot = FALSE, suppressOutput = TRUE)
  return(result)
}
GetRatePriors <- function(mode, sd, graph = FALSE){
  # This function allows you to think in terms of mode and SD.
  # Note that everything is positive, so while the mode might be a small number close to 0,
  # the SD can be large, skewing far to the right without going past zero on the left.
  # Here are the corresponding rate and shape parameter values:
  ra <- ( mode + sqrt( mode^2 + 4*sd^2 ) ) / ( 2 * sd^2 )
  sh <- 1 + mode * ra
  if(graph){
    x <- seq(0, mode + 5*sd, len = 1001)
    plot( x, dgamma( x, shape = sh, rate = ra ), type = "l",
          main = paste("dgamma, mode=", mode, ", sd=", sd, sep = ""),
          ylab = paste("dgamma( shape=", signif(sh, 3), " , rate=", signif(ra, 3), " )", sep = "") )
    abline( v = mode, lty = "dotted" )
  }
  return(c(sh, ra))
}
GetEngCost <- function(size, eng_cost_pert, eng_hour_priors){
  mode <- as.numeric(eng_cost_pert[["mode"]])
  low <- as.numeric(eng_cost_pert[["low"]])
  high <- as.numeric(eng_cost_pert[["high"]])
  mode_hrs <- as.numeric(eng_hour_priors[["mode"]])
  low_hrs <- as.numeric(eng_hour_priors[["low"]])
  high_hrs <- as.numeric(eng_hour_priors[["high"]])
  eng_rate <- rpert(size, min = low, mode = mode, max = high, shape = 4)
  eng_hours <- rpert(size, min = low_hrs, mode = mode_hrs, max = high_hrs, shape = 4)
  eng_cost <- eng_hours * eng_rate
  return(eng_cost)
}
GetErrors <- function(n_draws, n_events, proportion_events){
  n_errors <- rbinom(n = n_draws, size = n_events, prob = proportion_events)
  return(n_errors)
}
GetProportionOfEvents <- function(n_draws, prior_events){
  error_beliefs <- GetBeliefsEvents(prior_events[["median_errors"]], prior_events[["edge_errors"]])
  # Make an n_draws-length vector of error-rate draws from the beta prior
  proportion_events <- rbeta(as.numeric(n_draws), as.numeric(error_beliefs[[1]]), as.numeric(error_beliefs[[2]]))
  return(proportion_events)
}
GetBeliefsHits <- function(hits, max_hits, events){
  alpha_ratio <- as.numeric(hits) / as.numeric(events)
  beta_ratio <- as.numeric(max_hits) / as.numeric(events)
  q1 <- list(p = .5, x = alpha_ratio)
  q2 <- list(p = .9, x = beta_ratio)
  return( beta.select(q1, q2) )
}
GetBeliefsEvents <- function(belief, max_belief){
  alpha_ratio <- as.numeric(belief)
  beta_ratio <- as.numeric(max_belief)
  q1 <- list(p = .5, x = alpha_ratio)
  q2 <- list(p = .9, x = beta_ratio)
  return( beta.select(q1, q2) )
}
SimulateHits <- function(freq_hits, num_events){
  freq_miss <- 1 - freq_hits
  res <- sample(c("hit", "miss"), size = num_events, replace = TRUE,
                prob = c(freq_hits, freq_miss))
  hit <- sum(res == "hit")
  miss <- sum(res == "miss")
  return( c(hit, miss) )
}
GetBetaCI <- function(hit, edge){
  betas <- GetBeliefsEvents(hit, edge)
  intervals <- qbeta(c(.025, .5, .975), betas[1], betas[2])
  intervals <- round(intervals * 100)
  return(intervals)
}
MakeCurves <- function(alpha = 0, beta = 0, color = "black", ytop = 20, CI = FALSE, CIT = FALSE, add_val = FALSE){
  alpha <- alpha + 1
  beta <- beta + 1
  curve(dbeta(x, alpha, beta), from = 0, to = 1,
        xlab = "SME Forecast For Proportion of Events", ylab = "Density",
        col = color, lwd = 4, las = 1, add = add_val, ylim = c(0, ytop))
  if( CI ){
    intervals <- qbeta(c(.025, .975), alpha, beta)
    abline(v = intervals[1], lty = 2)
    abline(v = intervals[2], lty = 2)
    if( CIT ){
      adj <- 0
      if( intervals[2] - intervals[1] < 0.11 ){
        adj <- 1
      }
      text(intervals[1] + 0.03, ytop, round(intervals[1], 2))
      text(intervals[2] + 0.03, ytop - adj, round(intervals[2], 2))
    }
  }
}
ggplotBeta <- function(alpha, beta, xlab = xval, ylab = yval, tlab = "Beliefs", sval = "Test"){
  x <- seq(0, 1, 0.01)
  y <- dbeta(x, shape1 = alpha, shape2 = beta)
  density <- data.frame(x = x, y = y)
  mean <- alpha / (alpha + beta)
  mode <- (alpha - 1) / (alpha + beta - 2)
  intervals <- qbeta(c(.025, .975), alpha, beta)
  res <- ggplot(density, aes(x, y)) +
    geom_area(fill = "darkslateblue", alpha = 0.6) +
    theme_minimal() +
    theme(axis.text.x = element_text(angle = 90, hjust = 1),
          axis.text.y = element_blank()) +
    geom_vline(xintercept = intervals[1]) +
    geom_vline(xintercept = intervals[2]) +
    labs(x = xlab, y = ylab, title = tlab, subtitle = sval)
  return(res)
}
GetSize <- function(size, first, second){
  res <- ifelse(length(first) >= length(second), length(second), length(first))
  if( size > res ){
    return(res)
  }
  return(size)
}
GetEngHours <- function(size, med = 10, ninety = 60, total_hours = 100){
  beta_vals <- GetBeliefsHits(med, ninety, total_hours)
  prop_hours <- rbeta(size, beta_vals[1], beta_vals[2])
  hours <- rbinom(size, total_hours, prop_hours)
  return(hours)
}
#' ##Wrapper Function
#' This just makes it easy for users to call one function without much fuss.
runProductCostComparison <- function(base_configs, event_priors, event_counts, eng_hour_range, eng_cost_range, product_costs){
  # A quick check for missed false negatives between products and other sundry.
  update_count <- ErrorCheck(base_configs, event_counts)
  if( as.numeric(update_count[2]) > 0 ){
    event_counts[update_count[[1]]] <- as.numeric(update_count[[2]])
  }
  # Prior beliefs about error rates: the event_priors config element above specifies what
  # you think the likely error rate is given N events. The graph shows the spread of
  # possible rates given those beliefs, and where the most likely rates cluster.
  CatPriorBeliefRates(event_priors)
  # Number of simulations (draws), tests (events), and default array index sizes
  n_draws <- as.numeric(base_configs[["n_draws"]])
  a_n_events <- as.numeric(base_configs[["a_n_events"]])
  b_n_events <- as.numeric(base_configs[["b_n_events"]])
  size <- 10000
  # Create a list of simulated events and error RATES based on priors
  proportion_events <- GetProportionOfEvents(n_draws, event_priors)
  a_n_errors <- GetErrors(n_draws, a_n_events, proportion_events)
  b_n_errors <- GetErrors(n_draws, b_n_events, proportion_events)
  # Put simulations into data frames
  a_prior <- data.frame(proportion_events, a_n_errors)
  b_prior <- data.frame(proportion_events, b_n_errors)
  # Keep only the simulated prior draws whose error counts match the observed product errors
  # (rejection sampling: this conditions the prior on the data)
  product_a <- a_prior[a_prior$a_n_errors == event_counts[["product_a_errors"]], ]
  product_b <- b_prior[b_prior$b_n_errors == event_counts[["product_b_errors"]], ]
  # Resample to create indices for the posterior and load into a data frame
  sample_a_indices <- sample( nrow(product_a), size = size, replace = TRUE, prob = product_a$proportion_events)
  sample_b_indices <- sample( nrow(product_b), size = size, replace = TRUE, prob = product_b$proportion_events)
  posterior <- data.frame(
    prod_a = product_a$proportion_events[sample_a_indices],
    prod_b = product_b$proportion_events[sample_b_indices])
  # Updated beliefs given real error rates
  CatUpdatedBeliefRates(posterior, event_counts, base_configs)
  # Create and print error rate graphs
  GraphErrorRates(posterior)
  # Calculate the rate differences, graph and print out
  posterior$prop_diff_events <- posterior$prod_a - posterior$prod_b
  GraphErrorDiff(posterior)
  CatErrorDiffProp(posterior)
  # This gets our impact per event with lots of uncertainty packed in
  eng_cost <- GetEngCost(size, eng_cost_range, eng_hour_range)
  # Add forecasted expected cost columns to the posterior for products A and B
  posterior$prod_a_impact <- posterior$prod_a * eng_cost
  posterior$prod_b_impact <- posterior$prod_b * eng_cost
  # Load median values into a list for use later
  impacts <- list()
  impacts["prod_a_impact"] <- median(posterior$prod_a_impact)
  impacts["prod_b_impact"] <- median(posterior$prod_b_impact)
  # Add the column that holds the difference between the two costs
  posterior$prod_diff_impact <- posterior$prod_a_impact - posterior$prod_b_impact
  print("Here is a glimpse into some of the data.")
  print(head(posterior, 10))
  # Total cost differences
  total_costs <- GetTotalCosts(impacts, product_costs, base_configs["time_multiple"], event_counts, event_priors, base_configs)
  impact_res <- CatTotalCost(total_costs, product_costs)
  # Graph impact difference
  GraphErrorImpact(posterior, impact_res)
}
#' ## Run Program, Get Graphs, Data Output and The Code
#' This will spit out analytics and some graphs.
runProductCostComparison(base_configs,event_priors, event_counts, eng_hour_range, eng_cost_range, product_costs)
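#'
#' To explore sensitivity, change the observed error counts (hypothetical values shown) and rerun:
#' ```r
#' event_counts["product_a_errors"] <- 10
#' event_counts["product_b_errors"] <- 9
#' runProductCostComparison(base_configs, event_priors, event_counts,
#'                          eng_hour_range, eng_cost_range, product_costs)
#' ```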