ribsy/ABTestingGenerativeModel.R

## ABTestingGenerativeModel.R


#' ---
#' title: 3 Metrics For Getting Better Security ROI
#' author: Richard Seiersen
#' date: June 4, 2020
#' output:
#'    html_document:
#'      toc: true
#'      highlight: zenburn
#' ---


#' ##Why did I do this?
#' To answer this question in whole start here: [Soluble Blog: Security Budgets At Risk](www.soluble.ai/blogs/security-budgets-at-risk)
#'
#' Also, a fellow startup founder was having trouble demonstrating value in POCs. He wanted a way to more concretely
#' demonstrate product effectiveness in terms of overall costs impact. This is the result of that experiment.
#'
#' I make no claims about its absolute effectivness. But, I do believe it beats the competing model which is
#' nine time out of ten you unaided intuition. What I mean is, while this tool will not answer the question,
#' "should I buy this solution or not!?" It will move you in the direction of a mathematically unambigous and
#' consistent approach to comparing some products.
#'
#' ##Point Of View for this Demo
#' 1. When figuring out which security solution to buy you need to consider sticker costs and actual cost.
#' 2. Tools that do the right thing may produce cost - but I consider that the cost of doing business.
#' 3. Tools that make errors create waste in the form of extra re-work and risk (in the case of security)
#' 4. We want to measure all of that - but we emphasize **waste measurement** in this excercise.
#' 5. And we want to do that simply and relatively fast with small data.
#' 6. This script allows you to compare two products under test so you can
#'    make a more informed buying choice...even if the test size is tiny.
#' 7. [See Soluble blog for more](www.soluble.ai/blogs/security-budgets-at-risk)
#'
#'
#' ###Software You Will Need
#' * [You need R](https://www.r-project.org/)
#' * [R Studio](https://rstudio.com/products/rstudio/download/)
#'
#' ###Libraries You Will Need
#' There are generally two classes of libraries in use: Bayesian and Visualization. I am not necessarily endorsing any of these,
#' they just got the job done.
#' <br>
#' * Bayes: ProbBayes, Bolstad, mc2d (sorta), bayestestR.
#' <br>
#' * Visual: ggplot2, ggdistribute, scales, knitr
#' <br>

# Specify packages, and install missing ones.
my_packages <- c("ProbBayes", "Bolstad", "ggplot2", "ggdistribute",
                 "mc2d", "bayestestR", "scales", "knitr")
not_installed <- my_packages[!(my_packages %in% installed.packages()[ , "Package"])]

if (length(not_installed)){
  msgVal = paste("Install packages: ",toString(not_installed))

  msgRes <- askYesNo(msgVal)
  if( msgRes != TRUE){
      stopMsg <- paste("Quiting because you don't want to install:", not_installed)
      stop(stopMsg)
  }
}
if(length(not_installed)) install.packages(not_installed)

library(ProbBayes)
library(Bolstad)
library(ggplot2)
library(ggdistribute)
library(mc2d)
library(bayestestR)
library(scales)
library(knitr)

#'
#' Run this to get an html version: rmarkdown::render("generative_model_blog.R",output_file = "generative_model_blog_report.html")
#'
#' ##Configuration Elements
#' ### These are the only things you need to play with to run an anaysis.
#'
#' You edit these for your particular environment and test size.
#' You do not need to understand any of the code toward the bottom of this page. That would likely be torture
#' Your job is to change these parameters to fit your particular use case.
#'
#' ##Base Configs product A and product B defined
#' * Events (a_n_events/b_n_events) consists of errors and non-errors. It's the total set
#' * An error could be a false positive or a false negative as described in the article
#' * These break down into a few **implied** sets.  I say implied because you are only
#' tracking the total events and the errors (false positives and negatives)
#'     + Shared set of TRUE events
#'     + Shared set of False Positives
#'     + Shared set of False Negatives (discovered by some other solution)
#'     + False Positive unique to one product
#'     + False Negatives unique to one product
#' * IF the sum total events from one product are less than the TRUE events of another product
#' the script will throw an error. It means you missed one or more false negatives. That is the only
#' error checking that really matters. You could have countless false positives for one product and not
#' the other.
# These parameters set the basis for the analyssis in terms of simulations, events and timeframe
base_configs <- list()
base_configs["n_draws"] <- 100000    # Number for simulations 100K default - no need to change really
base_configs["a_n_events"] <- 47      # Total events for test like 100 vulns, phising events etc - could be 10
base_configs["b_n_events"] <- 40      # Total events for test like 100 vulns, phising events etc - could be 10
base_configs["time_multiple"] <- 12  # Time period: If a month then use 12, if a day use 365

# Beliefs about error counts prior to testing
event_priors <- list()
event_priors["median_errors"] <- 0.20  # True rate is just as likely above/below this
event_priors["edge_errors"] <- 0.45    # 90% sure true value is below this

# What was the real count after testing. it's in relation to base_configs["n_events"] above.
# This includes false positives and discoverable false negatives (deltas found between the solutions)
# Note, the solutions also will correct for some of the false negative errors dynamically.
event_counts <- list()
event_counts["product_a_errors"] <- 14  # After conducting test, what are the real errors for prod a
event_counts["product_b_errors"] <- 7  # After conducting test, what are the real errors for prod b

# Forecasted range of error response hours that can happen per event
eng_hour_range <- list()
eng_hour_range["mode"] <- 3  # Expect hours
eng_hour_range["low"] <- 1   # Low hours
eng_hour_range["high"] <- 5 # Max hours

# Forecasted cost of responding to an error event for ONE HOUR
eng_cost_range <- list()
eng_cost_range["mode"] <- 600  # Expected cost per event
eng_cost_range["low"] <- 200    # Low end cost per event
eng_cost_range["high"] <- 2000  # High end cost per event

#Year 1 Deterministic Cost of Product A & B license or subscrption etc
product_costs <- list()
product_costs["prod_a_cost"] <- 72000.00
product_costs["prod_b_cost"] <- 65000.00
product_costs["prod_a_run"] <- 10000      # There could be a material cost to operate product A
product_costs["prod_b_run"] <- 10000      # Perhaps product B is less expensive becuase its SaaS


#' ##Simplifying Functions
#' This are tiny functions that do computations, graphs or print data to screen.
#'

# Check error inputs and quit if not correct
ErrorCheck <- function(base_configs, event_counts){

  a_total <- as.numeric(base_configs["a_n_events"])
  a_errors <- as.numeric(event_counts["product_a_errors"])
  b_total <- as.numeric(base_configs["b_n_events"])
  b_errors <- as.numeric(event_counts["product_b_errors"])

  #Remove errors to get on true events
  a_true <- a_total - a_errors
  b_true <- b_total - b_errors
  #cat(paste("A Total:",a_total,"A Errors:", a_errors,"A True:", a_true, "\n"))
  #cat(paste("B Total:",b_total,"A Errors:", b_errors,"B True:", b_true, "\n"))

  #Is Prod A/B Missing errors - false negatives specifically?
  if( a_true < b_true ){
      diff <- b_true - a_true
      cat(paste("Adding ", diff ," FALSE NEGATIVE errors to product A for a new total of: ",diff + a_errors,"\n"))

      return(c("product_a_errors", diff + a_errors))
  }

  if( b_true < a_true ){
    diff <- a_true - b_true
    cat(paste("Adding ", diff ," FALSE NEGATIVE errors to product B for a new total of: ",diff + b_errors,"\n"))
    return(c("product_b_errors", diff + b_errors))
  }
  if( b_total < a_true ){
      diff <- a_true - b_total
      cat(paste("Product B is missing", diff, "false positive errors."))
      stop("Exiting")
  }
  if( a_total < a_errors){
      stop("Exiting: Your errors for product A exceed your total events for product A")
  }
  if( b_total < b_errors){
    stop("Exiting: Your errors for product B exceed your total events for product B")
  }
  return(c("No Change", 0))
}

CatTotalCost <- function(total_costs, product_costs){

  #Get Cost Ratios
  prod_a_total_costs <- total_costs[1]
  prod_b_total_costs <- total_costs[2]
  prod_ratio <- round( prod_a_total_costs / prod_b_total_costs, 3) * 100
  prod_ratio <- paste(prod_ratio, "%", sep = "")

  #Format as dollars
  prod_a_orginal_cost <- dollar_format()(as.numeric(product_costs["prod_a_cost"]))
  prod_a_total_final <- dollar_format()(round(as.numeric(prod_a_total_costs),2))
  prod_b_orginal_cost <- dollar_format()(as.numeric(product_costs["prod_b_cost"]))
  prod_b_total_final <- dollar_format()(round(as.numeric(prod_b_total_costs),2))

  #Print Message to stdout

  cat("--------Which Product Cost Most To Operate Given Forecasted Errors-------\n")
  cat(paste(" Product A is", prod_ratio , "the total cost of product B\n"))
  cat(paste(" Product A (with base price of ", prod_a_orginal_cost ,") is expected to cost to operate with errors: ", prod_a_total_final,"\n"))
  cat(paste(" Product B (with base price of ", prod_b_orginal_cost,") is expected to cost to operate with errors: ", prod_b_total_final,"\n"))
  cat("\n Still uncertain? Get more data by running more tests!")

  return(c(prod_ratio, prod_a_orginal_cost, prod_a_total_final, prod_b_orginal_cost, prod_b_total_final))
  #print(knitr::kable(out))
}

GraphErrorImpact <- function(posterior, impact){
  newdf_a_impact <- data.frame(posterior = posterior$prod_a_impact, product = "Product A")
  newdf_b_impact <- data.frame(posterior = posterior$prod_b_impact, product = "Product B")
  newdf_ab_impact <- rbind(newdf_a_impact, newdf_b_impact)
  impact_text <- paste(" Product A is", impact[1] , "the total cost of product B\n Product A (with base price of ", impact[2] ,") is expected to cost to operate with errors: ", impact[3],"\n Product B (with base price of ", impact[4],") is expected to cost to operate with errors: ", impact[5],"\n")


  print(ggplot(newdf_ab_impact, aes(x = posterior, y = ..density.., fill = product)) +
          stat_density_ci(alpha = 0.5, n = 1024, geom = "density",position = "identity") +
          scale_fill_manual(values=c("#333399", "#CCCCFF"))  +
          labs(title = "FORECASTING APPSEC ERROR IMPACT\nUsing Bayesian A/B Testing\n", x="Using Small Samples: Predicted Difference In Product Error Impacts",
               subtitle = impact_text,
               y = "Product Error Comparisons"))
}

CatErrorDiffProp <- function(posterior){
  error_rate_diff <- sum(posterior$prop_diff_events > 0 )/length(posterior$prop_diff_events)
  cat(paste("--------Error Rate Difference For This Simmulation--------\n Probability that product A makes more errors that Product B is ",round(error_rate_diff * 100) ,"%\n\n", sep = ""))
}

GraphErrorDiff <- function(posterior){

  xval <- posterior$prop_diff_events
  breaks <- pretty(range(xval), n = nclass.FD(xval), min.n = 1)
  bwidth <- breaks[2]-breaks[1]
  print(ggplot(posterior, aes(x=prop_diff_events)) +
          geom_histogram(aes(y=..density..), colour="black", fill="white", binwidth = bwidth)+
          geom_density(alpha=.2, fill="#333399") + labs(title="Event Amount Difference: Product A - Product B",x="Event Rate", y = "Amount")+
          geom_vline(xintercept=0, color="red", size=1, linetype = "dotted")
  )
}

GraphErrorRates <- function(posterior){

  newdf_a <- data.frame(posterior = posterior$prod_a, product = "Product A")
  newdf_b <- data.frame(posterior = posterior$prod_b, product = "Product B")
  newdf_ab <- rbind(newdf_a, newdf_b)

  print(ggplot(newdf_ab, aes(x = posterior, y = ..density.., fill = product)) +
          stat_density_ci(alpha = 0.5, n = 1024, geom = "density",position = "identity") +
          scale_fill_manual(values=c("#333399", "#CCCCFF"))  +
          labs(title = "Product A/B Testing", x="Using Small Samples: Predicted Difference In Product Error Rates",
               y = "Product Comparisons"))
}

CatUpdatedBeliefRates <- function(posterior, event_counts, base_configs){

  out <- "--------Updated Forecasts Given Small Data, Beliefs & Lots of Simulations--------------\n"

  # 89% Highest Density Intervals
  prod_a_points <- point_estimate(posterior$prod_a)
  ci_hdi_prod_a <- ci(posterior$prod_a, method = "HDI")
  out <- paste(out, " The previous product A forecast combined with ", base_configs["a_n_events"] ," events and ", event_counts["product_a_errors"] ," errors is between:",
               round(ci_hdi_prod_a$CI_low,2) * 100, "% and ",
               round(ci_hdi_prod_a$CI_high,2) * 100, "% \n    And centered near:",
               round(prod_a_points[[1]],2) * 100, "%\n\n", sep="")

  # 89% Highest Density Interval
  prod_b_points <- point_estimate(posterior$prod_b)
  ci_hdi_prod_b <- ci(posterior$prod_b, method = "HDI")
  out <- paste(out, " The previous product B foreacst combined with ", base_configs["b_n_events"] ," events and ",  event_counts["product_b_errors"]," errors is between:",
               round(ci_hdi_prod_b$CI_low,2) * 100, "% and ",
               round(ci_hdi_prod_b$CI_high,2) * 100, "% \n    And centered near:",
               round(prod_b_points[[1]],2) * 100, "%\n\n", sep="")
  out <- paste(out, "See the markdown report or plot tab graphs in rstudio to see forecasts combined with data.\n")
  cat(out)
}

CatPriorBeliefRates <- function(event_priors){

  #Pring Beta Curve
  #print(MakeCurves(event_priors[["median_errors"]], event_priors[["edge_errors"]],color="Purple", CI=TRUE, CIT=TRUE))
  xval = "Event Probability"
  yval = "Event Strength"
  tval = "Credible Beliefs About Events"
  sval = "With 95% Credible Belief Interval"
  beta_vals <- GetBeliefsEvents(event_priors[["median_errors"]], event_priors[["edge_errors"]] )

  print(ggplotBeta(beta_vals[1],  beta_vals[2],  xlab = xval, ylab = yval, tlab = tval, sval = sval))

  #Get Beta Prio CIs
  priorCI <- GetBetaCI(event_priors["median_errors"], event_priors["edge_errors"])

  cat("------Prior Beliefs About Error Rates--------\n")
  cat(paste(" You forecasted an ERROR RATE equally ABOVE/BELOW ", round(event_priors[["median_errors"]]*100), "%.\n", sep=""))
  cat(paste(" You're also 90% sure the error rate is LESS THAN ", round(event_priors[["edge_errors"]]*100), "%.\n", sep=""))
  cat(paste(" Given that forecast and the model we are using you you believe the true rate likely sits between ", priorCI[1],"% and ", priorCI[3], "%\n\n", sep=""))

}

GetTotalCosts <- function(impacts, costs, time_multiple, event_counts, prior_events, base_configs){

  product_a_counts <- event_counts[[1]]
  product_b_counts <- event_counts[[2]]

  # Get Gamma Prior for product a which will get fed into poisson dist in GetEventFrequency()
  a_number_events <- as.numeric(base_configs$a_n_events)
  a_gamma_median_prior <- prior_events[[1]] * a_number_events
  a_gamma_sd <- prior_events[[2]] * a_number_events

  # Get Gamma Prior for product b which will get fed into poisson dist in GetEventFrequency()
  b_number_events <- as.numeric(base_configs$b_n_events)
  b_gamma_median_prior <- prior_events[[1]] * b_number_events
  b_gamma_sd <- prior_events[[2]] * b_number_events

  prod_a_event_results <- GetEventFrequency(product_a_counts, a_gamma_median_prior, a_gamma_sd)
  prod_a_mean_events <- prod_a_event_results$mean
  prod_a_event_totals <- prod_a_mean_events * as.numeric(time_multiple)

  prod_b_event_results <- GetEventFrequency(product_b_counts, b_gamma_median_prior, b_gamma_sd)
  prod_b_mean_events <- prod_b_event_results$mean
  prod_b_event_totals <- prod_b_mean_events * as.numeric(time_multiple)

  prod_a_total_cost <- prod_a_event_totals * impacts$prod_a_impact + costs$prod_a_cost + costs$prod_a_run
  prod_b_total_cost <- prod_b_event_totals * impacts$prod_b_impact + costs$prod_b_cost + costs$prod_b_run

  return(c(prod_a_total_cost, prod_b_total_cost) )
}

GetEventFrequency <- function(event_counts, avg_event_count, event_variation, graph = FALSE){
  #rate vals can be a vector of results, perhaps events per day, week , month etc
  # mode and sd will be converted into rate and scale.
  event_count_priors <- GetRatePriors(avg_event_count, event_variation, graph)

  result <- poisgamp(event_counts, event_count_priors[1], event_count_priors[2], plot = FALSE, suppressOutput = TRUE)

  return(result)
}

GetRatePriors <- function(mode, sd, graph = FALSE){
  # This function allows you to think in terms of mode and SD.
  # Note that everything is positive..so while the mode might be a small number that is close to 0
  # The SD can be large skewing far to the right, without going past zero on the left.

  # Here are the corresponding rate and shape parameter values:
  ra = ( mode + sqrt( mode^2 + 4*sd^2 ) ) / ( 2 * sd^2 )
  sh = 1 + mode * ra

  if(graph){
    x = seq(0,mode+5*sd,len=1001)
    plot( x , dgamma( x , shape=sh , rate=ra ) , type="l" ,
          main=paste("dgamma, mode=",mode,", sd=",sd,sep="") ,
          ylab=paste("dgamma( shape=",signif(sh,3)," , rate=",signif(ra,3)," )",
                     sep="") )
    abline( v=mode , lty="dotted" )
  }

  return(c(sh, ra))
}

GetEngCost <- function(size, eng_cost_pert, eng_hour_priors){

  mode <- as.numeric(eng_cost_pert[["mode"]])
  low  <- as.numeric(eng_cost_pert[["low"]])
  high <- as.numeric(eng_cost_pert[["high"]])

  mode_hrs <- as.numeric(eng_hour_priors[["mode"]])
  low_hrs <- as.numeric(eng_hour_priors[["low"]])
  high_hrs <- as.numeric(eng_hour_priors[["high"]])

  eng_rate <- rpert(size, min = low, mode = mode, max = high, shape = 4)
  eng_hours <- rpert(size, min = low_hrs, mode = mode_hrs, max = high_hrs, shape = 4)
  eng_cost <- eng_hours * eng_rate

  return(eng_cost)
}


GetErrors <- function(n_draws, n_events, proportion_events){
  n_errors <- rbinom(n = n_draws, size = n_events, prob = proportion_events)
  return(n_errors)
}

GetPropotionOfEvents <- function(n_draws,prior_events){
  error_beliefs <- GetBeliefsEvents(prior_events[["median_errors"]], prior_events[["edge_errors"]])
  #Make an n_draws length list of error beliefs
  proportion_events <- rbeta(as.numeric(n_draws), as.numeric(error_beliefs[[1]]) , as.numeric(error_beliefs[[2]]) )
  return(proportion_events)
}

GetBeliefsHits <- function(hits, max_hits, events ){

  alpha_ratio <-  as.numeric(hits) / as.numeric(events)
  beta_ratio  <- as.numeric(max_hits) / as.numeric(events)
  q1 = list(p=.5, x= alpha_ratio)
  q2 = list(p=.9, x= beta_ratio)
  return( beta.select(q1, q2) )
}

GetBeliefsEvents <- function(belief, max_belief ){
  alpha_ratio <- as.numeric(belief)
  beta_ratio  <- as.numeric(max_belief)
  q1 = list(p=.5, x= alpha_ratio)
  q2 = list(p=.9, x= beta_ratio)
  return( beta.select(q1, q2) )
}

SimulateHits <- function(freq_hits, num_events){
  freq_miss <- 1 - freq_hits
  res <- sample(c("hit", "miss"), size=num_events, replace=TRUE,
                prob=c(freq_hits, freq_miss))
  hit <- sum(res == "hit")
  miss <- sum(res ==  "miss")
  return( c(hit,miss))
}

GetBetaCI <- function(hit,edge){
  betas <- GetBeliefsEvents(hit, edge)
  intervals <- qbeta(c(.025, .5, .975), betas[1], betas[2])
  intervals <- round(intervals * 100)
  return(intervals)
}

MakeCurves <- function(alpha = 0, beta = 0 , color = "black", ytop = 20, CI = FALSE, CIT = FALSE, add_val = FALSE){
  alpha <- alpha + 1
  beta <- beta + 1
  curve(dbeta(x, alpha , beta ), from=0, to=1,
        xlab="SME Forecast For Proportion of Events", ylab="Density",
        col= color, lwd=4, las=1, add = add_val , ylim = c(0,ytop))

  if( CI ){
    intervals <- qbeta(c(.025, .975), alpha, beta)
    abline(v = intervals[1], lty=2)
    abline(v = intervals[2], lty=2)
    if( CIT ){
      adj <- 0
      if( intervals[2] - intervals[1] < 0.11 ){
        adj <- 1
      }
      text(intervals[1] + 0.03,ytop, round(intervals[1], 2))
      text(intervals[2] + 0.03 ,ytop - adj, round(intervals[2], 2))
    }
  }
}

ggplotBeta <- function(alpha, beta, xlab = xval, ylab = yval, tlab = "Beliefs", sval = "Test"){

  x = seq(0, 1, 0.01)
  y = dbeta(x, shape1=alpha, shape2=beta)
  density = data.frame(x=x, y=y)
  mean=alpha/(alpha+beta)
  mode=(alpha-1)/(alpha+beta-2)
  intervals <- qbeta(c(.025, .975), alpha, beta)

  res <- ggplot(density, aes(x, y)) +
    geom_area(fill= "darkslateblue", alpha = 0.6) +
    theme_minimal()+
    theme(axis.text.x = element_text(angle = 90, hjust = 1),
          axis.text.y = element_blank()) +
    geom_vline(xintercept = intervals[1]) +
    geom_vline(xintercept = intervals[2])+
    labs(x = xlab, y = ylab, title = tlab, subtitle = sval)

  return(res)
}

GetSize <- function(size, first, second){
  res <- ifelse(length(first) >= length(second), length(second), length(first) )

  if( size > res){
    return(res)
  }
  return(size)
}

GetEngHours <- function(size, med = 10, ninety = 60, total_hours = 100){

  beta_vals <- GetBeliefsHits(med, ninety, total_hours)
  prop_hours <- rbeta(size, beta_vals[1], beta_vals[2])
  hours <- rbinom(size, total_hours, prop_hours)
  return(hours)
}

#' ##Wrapper Function
#' This just makes it easy for users to call one function without much fuss.
runProductCostComparison <- function(base_configs, event_priors, event_counts, eng_hour_range, eng_cost_range, product_costs){

  # A Quick Check for missed false negatives between products and other sundry.
  update_count <- ErrorCheck(base_configs, event_counts)
  if( update_count[2] > 0){
      event_counts[update_count[[1]]] <- as.numeric(update_count[[2]])
  }

  #' ##Prior Beliefs About Error Rates
  #' In the *event_priors* config element above you specified what you think is the likely error rate given N events. This graph shows you"
  #' the spread of possible rates given our beliefs. It also gives some indication of where the most likely rates may cluster."
  CatPriorBeliefRates(event_priors)

  #Number of simulations (draws), tests (events), and default array index sizes
  n_draws <- as.numeric(base_configs[["n_draws"]])
  a_n_events <- as.numeric(base_configs[["a_n_events"]])
  b_n_events <- as.numeric(base_configs[["b_n_events"]])
  size <- 10000

  #Create a list of simmulated events and error RATES based on priors
  proportion_events <- GetPropotionOfEvents(n_draws, event_priors)
  a_n_errors <- GetErrors(n_draws, a_n_events, proportion_events)
  b_n_errors <- GetErrors(n_draws, b_n_events, proportion_events)

  #Put simulations into dataframe
  a_prior <- data.frame(proportion_events, a_n_errors)
  b_prior <- data.frame(proportion_events, b_n_errors)

  # Select the simulationed prior data that correlates to the real product errors
  product_a <- a_prior[a_prior$a_n_errors == event_counts[["product_a_errors"]], ]
  product_b <- b_prior[b_prior$b_n_errors == event_counts[["product_b_errors"]], ]

  #Sample to create indices for posterior and load into data frame
  sample_a_indices <- sample( nrow(product_a), size = size, replace = TRUE, prob = product_a$proportion_events)
  sample_b_indices <- sample( nrow(product_b), size = size, replace = TRUE, prob = product_b$proportion_events)
  posterior <- data.frame(
    prod_a = product_a$proportion_events[sample_a_indices],
    prod_b  = product_b$proportion_events[sample_b_indices])

  #' Updated beliefs given real error rates
  CatUpdatedBeliefRates(posterior, event_counts, base_configs)

  #' Create and Print out Error Rate Graphs
  GraphErrorRates(posterior)

  #' Calculate the rate differences, graph and print out
  posterior$prop_diff_events <- posterior$prod_a - posterior$prod_b
  GraphErrorDiff(posterior)
  CatErrorDiffProp(posterior)

  #This gets our impact per event with lots of uncertainty packed in
  eng_cost <- GetEngCost(size, eng_cost_range, eng_hour_range)

  # Add forecasted expected cost column to posterior for product a and b and store in list
  posterior$prod_a_impact <- posterior$prod_a * eng_cost
  posterior$prod_b_impact <- posterior$prod_b * eng_cost

  #Load Median values into list of use later
  impacts <- list()
  impacts["prod_a_impact"] <- median(posterior$prod_a_impact)
  impacts["prod_b_impact"] <- median(posterior$prod_b_impact)

  # Add the column that holds the difference between the two costs
  posterior$prod_diff_impact <- posterior$prod_a_impact - posterior$prod_b_impact
  print("Here is a glimpse into some of the data.")
  print(head(posterior,10))
  total_costs <- GetTotalCosts(impacts, product_costs, base_configs["time_multiple"], event_counts, event_priors, base_configs)

  #' #Total Costs Differences
  impact_res <- CatTotalCost(total_costs, product_costs)

  #' Graph Impact Difference
  GraphErrorImpact(posterior, impact_res)
}

#' ## Run Program, Get Graphs, Data Output and The Code
#' This will spit out analytics and some graphs.
runProductCostComparison(base_configs,event_priors, event_counts, eng_hour_range, eng_cost_range, product_costs)