Skip to content

Instantly share code, notes, and snippets.

@bvancil
Last active July 10, 2021 16:28
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save bvancil/9468d2620821dfe4002f850494042e2f to your computer and use it in GitHub Desktop.
Save bvancil/9468d2620821dfe4002f850494042e2f to your computer and use it in GitHub Desktop.
a-telescoping discrete distribution
#' Generate random numbers from an \eqn{a}-telescoping
#' discrete distribution
#'
#' Let \eqn{a(n)} be a sequence (a function defined on the
#' natural numbers) that is non-decreasing and such that
#' \eqn{a(0)=1}. Then the a-telescoping discrete
#' distribution has a probability mass function (PMF) given
#' by
#'
#' \deqn{p(n) = \frac{1}{a(n)} - \frac{1}{a(n+1)}}
#'
#' Samples are generated from the complementary cumulative
#' distribution, since, if \eqn{X} is \eqn{a}-telescoping,
#'
#' \deqn{P(X > m) = \frac{1}{a(m + 1)},}
#'
#' Therefore, if we draw \eqn{p} from
#' \eqn{\text{Unif}(0,1)}, then we search for an \eqn{n}
#' such that
#'
#' \deqn{\frac{1}{a(n+1)} < p \le \frac{1}{a(n)}.}
#'
#' The user is expected to check that `a` and `a_inv` are
#' inverse functions and appropriately normalized.
#'
#' @param n integer, number of observations (default: 1L)
#' @param a function of a single natural number (default:
#' `NULL` indicates that we should use
#' \code{\link[base]{identity}} so that \eqn{a(n) = n})
#' @param a_inv inverse function to `a`; (default: `NULL`,
#' indicates that we should use the non-decreasing
#' property to search for the inverse. Warning: This is
#' slow. For the special case that `a` is identical to the
#' default, we cheat and provide our own inverse.)
#' @param memoise logical, whether to memoise `a` (default:
#' `FALSE`) using the package
#'
#' This source code gist is licensed according to
#' <https://www.gnu.org/licenses/gpl-3.0.html>.
#'
#' @return \eqn{n} natural numbers sampled from the
#' \eqn{a}-telescoping discrete distribution
#' @export
#'
#' @examples
#' if (interactive()) {
#' graphics::hist(base::log(rtele(1e5L)))
#' utils::head(base::proportions(base::table(rtele(1e5L))))
#' pow <- function(exponent) {
#' force(exponent)
#' function(b) {
#' b^exponent
#' }
#' }
#' rtele(100L, a = pow(1/5), a_inv = pow(5))
#' # Once the numbers get large enough, they become doubles
#' rtele(100L, a = pow(1/10), a_inv = pow(10))
#' }
rtele <- function(n = 1L, a = NULL, a_inv = NULL, memoise = FALSE) {
# Handle default `a`
if (base::is.null(a)) {
if (!base::is.null(a_inv)) {
base::stop("You have provided `a_inv` without `a`.")
}
a <- base::identity
a_inv <- base::identity
}
if (base::isTRUE(memoise)) {
if (base::require("memoise")) {
a_old <- a
a <- memoise::memoise(a_old)
}
}
# Handle default `a_inv`
if (base::is.null(a_inv)) {
# This is the slow way. Consider providing `a_inv`.
# This is a left or right inverse. I forget which.
a_inv <- function(xs) {
base::stopifnot(base::all(1L <= xs))
(base::Vectorize(function(x) {
n <- 0L
while (a(n <- n + 1L) < x) next
return(n)
}))(xs)
}
}
ps <- stats::runif(n = n)
xs <- 1 / ps
samples <- base::floor(a_inv(xs))
# If we overflow, just return the floats
samples <- base::tryCatch(base::as.integer(samples), warning = function(w) samples)
return(samples)
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment