@darius
Created March 28, 2023 13:48
withal@gmail.com
I have questions about some Common Lisp code. First, I'm unfamiliar with defsubst. Is it a standard macro?
Actually, no: defsubst is not part of the ANSI Common Lisp standard (it is a standard macro in Emacs Lisp). In Common Lisp it is conventionally provided by a utility library — serapeum, for example, exports one — as a shorthand for defining a function and proclaiming it inline in one step.
When a function is proclaimed inline, the compiler is advised to expand the function's body at each call site, eliminating the function-call overhead. This can improve performance, but it may also increase the size of the compiled code.
Keep in mind that an inline proclamation is only a suggestion: the standard allows a compiler to ignore it, and the decision ultimately depends on the compiler's optimization settings and heuristics.
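A minimal sketch of how such a macro is typically defined (serapeum's version is essentially this; the definition used by this codebase may differ):
lisp
(defmacro defsubst (name lambda-list &body body)
  "Define NAME as a function and proclaim it inline."
  `(progn
     (declaim (inline ,name))
     (defun ,name ,lambda-list
       ,@body)))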
withal@gmail.com
Thanks. Now some of the function defs are preceded by, apparently a type declaration. I haven't seen this form of that in CL code before:
(-> log-likelihood (double-float double-float double-float) double-float)
(defun log-likelihood (μ σ datum)
  "Return the log-likelihood of DATUM for given model parameters."
  (values (if (plusp σ)
              (gaussian-log-likelihood μ σ datum)
              sb-ext:double-float-negative-infinity)))
The code you're seeing appears to be using the -> macro to declare the type of the function's arguments and return value. However, this is not a standard Common Lisp macro. It might be a custom macro defined in the codebase you're working with, or it could be part of a library that is being used.
The -> macro in this example is specifying that the function log-likelihood takes three arguments of type double-float and returns a value of type double-float.
If you're working with a codebase that uses this macro, you'll need to locate its definition, or the library it comes from, to see exactly what it expands into.
In standard Common Lisp, you would declare the argument types with a declare form inside the function body, like this:
lisp
(defun log-likelihood (μ σ datum)
  "Return the log-likelihood of DATUM for given model parameters."
  (declare (type double-float μ σ datum))
  (if (plusp σ)
      (gaussian-log-likelihood μ σ datum)
      sb-ext:double-float-negative-infinity))
This declares the arguments μ, σ, and datum to be of type double-float. Declaring the return type the same way is less portable: (declare (values double-float)) is a CMUCL/SBCL extension, not part of the standard.
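The portable way to declare the whole signature, return type included, is a declaim with an ftype declaration placed before the defun — plausibly what the -> macro expands into, though that's an assumption to check against gt's source:
lisp
(declaim (ftype (function (double-float double-float double-float)
                          (values double-float &optional))
                log-likelihood))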
withal@gmail.com
That's what I thought. The first two lines of the source file are the following. Could the (:use ...) part be related to importing the -> macro? `(defpackage #:permo (:use #:gt))
(in-package #:permo)`
Yes, it's possible that the (:use ...) part in the defpackage form is related to importing the -> macro. The defpackage form is used to define a new package, and the :use option is for specifying a list of packages whose external symbols should be imported into the new package.
In your example, the package being defined is named permo, and it's importing the symbols from the gt package:
lisp
(defpackage #:permo (:use #:gt))
If the -> macro is defined in the gt package and has been exported, it would be accessible in the permo package.
After defining the package, the code switches to the permo package using the in-package form:
lisp
(in-package #:permo)
This sets the current package to permo, making the symbols imported from the gt package, including the -> macro, available in the code that follows.
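If you want to confirm this at the REPL once the system is loaded, you can ask which package the symbol actually lives in:
lisp
;; Where does -> come from? (REPL check; run after loading the system.)
(symbol-package (find-symbol "->" '#:permo))
;; ⇒ #<PACKAGE "GT"> if the symbol was inherited from gt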
withal@gmail.com
This package apparently does sequential monte carlo estimation. Can you give me an overview of that?
Sequential Monte Carlo (SMC) estimation, also known as Particle Filtering, is a family of algorithms for estimating the state of a dynamic system or time series, especially non-linear, non-Gaussian systems. SMC methods are particularly useful when the system's state evolves over time under process noise and can only be observed through noisy measurements.
The main idea behind SMC is to represent the system's state probability distribution using a set of particles (or samples) and update these particles over time as new observations become available. The algorithm consists of three main steps, which are repeated at each time step:
Prediction: In this step, the particles are propagated forward in time using the known dynamic model of the system. This step usually involves adding process noise to account for the uncertainty in the system dynamics.
Update: When a new observation is available, the particles are updated by computing their weights based on the likelihood of the observation given the particle's state. This step involves evaluating the observation model for each particle, which describes the probability of observing the data given the current state.
Resampling: After updating the particle weights, some particles may have very low weights, and others may have very high weights. To avoid degeneracy, where only a few particles dominate the representation of the state distribution, a resampling step is performed. This step involves generating a new set of particles by drawing from the current set with replacement, with the probability of selecting a particle being proportional to its weight. This results in a new set of equally weighted particles that better represent the updated state distribution.
The main advantage of SMC methods is their ability to handle complex, non-linear, and non-Gaussian systems, as well as their ability to provide an estimate of the system state along with a measure of the associated uncertainty. However, SMC methods can be computationally expensive, especially for high-dimensional systems, due to the need to maintain and update a large number of particles.
Some common applications of SMC include target tracking, robot localization and mapping, and Bayesian inference in dynamic systems.
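As a shape to keep in mind, here is a minimal sketch of that loop in Common Lisp (the callback names here are hypothetical placeholders, not from this codebase):
lisp
(defun smc-sketch (predict! weight! resample! more?)
  "Illustrative SMC loop: predict, weight, resample, repeat."
  (loop while (funcall more?)
        do (funcall predict!)    ; 1. propagate particles with process noise
           (funcall weight!)     ; 2. weight by likelihood of the new observation
           (funcall resample!))) ; 3. redraw particles in proportion to weight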
withal@gmail.com
When do people use the different terms SMC and particle filtering? Do the different terms ever imply technical differences?
Sequential Monte Carlo (SMC) and Particle Filtering are often used interchangeably in the literature, and both terms refer to the same family of algorithms. However, there are subtle differences in the contexts in which these terms are typically used:
Particle Filtering: This term is more commonly used when discussing the application of SMC methods for state estimation in dynamic systems or time series data. The focus here is on the filtering aspect, which involves estimating the current state of a system given the history of observations up to the present time. In other words, Particle Filtering is often used when discussing the specific application of SMC algorithms to tracking, localization, or other state estimation problems.
Sequential Monte Carlo: This term is more general and is used to describe the broader class of algorithms that use Monte Carlo methods for estimating the state of a system or solving other problems in a sequential manner. This includes not only the filtering problem but also other problems like smoothing (estimating the state of a system at a previous time) or prediction (estimating the state of a system at a future time). The term SMC is also used in the context of more general Bayesian inference problems, where the goal is to estimate the posterior distribution of parameters or latent variables given observed data.
While both terms refer to the same family of algorithms, the choice of term may imply a difference in focus or application. In practice, however, many researchers and practitioners use these terms interchangeably, and the specific technical details of the algorithms are more important than the choice of terminology.
withal@gmail.com
Great, that helps. This code seems to be meant for Bayesian inference. Could you try adding some helpful comments for someone who has only a vague idea of SMC (but a decent background in probability and algorithms) to these function definitions? Please leave out the body of the code -- just the def lines are enough to identify what these comments would be referring to. Code follows:
;;;; Types
;; Short-hand types.
;; Leaky abstractions: no secret what the concrete representations are.
(deftype R (&rest args) "Real number." `(double-float ,@args))
(deftype P () "Probability." '(R 0d0 1d0))
(deftype L () "Log-likelihood." '(R * #.(log 1)))
;;;; Models written by hand
(defun smc/gaussian (&optional
                       (n-particles 1000)
                       (observations (loop repeat 1000 collect (+ 1d0 (* 2d0 (gaussian-random))))
                                     #+nil (loop repeat 1000 collect 100d0)))
  (local
    ;; Particles are represented as indices into these arrays.
    (def μ (make-array (list n-particles) :element-type 'double-float))
    (def σ (make-array (list n-particles) :element-type 'double-float))

    (loop for i below n-particles
          do (setf (aref μ i) (- 10d0 (random 20d0))
                   (aref σ i) (random 10d0)))

    (-> log-likelihood (double-float double-float double-float) double-float)
    (defun log-likelihood (μ σ datum)
      "Return the log-likelihood of DATUM for given model parameters."
      (values (if (plusp σ)
                  (gaussian-log-likelihood μ σ datum)
                  sb-ext:double-float-negative-infinity)))

    (defun particle-log-likelihood (particle datum)
      "Return the log-likelihood of DATUM given the parameters of PARTICLE."
      (log-likelihood (aref μ particle) (aref σ particle) datum))

    (defun respawn! (parent-indices)
      "Re-initialize the particles by copying values from PARENT-INDICES.
(PARENT-INDICES were chosen by resampling.)"
      (loop with μ₀ = (copy-array μ)
            with σ₀ = (copy-array σ)
            for i from 0
            for p in parent-indices
            do (setf (aref μ i) (aref μ₀ p)
                     (aref σ i) (aref σ₀ p))))

    (defun jitter! (metropolis-accept?)
      "Jitter (rejuvenate) the parameters of all particles using one or more metropolis steps.
Function (METROPOLIS-ACCEPT? OLD PROPOSED) uses the supplied log-likelihood functions
to accept or reject the proposed state."
      ;; Try a series of moves of different relative sizes.
      (loop for proposal-stddev in '(0.01d0 0.10d0 0.25d0) do
        ;; Add gaussian noise to both parameters. Scaled to a fraction of current value.
        (loop for i below n-particles
              for μ.i = (aref μ i)
              for σ.i = (aref σ i)
              for μ.p = (+ μ.i (* μ.i proposal-stddev (gaussian-random)))
              for σ.p = (+ σ.i (* σ.i proposal-stddev (gaussian-random)))
              for old-log-likelihood = (partial #'log-likelihood μ.i σ.i)
              for new-log-likelihood = (partial #'log-likelihood μ.p σ.p)
              when (funcall metropolis-accept? old-log-likelihood new-log-likelihood)
                do (setf (aref μ i) μ.p
                         (aref σ i) σ.p))))

    (values μ σ (smc/likelihood-tempering n-particles observations
                                          :log-likelihood #'particle-log-likelihood
                                          :respawn! #'respawn!
                                          :jitter! #'jitter!))))
;;;; Sequential Monte Carlo (Particle Filter) sampler
(defun smc (&key log-mean-likelihood resample! rejuvenate! step! weight!)
  "Run a callback-driven Sequential Monte Carlo particle filter simulation.
Return the log marginal likelihood estimate.

Callbacks:
 (WEIGHT!)
   Calculate and associate weights with particles.
 (LOG-MEAN-LIKELIHOOD) ⇒ double-float
   Return the log mean likelihood for all particles.
 (RESAMPLE!)
   Resample weighted particles into equally-weighted replacements.
 (REJUVENATE!)
   Jitter particles without changing their distribution.
 (STEP!) ⇒ boolean
   Advance the simulation. Return true on success, false on a completed simulation."
  (loop do (funcall weight!)
        sum (funcall log-mean-likelihood)
        while (funcall step!)
        do (funcall resample!)
           (funcall rejuvenate!)))
(defsubst smc/likelihood-tempering (n-particles observations
                                    &key
                                      log-likelihood respawn! jitter!
                                      (initial-temp 0.01d0)
                                      (temp-step 0.01d0))
  "Run a callback-driven Sequential Monte Carlo simulation using likelihood tempering.

Callbacks:
 (PARTICLE-LOG-LIKELIHOOD particle datum) ⇒ log-likelihood
   Return log-likelihood of DATUM given the parameters of PARTICLE.
 (RESPAWN! parent-indices)
   Replace particles by overwriting them with the resampled PARENT-INDICES.
 (JITTER! metropolis-accept?)
   Rejuvenate particles by proposing moves to the function
   (METROPOLIS-ACCEPT? OLD-LOG-LIKELIHOOD-FN NEW-LOG-LIKELIHOOD-FN)
   which compares the likelihood ratios and returns true if the move is accepted."
  (local
    (def temp initial-temp)
    (def log-weights (make-array (list n-particles) :element-type 'double-float))

    (defun tempered-log-likelihood (log-likelihood-fn)
      "Return the log-likelihood tempered with the current temperature."
      (handler-case
          (loop for o in observations
                summing (funcall log-likelihood-fn o) into ll
                finally (return (logexpt ll temp)))
        #+nil (floating-point-overflow () sb-ext:double-float-negative-infinity)))

    (defun log-mean-likelihood ()
      "Return the log mean likelihood of all particles."
      (log/ (logsumexp log-weights) (log n-particles)))

    (defun weight! ()
      "Calculate and set the weight of each particle."
      (loop for i below n-particles
            do (setf (aref log-weights i)
                     (tempered-log-likelihood (partial log-likelihood i))))
      (format t "~&w = ~a~%" log-weights))

    (defun resample! ()
      "Replace old particles by resampling them into new ones."
      (funcall respawn! (systematic-resample log-weights)))

    (defun step! ()
      "Advance the tempering schedule (unless finished.)"
      (and (< temp 1d0)
           (setf temp (min 1d0 (+ temp temp-step)))))

    (defun metropolis-accept? (old-log-likelihood new-log-likelihood)
      (< (log (random 1d0)) (log/ (tempered-log-likelihood new-log-likelihood)
                                  (tempered-log-likelihood old-log-likelihood))))

    (defun rejuvenate! ()
      (funcall jitter! #'metropolis-accept?))

    (smc :log-mean-likelihood #'log-mean-likelihood
         :resample! #'resample!
         :rejuvenate! #'rejuvenate!
         :step! #'step!
         :weight! #'weight!)))
;; Helpers for arithmetic in the log domain.
;; Used sparingly when it helps to clarify intent.
(defsubst log/ (&rest numbers) (apply #'- numbers))
;;(defun log* (&rest numbers) (apply #'+ numbers))
(defsubst logexpt (a b)
  "Raise A (a log-domain number) to the power B (which is not log!)"
  (* a b))
;;;; Systematic resampling
(-> systematic-resample (vector) list)
(defun systematic-resample (log-weights)
  (systematic-resample/normalized (loop with normalizer = (logsumexp log-weights)
                                        for lw across log-weights
                                        collect (exp (- lw normalizer)))))

(-> systematic-resample/normalized (list) list)
(defun systematic-resample/normalized (weights)
  (loop with n = (length weights)
        with cdf = (coerce (loop for weight in weights
                                 sum weight into cumulative
                                 collect cumulative)
                           'vector)
        with index = 0
        repeat n
        ;; Pick a starting offset into the CDF
        for u = (random (/ 1d0 n)) then (+ u (/ 1d0 n))
        ;; Step forward through the CDF
        do (loop while (and (> u (aref cdf (min index (1- n))))) do (incf index))
           (minf index (1- n))
        collect (min index (1- n))))
;;;; Probability distributions and utilities
(defun gaussian-log-likelihood (μ σ x)
  "Return the likelihood of Normal(μ,σ) at X."
  (if (plusp σ)
      (let ((z (/ (- x μ) σ)))
        (- 0
           (log σ)
           (/ (+ (* z z) (log (* pi 2))) 2)))
      most-negative-double-float))

(-> logsumexp (R[]) R)
(defun logsumexp (vec)
  ;; utility for numerically stable addition of log quantities e.g. for normalization
  ;; see e.g. https://gregorygundersen.com/blog/2020/02/09/log-sum-exp/
  (loop with max = (reduce #'max vec)
        for x across vec
        summing (if (sb-ext:float-infinity-p x) 0d0 (exp (- x max))) into acc
        finally (return (+ max (log acc)))))
Here are comments for the functions you provided. These comments are intended to give a brief explanation of the purpose of each function for someone with a general understanding of SMC, probability, and algorithms:
lisp
;;;; Types
;; ...
;;;; Models written by hand
;; Main entry point for a Gaussian-model SMC simulation.
;; Returns the particle arrays μ and σ and the log marginal likelihood
;; estimate for the given observations.
(defun smc/gaussian ...)
;; Calculate the log-likelihood of a datum given model parameters μ and σ.
(defun log-likelihood ...)
;; Calculate the log-likelihood of a datum given the parameters of a particle.
(defun particle-log-likelihood ...)
;; Re-initialize the particles by copying values from the given parent indices.
(defun respawn! ...)
;; Jitter (rejuvenate) the parameters of all particles using Metropolis steps.
(defun jitter! ...)
;;;; Sequential Monte Carlo (Particle Filter) sampler
;; Run a callback-driven SMC particle filter simulation, return log marginal likelihood estimate.
(defun smc ...)
;; Run a callback-driven SMC simulation using likelihood tempering.
(defun smc/likelihood-tempering ...)
;; Helper functions for arithmetic in the log domain
;; ...
;;;; Systematic resampling
;; Perform systematic resampling on a vector of log-weights.
(defun systematic-resample ...)
;; Perform systematic resampling on a list of normalized weights.
(defun systematic-resample/normalized ...)
;;;; Probability distributions and utilities
;; Return the log-likelihood of a Gaussian distribution with given μ, σ at point x.
(defun gaussian-log-likelihood ...)
;; Return log(Σᵢ exp(xᵢ)) for a vector of log-domain numbers, computed in a numerically stable way.
(defun logsumexp ...)
These comments should help in understanding the general purpose of each function in the provided code.
withal@gmail.com
What's systematic about the resampling?
Systematic resampling is a method used in particle filtering and Sequential Monte Carlo (SMC) to resample a set of particles (also known as "samples" or "points") based on their importance weights. The "systematic" aspect refers to the structured way the resampling procedure is carried out.
In systematic resampling, the following steps are taken:
Divide the unit interval [0, 1) into N equal segments, where N is the number of particles.
Choose a random starting point, U, uniformly from the first segment (i.e., between 0 and 1/N).
Compute the cumulative distribution function (CDF) of the importance weights.
Form the N equally spaced sampling points U, U + 1/N, U + 2/N, ..., U + (N-1)/N. For each point, find the first index whose CDF value reaches it; the particle at that index is selected for resampling.
The systematic resampling method ensures a more even coverage of the particles with high importance weights. It helps maintain diversity in the resampled particles and reduces the risk of sample impoverishment compared to other methods, such as multinomial resampling.
In the provided code, the systematic-resample function performs systematic resampling on a vector of log-weights, while the systematic-resample/normalized function performs systematic resampling on a list of normalized weights.
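A tiny worked example using the gist's own function (the weights must already be normalized and passed as a list):
lisp
;; N = 4 particles with normalized weights 0.1, 0.2, 0.3, 0.4:
;; cdf = #(0.1 0.3 0.6 1.0). If the random start happens to be u = 0.15,
;; the points are 0.15, 0.40, 0.65, 0.90 → indices (1 2 3 3).
(systematic-resample/normalized '(0.1d0 0.2d0 0.3d0 0.4d0))
;; ⇒ e.g. (1 2 3 3) — particle 3, the heaviest, is drawn twice.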
withal@gmail.com
Let's see if I understand. Critique this claim: `systematic-resample` takes a list of weights (in the log domain). Each weight is associated with a particle (i.e. weight #i goes with particle #i). This function generates a new list of particle indices, of the same length, but picking from the original particles at random with replacement, each with a chance proportional to its weight.
Your claim is almost correct, but there is a slight misunderstanding. The systematic-resample function takes a vector of log-weights (not a list) as an input. Other than that, your claim is accurate. The function generates a new list of particle indices of the same length, picking from the original particles at random with replacement, with each particle's chance of being selected proportional to its weight. The systematic resampling method ensures a more even coverage of particles with high importance weights and helps maintain diversity in the resampled particles.
withal@gmail.com
Oh yes, thanks. Spelled out in a similar style, what's the purpose of the likelihood tempering function? I guess it's a kind of simulated annealing.
Yes, you're correct. The likelihood tempering function is a variant of Sequential Monte Carlo (SMC) that incorporates a simulated annealing-like technique to improve the performance of the algorithm, particularly in high-dimensional and complex problems.
The purpose of the likelihood tempering function is to perform SMC sampling while gradually increasing the influence of the observed data on the particles. This is done by "tempering" the likelihood function using a temperature parameter. The temperature parameter starts at a small value (close to 0) and is incrementally increased towards 1 during the SMC process. When the temperature is close to 0, the particles explore the prior distribution, and as the temperature approaches 1, the particles are increasingly influenced by the observed data.
The likelihood tempering function aims to achieve the following:
Facilitate the exploration of the particles in the parameter space by starting with a low influence of the observed data and increasing it gradually.
Improve the efficiency and robustness of the SMC algorithm in high-dimensional or complex problems.
Provide better estimates of the posterior distribution and reduce the risk of getting trapped in local optima.
In the provided code, the smc/likelihood-tempering function carries out the likelihood tempering process. It takes several callback functions as arguments, such as log-likelihood, respawn!, and jitter!, which are used to compute the tempered log-likelihood, resample the particles, and rejuvenate the particles, respectively. The function also accepts initial-temp and temp-step as optional arguments to control the starting temperature and the increment at each step.
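In informal notation: the tempered target at temperature τ is p_τ(μ, σ | data) ∝ p(data | μ, σ)^τ · p(μ, σ), so tempering in the log domain is just multiplication, log(p^τ) = τ · log p — which is exactly what (logexpt ll temp) computes inside tempered-log-likelihood.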
withal@gmail.com
Do you think the purpose of smc/gaussian is then: you call it with a list of observations of a real random variable (according to your mental model). You get back three values: (1 and 2:) vectors of mu and sigma values for particles that are sampled from a certain distribution; and (3) some kind of 'goodness' of this distribution? Can this be explained further in Bayesian terms?
Yes, you are on the right track. The purpose of the smc/gaussian function is to perform Sequential Monte Carlo (SMC) sampling for a Gaussian model, given a list of observations of a real random variable. The goal is to approximate the posterior distribution of the model parameters (mu and sigma) given the observed data.
When you call the smc/gaussian function with a list of observations, it returns three values:
A vector of mu values for the particles, which are sampled from the approximate posterior distribution of the mean (μ) of the Gaussian model.
A vector of sigma values for the particles, which are sampled from the approximate posterior distribution of the standard deviation (σ) of the Gaussian model.
The estimated log marginal likelihood (also known as the model evidence) of the observations given the Gaussian model.
In Bayesian terms, the function performs approximate inference to update our beliefs about the model parameters (mu and sigma) based on the observed data. The updated beliefs are represented by the particles, which approximate the posterior distribution p(μ, σ | data) of the Gaussian model. The log marginal likelihood is a measure of the goodness of the model in explaining the observed data. It is used for model comparison and can be interpreted as the probability of the data given the model, integrating out the model parameters:
log p(data | model) = log ∫ p(data | μ, σ, model) * p(μ, σ | model) dμ dσ
where p(data | μ, σ, model) is the likelihood, and p(μ, σ | model) is the prior distribution of the parameters.
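In the code, this estimate is the running sum in smc: each tempering step contributes log((1/N) Σᵢ wᵢ) via log-mean-likelihood, and summing those per-step terms composes the ratios of successive normalizing constants into the overall estimate of log p(data | model).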
withal@gmail.com
A couple of questions still. Your p(data | model), to pick a nit, would be a probability density? And did I miss noticing something like a prior distribution in the program?
You are correct, my apologies for the confusion. The term p(data | model) is indeed a probability density function (pdf) and not a probability, since the data are continuous. I appreciate the clarification.
As for the prior distribution, you didn't miss it – the code doesn't explicitly define a prior distribution for the parameters. The prior seems to be implicitly incorporated in the initial state of the particles (mu and sigma values), which are randomly generated at the beginning of the smc/gaussian function:
lisp
(loop for i below n-particles
      do (setf (aref μ i) (- 10d0 (random 20d0))
               (aref σ i) (random 10d0)))
In this case, the implied prior for the mean (μ) is uniform on (-10, 10] and the prior for the standard deviation (σ) is uniform on [0, 10) — note that (random 20d0) yields values in [0, 20), and a particle that drew exactly σ = 0 would receive a log-likelihood of negative infinity. These choices might not be ideal, and in practice one would pick priors based on domain knowledge or the problem context.
withal@gmail.com
Hm, thanks. Going back to the resampling function, I misunderstood it as simpler than it is. (I wish you would correct me more bluntly. My goal is to really understand this code.) First question: my memory of the CL loop macro is spotty. What would the cdf variable computed in this function be if weights is just, say, `#(1d0)`?
Apologies if my previous explanations were unclear, and I appreciate your desire to understand the code deeply. I'll address your questions more directly.
For the input #(1d0), the cdf computed in systematic-resample/normalized would be #(1d0) as well. (One nit: per the (-> systematic-resample/normalized (list) list) declaration the weights parameter is a list, and loop's for ... in clause only iterates lists, so strictly the input would be '(1d0).) Here is the relevant computation:
lisp
(coerce (loop for weight in weights
              sum weight into cumulative
              collect cumulative)
        'vector)
With a single weight there is exactly one iteration: cumulative starts at 0, is increased by 1d0, and that running total is collected. The loop therefore returns the list (1d0), which coerce turns into the vector #(1d0).
The purpose of this loop is to compute the cumulative distribution function (CDF) of the normalized weights. With only one weight, the CDF is the same as the input.
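You can check the end-to-end behavior at the REPL:
lisp
(systematic-resample/normalized '(1d0))  ; ⇒ (0)
;; With one particle holding all the weight, every draw selects index 0.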
withal@gmail.com
OK, that checks. In the same function, what's the difference between min and minf?
minf is not a standard Common Lisp function, but it isn't a typo either: it is a modify macro in the style of incf, conventionally defined with define-modify-macro and provided by utility libraries such as Alexandria (here it presumably arrives via the gt package). (minf place value) assigns place the minimum of its current value and value:
lisp
(minf index (1- n))
;; is equivalent to
(setf index (min index (1- n)))
So the two look-alike forms in the loop do different jobs: (min index (1- n)) merely computes a clamped value (used in the aref and the collect), while (minf index (1- n)) clamps the index variable in place as a side effect, keeping it in bounds for the next iteration.
withal@gmail.com
So this loop has a skeleton like `(loop ... repeat n for u = e1 then e2 do e3 e4 collect e4)`. Please summarize what a loop like that does at runtime.
A loop with the structure (loop ... repeat n for u = e1 then e2 do e3 e4 collect e4) does the following at runtime:
Repeat the body n times in total.
On the first iteration, bind u to the value of e1; on every later iteration, update u to the value of e2.
In each iteration, evaluate the do-forms e3 and e4 in order, purely for their side effects.
Then evaluate e4 once more as the collect expression and accumulate its value into a list.
After the n iterations, return the list of collected values.
In systematic-resample/normalized, this loop generates the list of resampled particle indices: n times, it advances the offset u by 1/n, walks index forward through the CDF, clamps index, and collects the (clamped) index.
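A tiny self-contained illustration of do-forms running for effect while collect gathers values:
lisp
(loop for i from 1 to 3
      do (format t "side effect ~d~%" i)  ; do-forms run for their effects
      collect (* i i))
;; prints three lines, then returns (1 4 9)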
withal@gmail.com
Is there a reason e4 appears twice? What happens if you write it like `(... do e3 collect e4)`?
Looking closely at the actual code, the two occurrences are not quite the same expression: the do-form is (minf index (1- n)) and the collect-form is (min index (1- n)). The do-form clamps the variable index in place, a side effect that keeps it in bounds for later iterations, while the collect-form computes the value to accumulate. Since minf has already clamped index by the time the collect-form runs, collecting plain index would give the same result; the extra min is redundant but harmless.
If you wrote it as (... do e3 collect e4), dropping the in-place clamp, each collected value would still be clamped, but index itself would no longer be. With properly normalized weights the behavior happens to be the same, because every read of index already goes through (min index (1- n)); the minf just makes the in-bounds invariant explicit.
withal@gmail.com
At a high level, what's the idea of how the systematic resampling method differs from multinoulli resampling?
Systematic resampling and multinomial (also called multinoulli) resampling are two methods used in particle filtering for resampling particles based on their weights. Both methods aim to generate a new set of particles that focus on regions of higher importance (higher weights), but they differ in their approach and properties.
Multinomial resampling (Multinoulli):
In multinomial resampling, each particle is independently resampled from the original set with a probability proportional to its weight. This is similar to drawing particles from a discrete probability distribution where the probability mass for each particle is given by its normalized weight. The resampling process can be thought of as spinning a roulette wheel, where each particle occupies a slot proportional to its weight, and the wheel is spun N times to generate the new set of particles.
Systematic resampling:
Systematic resampling is more structured. It draws a single uniform offset from [0, 1/N) and then uses N equally spaced sampling points (the offset plus multiples of 1/N) across [0, 1), selecting for each point the particle whose interval in the cumulative distribution function (CDF) of the normalized weights contains it. This ensures a more uniform coverage of the weight space and can provide a better representation of the particle distribution than multinomial resampling.
The key differences between these methods are:
Multinomial resampling is more stochastic, while systematic resampling has a more deterministic structure due to the equally spaced sampling points.
Systematic resampling can result in a more uniform coverage of the weight space, which may provide a better representation of the particle distribution.
Systematic resampling has a lower variance in the number of offspring particles compared to multinomial resampling, which can help avoid degeneracy issues in particle filtering.
Overall, both methods serve the same purpose of resampling particles based on their weights but differ in their approach and properties. Systematic resampling is often preferred due to its more uniform coverage and lower variance in offspring particles.
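For contrast, here is what multinomial resampling looks like in the same style (a hypothetical helper, not part of the gist — note the N independent uniform draws in place of one stratified sweep):
lisp
(defun multinomial-resample/normalized (weights)
  "Draw (length WEIGHTS) particle indices independently, each ∝ its weight."
  (loop with n = (length weights)
        with cdf = (coerce (loop for w in weights
                                 sum w into cumulative
                                 collect cumulative)
                           'vector)
        repeat n
        for u = (random 1d0)                      ; fresh uniform per draw
        collect (loop for i from 0 below n
                      when (<= u (aref cdf i))
                        return i
                      finally (return (1- n)))))  ; guard against rounding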
withal@gmail.com
Do you think there are any bugs in this implementation of systematic resampling that we're looking over?
The implementation of systematic resampling in the provided code follows the standard procedure:
Normalize the weights and compute the cumulative distribution function (CDF).
Generate N equally spaced sampling points starting from a random offset in [0, 1/N).
For each sampling point, step forward to the CDF entry it falls in.
Collect the corresponding indices as the resampled particle indices.
Conceptually it is correct, and the functions systematic-resample and systematic-resample/normalized work together cleanly: the latter expects plain (linear-domain) weights that already sum to one, and the former performs exactly that conversion from log-weights.
One numerical edge case is worth flagging, though: if floating-point rounding left the final CDF entry strictly below the largest sampling point, the inner (loop while (> u ...) ...) would never terminate, because the (min index (1- n)) clamp keeps re-reading that same final entry while index grows. With weights normalized via logsumexp this is unlikely, but it is exactly the kind of boundary condition the min/minf guards are dancing around.
withal@gmail.com
In `while (and (> u (aref cdf (min index (1- n)))))` isn't the `and` redundant? Could that be a sign the author meant to include an extra expression, but forgot?
You are correct that the and appears to be redundant in this case, as there is only one condition inside it. The line could be simplified to:
lisp
while (> u (aref cdf (min index (1- n))))
It's hard to say for certain whether the author intended to include an extra expression or not. It's possible that the and was included by habit or as a result of a refactoring process that removed a previous condition. Based on the context and the fact that the code appears to work correctly as-is, there doesn't seem to be any indication that an additional expression was intended but left out. The code should work as expected with or without the and.
withal@gmail.com
How would you code this without needing to repeat the expression `(min index (1- n))`?
Binding it once with let won't work here: the inner loop increments index, so a binding taken before the loop would go stale and the termination test would never change (the loop could spin forever). A cleaner way is to fold the bound into the inner loop's own termination test, so index simply never walks past the last valid position:
lisp
(defun systematic-resample/normalized (weights)
  (loop with n = (length weights)
        with last = (1- n)                       ; highest valid particle index
        with cdf = (coerce (loop for weight in weights
                                 sum weight into cumulative
                                 collect cumulative)
                           'vector)
        with index = 0
        repeat n
        ;; Pick a starting offset into the CDF
        for u = (random (/ 1d0 n)) then (+ u (/ 1d0 n))
        ;; Step forward through the CDF, never past the last entry
        do (loop while (and (< index last) (> u (aref cdf index)))
                 do (incf index))
        collect index))
Since the scan now stops at the last entry instead of being clamped after the fact, both (min index (1- n)) and the minf disappear, and the scan can no longer spin past the end of the CDF.
withal@gmail.com
It occurs to me you could also have created the cdf array with an extra, large enough, sentinel element at the end.
Yes, that works too: append one extra, effectively infinite entry to the cdf so the scan can never read out of bounds. One syntactic detail: collect is a loop keyword and cannot be called as a form inside a finally clause, so the sentinel has to be appended another way. For example:
lisp
(defun systematic-resample/normalized (weights)
  (loop with n = (length weights)
        with cdf = (coerce (loop for weight in weights
                                 sum weight into cumulative
                                 collect cumulative into entries
                                 ;; Sentinel: one extra, effectively infinite entry.
                                 finally (return
                                           (append entries
                                                   (list most-positive-double-float))))
                           'vector)
        with index = 0
        repeat n
        for u = (random (/ 1d0 n)) then (+ u (/ 1d0 n))
        do (loop while (> u (aref cdf index))
                 do (incf index))
        collect index))
With the sentinel in place, the aref can never go out of bounds, so the min and minf guards become unnecessary. One caveat: if rounding left the last real CDF entry below the largest sampling point, the scan would stop on the sentinel and collect n — an index one past the last particle — so this version trades the earlier infinite-loop edge case for an off-by-one edge case.
withal@gmail.com
That looks good. I wonder, would it be possible to replace the inner loop with some kind of division operation -- the logic here is sort of like what you'd do to divide (aref cdr index) by (/ 1d0 n).
A single division won't do it, because the CDF's steps are unequal — it is the sampling points u that are equally spaced, not the weights. What you can do is replace the linear scan with a binary search over the cdf. One caveat first: in the original, index persists across iterations and u only increases, so the linear scan does O(n) work in total over the whole outer loop — amortized O(1) per draw. A per-draw binary search costs O(log n), so it mainly pays off when the queried points are not visited in increasing order. Here's how it could look:
lisp
(defun lower-bound (array target)
  "Return the first index of sorted ARRAY whose element is >= TARGET."
  (let ((low 0)
        (high (length array)))
    (loop while (< low high)
          do (let ((mid (floor (+ low high) 2)))
               (if (< (aref array mid) target)
                   (setf low (1+ mid))
                   (setf high mid))))
    low))

(defun systematic-resample/normalized (weights)
  (loop with n = (length weights)
        with cdf = (coerce (loop for weight in weights
                                 sum weight into cumulative
                                 collect cumulative)
                           'vector)
        repeat n
        for u = (random (/ 1d0 n)) then (+ u (/ 1d0 n))
        ;; First CDF entry >= u, clamped in case rounding pushes u past the end.
        collect (min (lower-bound cdf u) (1- n))))
The comparison uses < rather than <=, so lower-bound returns the first index whose CDF value reaches u — the same index at which the original scan stops, since that scan halts as soon as (> u (aref cdf ...)) turns false. The final min clamps the floating-point edge case where u exceeds the last CDF entry.