Skip to content

Instantly share code, notes, and snippets.

@mschauer
Last active October 13, 2021 13:07
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
Star You must be signed in to star a gist
Save mschauer/d6a9814a7adf48131ad8f81f569a57f9 to your computer and use it in GitHub Desktop.
Non-parametric Bayesian regression in Fourier domain
n = 1000
x = range(0, 1, length=n)
ς = 1.5 # noise level
μ = 3*x.*sin.(2pi*x) # periodic signal in time domain
#μ = 6*sqrt.(abs.(x .- 0.5)) # this one is difficult to estimate
# Model: Signal distorted by white noise
y = μ + ς*randn(n)
# Prior power decay of prior samples in frequency domain
α = 2.0
# α = 3.0 for type integrated Brownian motion
# α = 2.0 for type Brownian motion prior
# α = 1.0 for a very rough prior (pink noise)
H =- 1)/2 # equivalent Hurst exponent
using FFTW, Statistics, LinearAlgebra, Plots
# Fourier matrix
# ℱ = [exp(2pi*im*i*j/n) for i in fftshift(-n÷2:n÷2-1), j in fftshift(-n÷2:n÷2-1)]
# Define prior on unit interval in frequency domain
# by choosing power decay
function freq_decay(n, α) #ifftshift(-n/2:n/2-1)
fr = 1 ./ fftfreq(n)
fr[1] = 2n # allow for intercept
fr = abs.(fr) .^/2) # scale frequencies to "unit
fr/norm(fr)*sqrt(n)
end
# Canonical posterior contraction rate
contract(n, H) = n^(-H/(1 + 2H))
# Compute posterior mean
γ = ς^(-2) .+ (freq_decay(n, α)).^(-2) # posterior precision in frequency domain
μ̂ = real(ifft.\(fft^(-2)*y))))
m = μ̂ + sqrt(n)*real(ifft(sqrt.(γ).\randn(Complex{Float64},n)));
# note: `fft(x)/sqrt(length(x))` is isometric
# Compute L2 estimation error, compare with theoretical size
@show norm(μ̂ - μ)/sqrt(n), ς*contract(n, H)
# We can even compute the posterior covariance
# Use circulant property of covariance in time domain, could be done with https://github.com/JuliaMatrices/ToeplitzMatrices.jl
σ = 1.96*sqrt.(abs.((ifft(diag(inv(Diagonal(γ)))))))[1]
# Equivalent to
# Γ = ℱ*Diagonal(freq_decay(n, α).^(-2))*ℱ'/n/n # ≈ ifft(Diagonal(n. + freq_decay(n, α).^(-2)))
# σ = 1.96sqrt.(real(diag(inv((I + Γ)))))
# Plot
p = scatter(x, y, markersize=0.5, label="obs")
plot!(x, μ̂, color=:blue, ribbon=σ,fillalpha=0.2, label="posterior mean")
plot!(x, μ, color=:green, label="truth")
z = sqrt(n)*real(ifft(randn(Complex{Float64},n).*abs.(freq_decay(n, α))));
plot!(x, z, label="prior sample")
plot!(x, m, label="posterior sample")
p
### A Pluto.jl notebook ###
# v0.16.1
using Markdown
using InteractiveUtils
# This Pluto notebook uses @bind for interactivity. When running this notebook outside of Pluto, the following 'mock version' of @bind gives bound variables a default value (instead of an error).
macro bind(def, element)
quote
local el = $(esc(element))
global $(esc(def)) = Core.applicable(Base.get, el) ? Base.get(el) : missing
el
end
end
# ╔═╡ 37146eed-cabd-4024-9e91-4b8fe15139a3
# Setup
begin
import Pkg
# careful: this is _not_ a reproducible environment
# activate the global environment
Pkg.activate()
Pkg.add("FFTW")
Pkg.add("Plots")
Pkg.add("PlutoUI")
using FFTW, Statistics, LinearAlgebra, PlutoUI, Plots, Random
end
# ╔═╡ ad88469d-06d5-4ec7-868f-611e999930b4
md"""
## Nonparametric Bayes and Fourier methods in Julia
##### Moritz Schauer `@mschauer`
This notebook takes a tour through non-parametric Bayesian regression in Fourier domain using Julia.
### Content:
* A non-parametric regression problem
* Changing perspective to Fourier domain
* Defining smoothing prior and computing the posterior
* Choosing the right smoothness
But before some notebook setup:
"""
# ╔═╡ 426085d1-1f7d-4186-a8dc-28ffd8fb5343
@bind resample Button("Resample observations!")
# ╔═╡ 0722b786-402d-4ba9-a427-8c87cbc3df24
md"Show hidden ground truth? $(@bind show_truth CheckBox())"
# ╔═╡ fb55fde1-7669-4f91-bc67-2a0172e5ae3e
md"""
## Fourier domain and prior
Taking the vector of observations ``y`` and ground truth vector ``f_i = f(x_i)``,
we can apply the discrete Fourier transformation ``\mathcal F`` to our linear equation
```math
\mathcal F y = \mathcal F (f + \eta)
```
In Julia, ``\mathcal F`` is `fft(x)/sqrt(length(x))`, but you can think of it as a matrix
```math
\mathcal F_{kl} = \exp(2\pi\cdot i \cdot (k l/n)).
```
We obtain the equation in frequency domain with ``\tilde y = \mathcal F y``, ``\tilde f = \mathcal F f`` etc.
```math
\tilde y_i = \tilde f_i + \tilde\eta_i
```
``\mathcal F`` is an ``L_2`` isometry and maps independend and normal noise on the observations to equivalent independent and normal noise in frequency domain,
so here
```math
\tilde\eta_i \sim N(0, \varsigma^2)
```
"""
# ╔═╡ 093506f2-5a52-4217-9dfb-d673ce8b4a29
# Fourier matrix
# ℱ(x) = [exp(2pi*im*i*j/n) for i in fftshift(-n÷2:n÷2-1), j in fftshift(-n÷2:n÷2-1)]*x
(x) = (fft(x)/sqrt(length(x)))
# ╔═╡ 7f5a4d79-1e64-4c99-a588-63bcca7267fe
md"""
We define our prior as
```math
\tilde f_i \sim N(0, \varpi_i^{-\frac{\alpha}{2}})
```
where ``\varpi_i`` is the frequency corresponding to Fourier coordinate ``\tilde f_i``. Julia's `fftfreq` helps with that.
Large values of ``\alpha`` imply higher smoothenss of prior samples.
For example:
* α = 3.0 for type integrated Brownian motion
* α = 2.0 for type Brownian motion prior
* α = 1.0 for a very rough prior (pink noise)
``\alpha`` can be translated into a Hurst parameter related to self similarity of the path under scaling of both axes
```math
H = (\alpha - 1)/2
```
Having a prior in frequency domains means we get samples from our prior for ``f`` via the inverse Fourier transform, either the matrix inverse ``{\mathcal F}^{-1}`` or `sqrt(n)*real(ifft(x))` in Julia for the real part.
"""
# ╔═╡ 012f54e1-01a3-4acd-b7db-d5d3f3bc7d87
# Define prior on unit interval in frequency domain
# by choosing power decay
function freq_decay(n, α) #ifftshift(-n/2:n/2-1)
fr = 1 ./ fftfreq(n)
fr[1] = 2n # allow for intercept
fr = abs.(fr) .^/2) # scale frequencies
fr/norm(fr)*sqrt(n)
end
# ╔═╡ e7c6b81f-202a-43e0-9bc0-ac5eb64060db
md"""
Prior power decay of prior samples in frequency domain
$(@bind α1 Slider(0.5:0.5:4.0))
"""
# ╔═╡ 0e4ea7bf-4eb1-4923-a2e8-a7e2e8bc33a5
@bind resample_prior Button("Resample prior!")
# ╔═╡ 12c188f4-5042-4cc4-9de5-89ea87c08c1e
md"""
## Bayesian regression in Fourier domain
If we have prior, model and data ``\tilde y`` in Fourier domain, we can compute the posterior distribution of the Fourier coefficients
``\tilde f_i``.
This gives a Gaussian posterior distribution for the signal in Fourier domain
```math
\tilde f \sim N(\tilde \mu, \tilde \Sigma) \text{ (a posteriori) }
```
with mean ``\tilde \mu`` and covariance ``\tilde \Sigma``.
With the inverse transform we estimate of ``f \approx \mathcal F^{-1} \tilde\mu`` and quantify uncertainty of the estimate as ``\Sigma = \mathcal F^{-1} \tilde \Sigma \mathcal F^{-1}``.
The marginal uncertainty (diagonal entries of ``\Sigma`` ) can be computed very efficiently using `ifft` aswell.
"""
# ╔═╡ 517c3961-c104-4811-ab4c-7e189761709b
# Equivalent to
# Γ = ℱ*Diagonal(freq_decay(n, α).^(-2))*ℱ'/n/n # ≈ ifft(Diagonal(n. + freq_decay(n, α).^(-2)))
# σ = 1.96sqrt.(real(diag(inv((I + Γ)))))
# ╔═╡ c51271a7-0614-445f-87ec-a5bc2fee1768
@bind n Slider(10:10:1000) # number of observations
# ╔═╡ 64feb3c2-919b-4246-9fbf-e836a0c80ed0
# Design points
x = range(0, 1, length=n);
# ╔═╡ c502c779-e776-490d-9ec7-02a3379aaa28
# Ground truth
f = 3*x.*sin.(2pi*x); # periodic signal in x (time domain)
# f = 6*sqrt.(abs.(x .- 0.5)) # this one is difficult to estimate
# ╔═╡ b7c5ee08-5d2d-4775-a232-472a033b59a1
begin
resample_prior
z1 = sqrt(n)*real(ifft(randn(Complex{Float64},n).*abs.(freq_decay(n, α1))));
plot(x, z1, label="prior sample")
end
# ╔═╡ a97a39ff-059e-4622-ae2e-219d8464ecb2
@bind ς Slider(0.5:0.5:2.0) # noise level
# ╔═╡ c29066a9-574d-4ce1-ba9b-abb44aeb8fa0
md"""
## A nonparametric regression problem
We are interested in recovering the function ``f`` from ``n=`` $(n) noisy observations ``y_i`` at points ``x_i``
```math
y_i = f(x_i) + \eta_i, \qquad \eta_i \sim N(0,\varsigma^2),
```
where
``\varsigma =`` $ς is the noise level.
* You can choose the values for ``n`` and ``\varsigma`` with the sliders below!
"""
# ╔═╡ a4ff81fe-a9cf-4789-aa68-5838c58c1f69
# Model: Signal distorted by white noise
begin
resample
y = f + ς*randn(n)
end;
# ╔═╡ 873ba234-b4b1-4678-8c67-086cd3b149bc
begin
p1 = scatter(x, y, markersize=10/sqrt(n), label="obs", title="Data")
show_truth && plot!(p1, x, f, color=:green, label="truth")
p1
end
# ╔═╡ 0071252a-42cb-4967-b426-7d28ce054ba1
begin
p2 = scatter(x, real.((y)), markersize=10/sqrt(n), color=:black, label="obs", title="Truth and observation in frequency domain")
scatter!(p2, x, imag.((y)), markersize=10/sqrt(n), color=:black, alpha=0.1, label="obs (imag)", title="Truth and observation in frequency domain")
plot!(p2, x, real.((f)), color=:green, label="truth")
plot!(p2, x, imag.((f)), color=:green, alpha=0.3, label="truth (im)")
p2
end
# ╔═╡ af6df9cd-cb2f-48de-9697-db23bb7e3804
# Prior power decay of prior samples in frequency domain
@bind α Slider(0.5:0.5:2.0)
# ╔═╡ 5a475143-5078-4033-84cc-ec6abd3c86c4
md"Chosen prior smoothness α = $(α1), H = $(H = (α - 1)/2)"
# ╔═╡ 9c48ca1f-7ab3-4b72-a9bd-4b0d6c1ee8db
# posterior precision in frequency domain
γ = ς^(-2) .+ (freq_decay(n, α)).^(-2);
# ╔═╡ fb008be0-b823-4eaf-b2c7-eb189041f0c8
# posterior mean
μ̂ = real(ifft.\(fft^(-2)*y))))
# ╔═╡ e0e75144-aa8f-433e-90c5-bfa9bbdfac17
# We can even compute the posterior covariance
# Use circulant property of covariance in time domain, could be done with https://github.com/JuliaMatrices/ToeplitzMatrices.jl
σ = 1.96*sqrt.(abs.((ifft(diag(inv(Diagonal(γ)))))))[1]
# ╔═╡ 8ba316ed-2c77-4ddf-9e57-e673b4ef00ec
@bind resample_post Button("New posterior sample!")
# ╔═╡ 23bfdb98-1001-4f49-a841-292ebe4c0246
# posterior sample
begin
resample_post
m = μ̂ + sqrt(n)*real(ifft(sqrt.(γ).\randn(Complex{Float64},n)))
end;
# ╔═╡ 6615b4b2-26fc-4a80-9576-810c7f919270
md"Show truth? $(@bind show_truth2 CheckBox())"
# ╔═╡ c78bd411-c9f4-4e1f-8c96-ea97480d9cd9
begin
global p = scatter(x, y, markersize=10/sqrt(n), label="obs")
plot!(x, μ̂, color=:blue, ribbon=σ, fillalpha=0.2, label="posterior mean and band")
show_truth2 && plot!(x, f, color=:green, alpha=0.8, label="truth")
resample_post
plot!(x, m, label="posterior sample")
end
# ╔═╡ 09af6b8a-73a1-4e87-8a89-ef7ac72c72b4
md"""
## What is the right choice of ``\alpha``
That will depend on the smoothness of the truth we want to estimate. If we know that apriori ``f`` is very smooth, we prefer larger values of ``\alpha``
Dutch Bayesians (and I should count myself as one) like to think about how to choose ``\alpha`` in an optimal way.
One way to reason about this, is to investigate the frequentist/asymptotic properties of Bayesian procedures (insert Galaxy brain meme).
For the example one finds that the posterior puts most of it's mass near the true function ``f`` with increasing probability if ``n`` becomes large.
This happens at a rate depending on ``\alpha`` as long as we match ``\alpha`` to the smoothness of the underlying truth.
```math
e = \varsigma n^{-H/(1 + 2H)}
```
Let's compute our ``L_2`` estimation error and compare with theoretical size
* `norm(μ̂ - f)/sqrt(n)` = $(norm(μ̂ - f)/sqrt(n))
* `e` = $(ς*n^(-H/(1 + 2H)))
Harry van Zanten's summer school slides are a nice starting point:
* https://warwick.ac.uk/fac/sci/statistics/crism/workshops/masterclassapril/
* https://fdnss.files.wordpress.com/2016/03/vanzanten-slides.pdf
"""
# ╔═╡ Cell order:
# ╠═ad88469d-06d5-4ec7-868f-611e999930b4
# ╠═37146eed-cabd-4024-9e91-4b8fe15139a3
# ╠═c29066a9-574d-4ce1-ba9b-abb44aeb8fa0
# ╠═64feb3c2-919b-4246-9fbf-e836a0c80ed0
# ╠═c502c779-e776-490d-9ec7-02a3379aaa28
# ╠═a4ff81fe-a9cf-4789-aa68-5838c58c1f69
# ╠═426085d1-1f7d-4186-a8dc-28ffd8fb5343
# ╠═0722b786-402d-4ba9-a427-8c87cbc3df24
# ╠═873ba234-b4b1-4678-8c67-086cd3b149bc
# ╠═fb55fde1-7669-4f91-bc67-2a0172e5ae3e
# ╠═093506f2-5a52-4217-9dfb-d673ce8b4a29
# ╠═0071252a-42cb-4967-b426-7d28ce054ba1
# ╠═7f5a4d79-1e64-4c99-a588-63bcca7267fe
# ╠═012f54e1-01a3-4acd-b7db-d5d3f3bc7d87
# ╠═e7c6b81f-202a-43e0-9bc0-ac5eb64060db
# ╠═5a475143-5078-4033-84cc-ec6abd3c86c4
# ╠═0e4ea7bf-4eb1-4923-a2e8-a7e2e8bc33a5
# ╠═b7c5ee08-5d2d-4775-a232-472a033b59a1
# ╠═12c188f4-5042-4cc4-9de5-89ea87c08c1e
# ╠═9c48ca1f-7ab3-4b72-a9bd-4b0d6c1ee8db
# ╠═fb008be0-b823-4eaf-b2c7-eb189041f0c8
# ╠═23bfdb98-1001-4f49-a841-292ebe4c0246
# ╠═e0e75144-aa8f-433e-90c5-bfa9bbdfac17
# ╠═517c3961-c104-4811-ab4c-7e189761709b
# ╠═c51271a7-0614-445f-87ec-a5bc2fee1768
# ╠═a97a39ff-059e-4622-ae2e-219d8464ecb2
# ╠═af6df9cd-cb2f-48de-9697-db23bb7e3804
# ╠═8ba316ed-2c77-4ddf-9e57-e673b4ef00ec
# ╠═6615b4b2-26fc-4a80-9576-810c7f919270
# ╠═c78bd411-c9f4-4e1f-8c96-ea97480d9cd9
# ╠═09af6b8a-73a1-4e87-8a89-ef7ac72c72b4
@mschauer
Copy link
Author

mschauer commented May 26, 2021

npbfourier

@mschauer
Copy link
Author

mschauer commented May 26, 2021

Fixed the marginal 2sigma credibility band

@mschauer
Copy link
Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment