Discovering the Fourier Transform: A Tutorial on Circulant Matrices, Circular Convolution, and the DFT
- Discrete Fourier Transform (DFT) arises naturally out of the analysis of circulant matrices
- DFT can be derived as the change of basis that simultaneously diagonalizes all circulant matrices
Given a class of matrices (like circulant matrices) or operators, it would be good to find a transformation, or change of basis, in which their matrix representations all share the same structure, such as diagonal.
- the simplest scenario is when a class of matrices can be simultaneously diagonalized
To expand on the approach, one can describe the class of operators using some underlying common symmetry
- for example, circulant matrices have a shift-invariance property with respect to circular shifts of vectors (they act in a shift-invariant way on periodic functions with period n)
- A basic shift operator generates the family of circulant matrices, and the eigenvalue problem for that shift operator yields the DFT!
In conclusion, this approach generalizes to the other related transforms (the Fourier transform, Fourier series, and the z-transform). They can all be discovered by understanding the class of shift-invariant operators involved, and then identifying the basic eigenvalue problem whose solutions yield the correct transform.
Goal: understand the symmetry properties of circulant matrices (Approach 2)
Before going into symmetry, let's understand some properties of the set Z_n := {0, 1, ..., n − 1}, the integers mod n:
Consider the set of all integers Z. Treat any two integers k and l such that k − l is a multiple of n as members of the same equivalence class. With this equivalence relation (congruence mod n), the infinite set of integers Z becomes a finite set of n equivalence classes.
- the equivalence classes in columns
- the equivalence classes stacked
- equivalence classes condensed with equivalence relation, mod
- elements of Z_n can be arranged on a discrete circle so that the arithmetic in Z_n is identified with angle addition
Going into the 4th diagram in more detail, there is an isomorphism between Z_n and the nth roots of unity, {ρ_m}.
The complex numbers {ρ_m} lie on the unit circle, each at an angle of 2πm/n counter-clockwise from the real axis (m = 0, 1, ..., n−1). The mapping ρ_m → m is an isomorphism from complex multiplication on {ρ_m} to modular addition in Z_n: ρ_k · ρ_l = ρ_((k+l) mod n).
To see how modular arithmetic hides inside the complex numbers, think about the sequence i^0, i^1, i^2, i^3, i^4, i^5, etc. It goes 1, i, −1, −i, 1, i, etc., repeating forever. The fact that i^3 · i^3 = i^2 in the complex numbers corresponds in a very precise way to the fact that 3 + 3 = 2 in mod-4 arithmetic!
Therefore, you can generalize "modular arithmetic" to the complex numbers.
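This correspondence can be checked numerically; a minimal NumPy sketch (the choice n = 4 is just for illustration):

```python
import numpy as np

n = 4
# nth roots of unity: rho_m = e^(2*pi*i*m/n), m = 0, 1, ..., n-1
rho = np.exp(2j * np.pi * np.arange(n) / n)

# multiplying roots of unity corresponds to adding their indices mod n:
# rho_k * rho_l == rho_{(k + l) mod n}
for k in range(n):
    for l in range(n):
        assert np.isclose(rho[k] * rho[l], rho[(k + l) % n])

# the i^3 * i^3 = i^2 example: for n = 4, rho_1 = i, and 3 + 3 = 2 (mod 4)
assert np.isclose(rho[3] * rho[3], rho[2])
```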
Visualizing the n-vector x = [0, 1, ..., n−1]:
A function on Z_n can be thought of as a periodic function (with period n)
Now, let's go into the symmetry property of circulant matrices, so we can explain Approach 2 mentioned above:
Let S and its adjoint S* be the circular shift operators defined by the following action on vectors:
- S = circular right-shift operator: (S x)_k = x_((k−1) mod n), i.e. S(x_0, x_1, ..., x_(n−1)) = (x_(n−1), x_0, ..., x_(n−2))
- S* = circular left-shift operator: (S* x)_k = x_((k+1) mod n)
S* is the inverse of S, and is the adjoint of S (since S is real, S* is the transpose of S)
In fact, a matrix M is circulant iff it commutes with the circular shift operator S, i.e. SM = MS.
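The shift operator and the commutation property can be verified directly; a small NumPy sketch (the vector [3, 1, 4, 1] is an arbitrary example):

```python
import numpy as np

n = 4
# circular right-shift operator S: (S x)_i = x_{(i-1) mod n}
S = np.roll(np.eye(n), 1, axis=0)

x = np.array([3.0, 1.0, 4.0, 1.0])
assert np.allclose(S @ x, [1.0, 3.0, 1.0, 4.0])  # x shifted right by one position

# S* (here just the transpose, since S is real) undoes S: S* S = I
assert np.allclose(S.T @ S, np.eye(n))

# a circulant matrix built from x: C[i, j] = x[(i - j) % n]
C = np.array([[x[(i - j) % n] for j in range(n)] for i in range(n)])
assert np.allclose(S @ C, C @ S)  # circulant matrices commute with S
```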
- Goal: Understand Approach 1, by deriving the DFT as a byproduct of diagonalizing circulant matrices
Any two diagonalizable matrices that commute share a common set of eigenvectors, but possibly different eigenvalues. All circulant matrices mutually commute, so they all share the same eigenvectors. Therefore, if we find the eigenvectors of any one particular circulant matrix, then we have found the eigenvectors of all circulant matrices! We can then simultaneously diagonalize all circulant matrices with one transformation built from those eigenvectors.
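To illustrate the claim, a NumPy sketch (with an assumed helper `circulant` and random example vectors) checking that two circulant matrices commute and that the Fourier vectors are eigenvectors of both:

```python
import numpy as np

n = 5
rng = np.random.default_rng(0)

def circulant(c):
    # circulant matrix with first column c: C[i, j] = c[(i - j) % n]
    return np.array([[c[(i - j) % n] for j in range(n)] for i in range(n)])

A = circulant(rng.standard_normal(n))
B = circulant(rng.standard_normal(n))

# any two circulant matrices commute...
assert np.allclose(A @ B, B @ A)

# ...and they share eigenvectors: each Fourier vector w_m works for both
for m in range(n):
    w = np.exp(2j * np.pi * m * np.arange(n) / n)
    for M in (A, B):
        lam = (M @ w)[0] / w[0]        # candidate eigenvalue from the first entry
        assert np.allclose(M @ w, lam * w)
```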
But what circulant matrix should we use for eigenvector decomposition?
The shift operator, S, is in some sense the most fundamental circulant matrix, and is therefore a good candidate for an eigenvector/eigenvalue decomposition. Of S and S*, we will choose S*, since this will end up yielding the classically defined DFT:
Let w be an eigenvector (with eigenvalue λ) of the shift operator S*. Note that it is also an eigenvector (with eigenvalue λ^l) of any power (S*)^l of S*. (An n × n shift matrix loops back to the original configuration, the identity matrix, after n shifts: (S*)^n = I.)
So when we repeat the shift n times, we have (S*)^n w = λ^n w; since (S*)^n = I, this gives w = λ^n w. Because w ≠ 0, this implies that λ^n = 1! In other words, any eigenvalue of S* (or S) must be an nth root of unity.
Therefore, we have discovered that the n eigenvalues of S* (and of S) are precisely the n distinct nth roots of unity {ρ_m}. Moreover, the eigenvector equation (S* w)_l = w_((l+1) mod n) = λ w_l shows that every entry of w can be written in terms of the first entry, w_0: w_l = λ^l w_0. Then, plug in the nth roots of unity we found for λ: w_l = ρ_m^l w_0.
w_0 is a scalar (it's the first entry of the eigenvector w), and since eigenvectors are only unique up to multiplication by a scalar, we can set w_0 = 1 for a more compact expression for the eigenvector: w_l = ρ_m^l
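We can confirm numerically that these root-of-unity vectors are eigenvectors of S*; a minimal NumPy sketch (n = 8 chosen arbitrarily):

```python
import numpy as np

n = 8
# circular left-shift operator S*: (S* x)_l = x_{(l+1) mod n}
S_star = np.roll(np.eye(n), -1, axis=0)

for m in range(n):
    rho_m = np.exp(2j * np.pi * m / n)         # an nth root of unity
    w = rho_m ** np.arange(n)                  # eigenvector candidate: w_l = rho_m^l (w_0 = 1)
    assert np.allclose(S_star @ w, rho_m * w)  # S* w = rho_m w
```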
- a Toeplitz matrix with the additional property that each row wraps around: each row is a circular right-shift of the row above it
- completely determined by any one of its rows or columns
- circulant matrices share symmetry properties that enable this simultaneous diagonalization
- the eigenvalues of C_x are precisely the set of complex numbers {x_hat_k}, the DFT of the vector x that defines the circulant matrix C_x!
Toeplitz matrix
- a diagonal-constant matrix: each descending diagonal from left to right is constant
- not necessarily square
Given the n-vector x = [x_0, x_1, ..., x_(n-1)] as in the circulant matrix above, its DFT, x_hat, is given by: x_hat_k = Σ_(l=0)^(n−1) x_l e^(−2πikl/n), for k = 0, 1, ..., n−1
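This relationship between circulant matrices and the DFT can be checked in a few lines; a NumPy sketch (the vector x is an arbitrary example):

```python
import numpy as np

n = 4
x = np.array([3.0, 1.0, 4.0, 1.0])

# circulant matrix C_x with first column x: C[i, j] = x[(i - j) % n]
C = np.array([[x[(i - j) % n] for j in range(n)] for i in range(n)])

F = np.fft.fft(np.eye(n))        # the n x n DFT matrix (entries e^{-2*pi*i*k*l/n})
D = F @ C @ np.linalg.inv(F)     # change of basis into the Fourier basis

# C becomes diagonal, and its diagonal is exactly the DFT of x
assert np.allclose(D, np.diag(np.fft.fft(x)))
```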
eigen
- from the German word eigen, meaning "proper" or "characteristic"
- commonly used for stability analysis, vibration analysis, facial recognition, and matrix diagonalization
- points along the horizontal axis do not move when a shear transformation is applied
- any vector that points directly to the right or left with no vertical component is an eigenvector of this transformation, because the mapping does not change its direction
- these eigenvectors all have λ = 1 because the mapping does not change their length
- a non-zero vector that changes by only a scalar factor when that linear transformation is applied to it
- if T is a linear transformation from a vector space V over a field F into itself and v is a vector in V that is not the zero vector, then v is an eigenvector of T if T(v) is a scalar multiple of v. This condition can be written as the equation T(v) = λv, where λ is a scalar in F known as the eigenvalue
- an eigenvector v of a linear transformation T is a non-zero vector that, when T is applied to it, does not change direction
- eigenvalues are found by solving the characteristic equation, i.e. finding where det(A − λI) = 0
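Tying the shear example to the characteristic-equation recipe, a small NumPy sketch (the 2×2 shear matrix is an assumed example):

```python
import numpy as np

# shear matrix: horizontal vectors are eigenvectors with eigenvalue 1
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# characteristic polynomial det(lambda*I - A) = (lambda - 1)^2, so lambda = 1 (twice)
coeffs = np.poly(A)                 # coefficients of the characteristic polynomial
assert np.allclose(np.roots(coeffs), [1.0, 1.0])

# the horizontal unit vector is unchanged by the shear
v = np.array([1.0, 0.0])
assert np.allclose(A @ v, v)
```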
determinant
- the scaling factor which says how much a linear transformation, or change of basis, scales area
shear mapping
- a linear map that displaces each point in a fixed direction, by an amount proportional to its signed distance from a line that is parallel to that direction
- applying a shear map to a set of points of the plane will change all angles between them (except straight angles), and the length of any line segment that is not parallel to the direction of displacement
- will usually distort the shape of a geometric figure (like turning squares into non-square parallelograms and circles into ellipses)
eigenvalue
- the scalar λ by which the corresponding eigenvector is scaled when the transformation is applied
simultaneously diagonalizable
- there exists a single invertible matrix P such that P^(−1)AP is a diagonal matrix for every A in the set, i.e. a set of matrices is simultaneously diagonalizable iff they share a full set of eigenvectors
- alternatively, a set of matrices is simultaneously diagonalizable iff it is a commuting set of diagonalizable matrices
- therefore, given a mutually commuting set of matrices, by finding their shared eigenvectors, one finds the special transformation that simultaneously diagonalizes all of them
- in a diagonalization P^(−1)AP = D, the columns of P are the eigenvectors and the entries on the diagonal of D are the eigenvalues
To change from our coordinate system (i, j) to a different coordinate system (transformed i, j):
- take the coordinates of the vectors you want to use as a new basis (choose eigenvectors)
- create change of basis matrix from eigenvectors and call this matrix, P
- inverse change of basis matrix × matrix to transform × change of basis matrix = the same transformation, but expressed in the new coordinate system (P^(−1)AP)
- this guarantees that the new matrix is diagonal, with the eigenvalues on the diagonal!
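The recipe above can be sketched in NumPy (the symmetric matrix A is an arbitrary example with a full set of eigenvectors):

```python
import numpy as np

# an arbitrary symmetric example matrix (guaranteed diagonalizable)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, P = np.linalg.eig(A)    # columns of P are the eigenvectors (change of basis matrix)
D = np.linalg.inv(P) @ A @ P     # P^(-1) A P: the transformation in the new coordinate system

# the result is diagonal, with the eigenvalues on the diagonal
assert np.allclose(D, np.diag(eigvals))
```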
- two matrices A and B are said to commute if AB = BA
- a set of matrices A_1, ..., A_k is said to commute if they commute pairwise, meaning that every pair of matrices in the set commutes with each other
- when the elements of some set S have a notion of equivalence (formalized as an equivalence relation) defined on them, then one may naturally split the set S into equivalence classes
- a bijective homomorphism from one structure to the other (a homomorphism or morphism that can be reversed by an inverse morphism)
- for example, log(exp x) = x and exp(log y) = y show that log and exp are inverses of each other; since log is a homomorphism with an inverse that is also a homomorphism, log is an isomorphism of groups
- "unity" is another name for the number 1, so an nth root of unity is an nth root of 1