Discovering the Fourier Transform: A Tutorial on Circulant Matrices, Circular Convolution, and the DFT
- Discrete Fourier Transform (DFT) arises naturally out of the analysis of circulant matrices
- DFT can be derived as the change of basis that simultaneously diagonalizes all circulant matrices
Given a class of matrices (like circulant matrices) or operators, it would be good to find a transformation, or change of basis, in which their matrix representations all share the same structure, such as diagonal.
- the simplest scenario is when a class of matrices can be simultaneously diagonalized
To expand on the approach, one can describe the class of operators using some underlying common symmetry
- for example, circulant matrices have a shift-invariance property with respect to circular shifts of vectors (they act in a shift-invariant way on periodic functions with period n)
- A basic shift operator generates the family of circulant matrices, and the eigenvalue problem for that shift operator yields the DFT!
In conclusion, this approach generalizes to the other related transforms (the Fourier transform, Fourier series, and the z-transform). They can all be discovered by understanding the class of shift-invariant operators involved, and then identifying the basic eigenvalue problem whose solutions yield the correct transform.
Goal: understand the symmetry properties of circulant matrices (Approach 2)
Before going into symmetry, let's understand some properties of the set Z_n := {0, 1, ..., n − 1}, the integers mod n:
Consider the set of all integers Z. Treat any two integers k and l such that k − l is a multiple of n as members of the same equivalence class. With this equivalence relation (congruence mod n), the infinite set of integers Z becomes a finite set of n equivalence classes.
- the equivalence classes in columns
- the equivalence classes stacked
- equivalence classes condensed with equivalence relation, mod
- elements of Z_n can be arranged on a discrete circle so that the arithmetic in Z_n is identified with angle addition
Going into the 4th diagram in more detail, there is an isomorphism between Z_n and the nth roots of unity, {ρ_m}.
The complex numbers {ρ_m} lie on the unit circle, each at an angle of 2πm/n counter-clockwise from the real axis (m = 0, 1, ..., n−1). The mapping ρ_m → m is an isomorphism from complex multiplication on {ρ_m} to modular addition in Z_n: ρ_k · ρ_l = ρ_((k+l) mod n).
To see how modular arithmetic hides inside the complex numbers, think about the sequence i^0, i^1, i^2, i^3, i^4, i^5, etc. It goes 1, i, −1, −i, 1, i, etc., repeating forever. The fact that i^3 · i^3 = i^2 in the complex numbers corresponds in a very precise way to the fact that 3 + 3 = 2 in mod-4 arithmetic!
Therefore, you can generalize "modular arithmetic" to the complex numbers.
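This correspondence can be checked numerically; a minimal NumPy sketch (the choice n = 4 is just for illustration):

```python
import numpy as np

n = 4
# nth roots of unity: rho_m = e^(2*pi*i*m/n), m = 0, 1, ..., n-1
rho = np.exp(2j * np.pi * np.arange(n) / n)

# multiplying roots of unity corresponds to adding their indices mod n:
# rho_k * rho_l == rho_{(k + l) mod n}
for k in range(n):
    for l in range(n):
        assert np.isclose(rho[k] * rho[l], rho[(k + l) % n])

# the i^3 * i^3 = i^2 example: for n = 4, rho_1 = i, and 3 + 3 = 2 (mod 4)
assert np.isclose(rho[3] * rho[3], rho[2])
```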
Visualizing the n-vector x = [0, 1, ..., n−1]:
A function on Z_n can be thought of as a periodic function (with period n)
Now, let's go into the symmetry property of circulant matrices, so we can explain Approach 2 mentioned above:
Let S and its adjoint S* be the circular shift operators defined by the following action on vectors:
- S = circular right-shift operator: (S x)_k = x_((k−1) mod n), i.e. S(x_0, x_1, ..., x_(n−1)) = (x_(n−1), x_0, ..., x_(n−2))
- S* = circular left-shift operator: (S* x)_k = x_((k+1) mod n)
S* is the inverse of S, and is the adjoint of S (since S is real, S* is the transpose of S)
In fact, a matrix M is circulant iff it commutes with the circular shift operator S, i.e. SM = MS.
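The shift operator and the commutation property can be verified directly; a small NumPy sketch (the vector [3, 1, 4, 1] is an arbitrary example):

```python
import numpy as np

n = 4
# circular right-shift operator S: (S x)_i = x_{(i-1) mod n}
S = np.roll(np.eye(n), 1, axis=0)

x = np.array([3.0, 1.0, 4.0, 1.0])
assert np.allclose(S @ x, [1.0, 3.0, 1.0, 4.0])  # x shifted right by one position

# S* (here just the transpose, since S is real) undoes S: S* S = I
assert np.allclose(S.T @ S, np.eye(n))

# a circulant matrix built from x: C[i, j] = x[(i - j) % n]
C = np.array([[x[(i - j) % n] for j in range(n)] for i in range(n)])
assert np.allclose(S @ C, C @ S)  # circulant matrices commute with S
```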
- Goal: Understand Approach 1, by deriving the DFT as a byproduct of diagonalizing circulant matrices
Any two diagonalizable matrices that commute share a common set of eigenvectors, but possibly different eigenvalues. All circulant matrices mutually commute, so they all share the same eigenvectors. Therefore, if we find the eigenvectors of any one particular circulant matrix, then we have found the eigenvectors of all circulant matrices! We can then simultaneously diagonalize all circulant matrices with one transformation built from those eigenvectors.
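To illustrate the claim, a NumPy sketch (with an assumed helper `circulant` and random example vectors) checking that two circulant matrices commute and that the Fourier vectors are eigenvectors of both:

```python
import numpy as np

n = 5
rng = np.random.default_rng(0)

def circulant(c):
    # circulant matrix with first column c: C[i, j] = c[(i - j) % n]
    return np.array([[c[(i - j) % n] for j in range(n)] for i in range(n)])

A = circulant(rng.standard_normal(n))
B = circulant(rng.standard_normal(n))

# any two circulant matrices commute...
assert np.allclose(A @ B, B @ A)

# ...and they share eigenvectors: each Fourier vector w_m works for both
for m in range(n):
    w = np.exp(2j * np.pi * m * np.arange(n) / n)
    for M in (A, B):
        lam = (M @ w)[0] / w[0]        # candidate eigenvalue from the first entry
        assert np.allclose(M @ w, lam * w)
```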
But what circulant matrix should we use for eigenvector decomposition?
The shift operator, S, is in some sense the most fundamental circulant matrix, and is therefore a good candidate for an eigenvector/eigenvalue decomposition. Of S and S*, we will choose S*, since this will end up yielding the classically defined DFT:
Let w be an eigenvector (with eigenvalue λ) of the shift operator S*. Note that it is also an eigenvector (with eigenvalue λ^l) of any power (S*)^l of S*. (An n × n shift matrix loops back to the original configuration, the identity matrix, after n shifts: (S*)^n = I.)
So when we repeat the shift n times, we have (S*)^n w = λ^n w; since (S*)^n = I, this gives w = λ^n w. Because w ≠ 0, this implies that λ^n = 1! In other words, any eigenvalue of S* (or S) must be an nth root of unity.
Therefore, we have discovered that the n eigenvalues of S* (and of S) are precisely the n distinct nth roots of unity {ρ_m}. Moreover, the eigenvector equation (S* w)_l = w_((l+1) mod n) = λ w_l shows that every entry of w can be written in terms of the first entry, w_0: w_l = λ^l w_0. Then, plug in the nth roots of unity we found for λ: w_l = ρ_m^l w_0.
w_0 is a scalar (it's the first entry of the eigenvector w), and since eigenvectors are only unique up to multiplication by a scalar, we can set w_0 = 1 for a more compact expression for the eigenvector: w_l = ρ_m^l
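We can confirm numerically that these root-of-unity vectors are eigenvectors of S*; a minimal NumPy sketch (n = 8 chosen arbitrarily):

```python
import numpy as np

n = 8
# circular left-shift operator S*: (S* x)_l = x_{(l+1) mod n}
S_star = np.roll(np.eye(n), -1, axis=0)

for m in range(n):
    rho_m = np.exp(2j * np.pi * m / n)         # an nth root of unity
    w = rho_m ** np.arange(n)                  # eigenvector candidate: w_l = rho_m^l (w_0 = 1)
    assert np.allclose(S_star @ w, rho_m * w)  # S* w = rho_m w
```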
- a Toeplitz matrix with the additional property that each row wraps around: each row is a circular right-shift of the row above it
- completely determined by any one of its rows or columns
- circulant matrices share symmetry properties that enable this simultaneous diagonalization
- the eigenvalues of C_x are precisely the set of complex numbers {x_hat_k}, the DFT of the vector x that defines the circulant matrix C_x!
Toeplitz matrix
- a diagonal-constant matrix: each descending diagonal from left to right is constant
- not necessarily square
Given the n-vector x = [x_0, x_1, ..., x_(n-1)] as in the circulant matrix above, its DFT, x_hat, is given by: x_hat_k = Σ_(l=0)^(n−1) x_l e^(−2πikl/n), for k = 0, 1, ..., n−1
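This relationship between circulant matrices and the DFT can be checked in a few lines; a NumPy sketch (the vector x is an arbitrary example):

```python
import numpy as np

n = 4
x = np.array([3.0, 1.0, 4.0, 1.0])

# circulant matrix C_x with first column x: C[i, j] = x[(i - j) % n]
C = np.array([[x[(i - j) % n] for j in range(n)] for i in range(n)])

F = np.fft.fft(np.eye(n))        # the n x n DFT matrix (entries e^{-2*pi*i*k*l/n})
D = F @ C @ np.linalg.inv(F)     # change of basis into the Fourier basis

# C becomes diagonal, and its diagonal is exactly the DFT of x
assert np.allclose(D, np.diag(np.fft.fft(x)))
```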
eigen
- from the German word eigen, meaning "proper" or "characteristic"
- commonly used for stability analysis, vibration analysis, facial recognition, and matrix diagonalization
- points along the horizontal axis do not move when a shear transformation is applied
- any vector that points directly to the right or left with no vertical component is an eigenvector of this transformation, because the mapping does not change its direction
- these eigenvectors all have λ = 1 because the mapping does not change their length
- a non-zero vector that changes by only a scalar factor when that linear transformation is applied to it
- if T is a linear transformation from a vector space V over a field F into itself and v is a vector in V that is not the zero vector, then v is an eigenvector of T if T(v) is a scalar multiple of v. This condition can be written as the equation T(v) = λv, where λ is a scalar in F known as the eigenvalue
- an eigenvector v of a linear transformation T is a non-zero vector that, when T is applied to it, does not change direction
- eigenvalues are found by solving the characteristic equation, i.e. finding where det(A − λI) = 0
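Tying the shear example to the characteristic-equation recipe, a small NumPy sketch (the 2×2 shear matrix is an assumed example):

```python
import numpy as np

# shear matrix: horizontal vectors are eigenvectors with eigenvalue 1
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

# characteristic polynomial det(lambda*I - A) = (lambda - 1)^2, so lambda = 1 (twice)
coeffs = np.poly(A)                 # coefficients of the characteristic polynomial
assert np.allclose(np.roots(coeffs), [1.0, 1.0])

# the horizontal unit vector is unchanged by the shear
v = np.array([1.0, 0.0])
assert np.allclose(A @ v, v)
```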
determinant
- the scaling factor which says how much a linear transformation, or change of basis, scales area
shear mapping
- a linear map that displaces each point in a fixed direction, by an amount proportional to its signed distance from a line that is parallel to that direction
- applying a shear map to a set of points of the plane will change all angles between them (except straight angles), and the length of any line segment that is not parallel to the direction of displacement
- will usually distort the shape of a geometric figure (like turning squares into non-square parallelograms and circles into ellipses)
eigenvalue
- the scalar λ by which the corresponding eigenvector is scaled when the transformation is applied
simultaneously diagonalizable
- there exists a single invertible matrix P such that P^(−1)AP is a diagonal matrix for every A in the set, i.e. a set of matrices is simultaneously diagonalizable iff they share a full set of eigenvectors
- alternatively, a set of matrices is simultaneously diagonalizable iff it is a commuting set of diagonalizable matrices
- therefore, given a mutually commuting set of matrices, by finding their shared eigenvectors, one finds the special transformation that simultaneously diagonalizes all of them
- in a diagonalization P^(−1)AP = D, the columns of P are the eigenvectors and the entries on the diagonal of D are the eigenvalues
To change from our coordinate system (i, j) to a different coordinate system (transformed i, j):
- take the coordinates of the vectors you want to use as a new basis (choose eigenvectors)
- create change of basis matrix from eigenvectors and call this matrix, P
- inverse change of basis matrix × matrix to transform × change of basis matrix = the same transformation, but expressed in the new coordinate system (P^(−1)AP)
- this guarantees that the new matrix is diagonal, with the eigenvalues on the diagonal!
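The recipe above can be sketched in NumPy (the symmetric matrix A is an arbitrary example with a full set of eigenvectors):

```python
import numpy as np

# an arbitrary symmetric example matrix (guaranteed diagonalizable)
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, P = np.linalg.eig(A)    # columns of P are the eigenvectors (change of basis matrix)
D = np.linalg.inv(P) @ A @ P     # P^(-1) A P: the transformation in the new coordinate system

# the result is diagonal, with the eigenvalues on the diagonal
assert np.allclose(D, np.diag(eigvals))
```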
- two matrices A and B are said to commute if AB = BA
- a set of matrices A_1, ..., A_k is said to commute if they commute pairwise, meaning that every pair of matrices in the set commutes with each other
- when the elements of some set S have a notion of equivalence (formalized as an equivalence relation) defined on them, then one may naturally split the set S into equivalence classes
- a bijective homomorphism from one structure to the other (a homomorphism or morphism that can be reversed by an inverse morphism)
- for example, log(exp x) = x and exp(log y) = y show that log and exp are inverses of each other; since log is a homomorphism with an inverse that is also a homomorphism, log is an isomorphism of groups
- "unity" is another name for the number 1, so an nth root of unity is an nth root of 1