31.2 Fourier analysis
In homage to “The Hobbit”, let us call this section “Fourier Analysis, or There and Back Again”.
31.2.1 Convolution
The convolution of two vectors $a$ and $b$ is defined as the vector $a * b$ with entries
$$(a * b)_k = \sum_{i + j = k} a_i b_j,$$
that is, the $k$th member of $a * b$ is formed by multiplying (and summing) all the members of $a$ and $b$ whose indices sum to $k$. Note that we define $a_i = 0$ and $b_j = 0$ for all indices which are not part of the vector.
This is quite a useful operation; for example, you may recognise it from the multiplication of polynomials, where we have

Example 31.2.1
Multiplication of polynomials
$$\left( \sum_{i=0}^{n} a_i x^i \right) \left( \sum_{j=0}^{m} b_j x^j \right) = \sum_{k=0}^{n+m} (a * b)_k x^k,$$
which should feel natural because every term in the product is formed by multiplying one term in $a$ with one term in $b$; that is, each term in the result is formed from
$$a_i x^i \cdot b_j x^j = a_i b_j x^{i+j},$$
and our sum just collects all the $i$ and $j$ which sum to give $k$.
Another example comes from probability: for example, consider the probability that the sum of two $n$-sided dice is $k$.
Example 31.2.2
Probability of the sum of dice
If $p_i$ denotes the probability that the first die shows $i$ and $q_j$ the probability that the second die shows $j$, then
$$\Pr[\text{sum} = k] = \sum_{i + j = k} p_i q_j = (p * q)_k,$$
so the distribution of the sum is exactly the convolution of the two individual distributions.
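To make the definition concrete, here is a minimal Python sketch (the function name `convolve` and all example values are mine, not from the text) which computes a convolution straight from the definition and applies it to the polynomial example above and to a pair of fair six-sided dice:

```python
def convolve(a, b):
    """Convolution from the definition: (a * b)_k collects every product
    a_i * b_j with i + j = k; indices outside the vectors count as zero."""
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[i + j] += ai * bj
    return result

# Example 31.2.1: (1 + 2x)(3 + x) = 3 + 7x + 2x^2
print(convolve([1, 2], [3, 1]))  # [3, 7, 2]

# Example 31.2.2: p[i] is the probability a fair die shows i + 1, so
# (p * p)[k] is the probability that the sum of two dice is k + 2.
p = [1 / 6] * 6
print(convolve(p, p)[5])  # probability of a sum of 7: 6/36 ≈ 0.1667
```

Note that the nested loop makes this $O(nm)$; the whole point of the coming sections is to do better.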
Convolution is a very useful operation – it turns out that it is also an operation that can be computed efficiently. We will see that the periodic convolution (in terms of which we can write the general convolution just introduced in this section) can be calculated very quickly using the Fourier transform (in a later section).
31.2.2 Periodic convolution
Sometimes we encounter “periodic” sequences. For example the sequence
$$\dots, 1, 2, 3, 1, 2, 3, 1, 2, 3, \dots$$
is periodic because in general
$$a_{i+3} = a_i,$$
that is, it repeats every $3$ terms. More generally, a sequence $a$ is $N$-periodic if $a_{i+N} = a_i$ for all $i$. The nice thing about periodic sequences is that although they are infinite, we only need to look at a finite portion of them to derive results which apply to the rest of the sequence.
Definition 31.2.1
Periodic convolution.
Let $a$ and $b$ be $N$-periodic integer sequences. Then the periodic convolution of $a$ and $b$ is an $N$-periodic integer sequence $a \circledast b$ given by
$$(a \circledast b)_k = \sum_{i=0}^{N-1} a_i b_{k-i}.$$
We can also define the periodic convolution of two vectors (and not just infinite sequences) with (almost) the same formula
$$(a \circledast b)_k = \sum_{i=0}^{N-1} a_i b_{(k-i) \bmod N},$$
where $\bmod$ refers to modulo arithmetic (i.e. $5 \bmod 3 = 2$, and $-1 \bmod 3 = 2$) which is explored in more detail in Chapter 22.2.
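As a sketch of the vector version (the function name is mine), the definition translates almost directly into Python, with `%` performing the modular index arithmetic:

```python
def periodic_convolve(a, b):
    """Periodic convolution of two length-N vectors:
    (a ⊛ b)_k = sum over i of a_i * b_((k - i) mod N)."""
    n = len(a)
    assert len(b) == n, "both vectors must have the same length N"
    return [sum(a[i] * b[(k - i) % n] for i in range(n)) for k in range(n)]

print(periodic_convolve([1, 2, 3], [4, 5, 6]))  # [31, 31, 28]
```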
We are interested in discrete convolution because it is useful in computing interesting things, and we are interested in periodic convolution because of these two attributes (which we will show later):

• the discrete convolution of two sequences can be computed using only the periodic convolution
• the periodic convolution of two vectors can be computed very quickly using the Fourier transform
The first one we will show right now; let $a$ be an $n$-dimensional vector and let $b$ be an $m$-dimensional vector. We will then seek to find some modified vectors $\tilde{a}$, $\tilde{b}$ and a natural number $N$ such that
$$a * b = \tilde{a} \circledast \tilde{b}.$$
We know that $a * b$ is an $(n + m - 1)$-dimensional vector. Therefore our periodic convolution must be a periodic sequence with period at least $n + m - 1$. It turns out that if we guess
$$N = n + m - 1 \tag{31.75}$$
we will get lucky. We want $\tilde{a}$ to obey the formula
$$\tilde{a}_i = \begin{cases} a_i & 0 \le i < n \\ 0 & n \le i < N, \end{cases}$$
that is, the $i$th element of $\tilde{a}$ will be $a_i$ wherever that is defined and $0$ otherwise, and $\tilde{b}$ will just be $b$ with zeroes padded on the end.
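Here is a quick numerical check of this padding trick as a self-contained Python sketch (the example vectors are my own choice; the padding length comes straight from Equation 31.75):

```python
def convolve(a, b):
    """Ordinary convolution, directly from the definition."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def periodic_convolve(a, b):
    """Periodic convolution of two length-N vectors."""
    n = len(a)
    return [sum(a[i] * b[(k - i) % n] for i in range(n)) for k in range(n)]

# Pad a and b with zeroes up to length N = n + m - 1 (Equation 31.75) and
# check that the periodic convolution reproduces the ordinary one.
a, b = [1, 2], [3, 1, 4]
N = len(a) + len(b) - 1
a_pad = a + [0] * (N - len(a))
b_pad = b + [0] * (N - len(b))
print(convolve(a, b))                   # [3, 7, 6, 8]
print(periodic_convolve(a_pad, b_pad))  # [3, 7, 6, 8] -- the same
```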
31.2.3 The Fourier series
A Taylor series is an infinite power series such that
$$f(x) = \sum_{n=0}^{\infty} a_n x^n,$$
and one way to think about this is that as we add more and more terms to our series we get a better and better approximation for $f$. In the discrete (i.e. finite) case we can fit an $n$-degree polynomial to $n + 1$ points. In the infinite case we can perfectly represent a function (provided it is analytic, which is a little stronger than being infinitely differentiable).
Another way to interpret this is as the projection of a function onto the set of polynomials.
But why restrict ourselves to polynomials? We can also represent functions in terms of trigonometric functions. For example, we can write a function $f$ as
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \big( a_n \cos(nx) + b_n \sin(nx) \big).$$
We are essentially writing $f$ as a polynomial in $\cos(x)$ and $\sin(x)$. It helps to remember that we can write $\cos(nx)$ as an $n$th degree polynomial in $\cos(x)$, and similarly for $\sin(nx)$ (see Example 14.7.2 for the $\cos(nx)$ case if you’re not sure).
To find the coefficients we can multiply both sides by $\cos(mx)$ or $\sin(mx)$ and integrate. For example, to find the coefficient $a_m$ we can integrate both sides against $\cos(mx)$ between $-\pi$ and $\pi$, i.e.
$$\int_{-\pi}^{\pi} f(x)\cos(mx)\,dx = \int_{-\pi}^{\pi} \frac{a_0}{2}\cos(mx)\,dx + \sum_{n=1}^{\infty} \left( a_n \int_{-\pi}^{\pi} \cos(nx)\cos(mx)\,dx + b_n \int_{-\pi}^{\pi} \sin(nx)\cos(mx)\,dx \right) \tag{31.81}$$
$$= a_m \int_{-\pi}^{\pi} \cos^2(mx)\,dx \tag{31.82}$$
$$= \pi a_m, \tag{31.83}$$
and it turns out that $\cos(nx)$ and $\sin(mx)$ (and likewise $\cos(nx)$ and $\cos(mx)$ for $n \neq m$) are orthogonal when considering integration as the inner product. That is to say, most of the integrals above are zero.
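If you want to convince yourself of this orthogonality numerically, here is a small sketch (the step count of 10,000 is an arbitrary choice of mine) which approximates the inner-product integrals with a Riemann sum:

```python
import math

def inner(f, g, steps=10_000):
    """Approximate the inner product ∫_{-π}^{π} f(x) g(x) dx with a Riemann sum."""
    dx = 2 * math.pi / steps
    return sum(f(-math.pi + k * dx) * g(-math.pi + k * dx) for k in range(steps)) * dx

cos2 = lambda x: math.cos(2 * x)
cos3 = lambda x: math.cos(3 * x)
sin2 = lambda x: math.sin(2 * x)

print(inner(cos2, cos3))  # ≈ 0: different frequencies are orthogonal
print(inner(cos2, sin2))  # ≈ 0: cos and sin are orthogonal
print(inner(cos2, cos2))  # ≈ π ≈ 3.14159, matching the a_m computation above
```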
31.2.4 There…
Let us now turn to the discrete case with equally spaced points. Instead of a function $f$, let us assume that we have a series of points
$$y_0, y_1, \dots, y_{N-1}$$
for which we want to find a set of points
$$c_0, c_1, \dots, c_{N-1}$$
such that
$$y_k = \sum_{j=0}^{N-1} c_j e^{2\pi i jk / N}. \tag{31.86}$$
We use complex exponentials $e^{i\theta}$ in lieu of a purely real number formulation because working over the complex numbers makes the algebra much easier.
If we consider Equation 31.86 we can form an $N \times N$ coefficient matrix $F$ and try to solve the linear system
$$Fc = y$$
for $c$.
If like me you find this a bit confusing I would recommend trying out the cases for some small values of $N$ and $k$.
In the general case we will use
$$y_k = \sum_{j=0}^{N-1} c_j\, \omega_N^{jk}, \quad \text{where } \omega_N = e^{2\pi i / N},$$
as the value for the $k$th equation. Note that $\omega_N$ is the principal $N$th root of unity, a fact that will come in very handy later. This means that the coefficient at $(k, j)$ in $F$ will be $\omega_N^{jk}$. For example the $N = 4$ matrix will look like
$$F_4 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & \omega_4 & \omega_4^2 & \omega_4^3 \\ 1 & \omega_4^2 & \omega_4^4 & \omega_4^6 \\ 1 & \omega_4^3 & \omega_4^6 & \omega_4^9 \end{pmatrix}.$$
Of course, on the face of it this isn’t very helpful because computing the solution to $Fc = y$ by Gaussian elimination takes $O(N^3)$ time, which isn’t great. It turns out that the inverse of $F$ is very easy to compute, and we can shave this down to $O(N^2)$, and then using a clever divide-and-conquer algorithm we will be able to compute the discrete Fourier transform in $O(N \log N)$, which is really fast!
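As a sketch (all names here are mine), we can build $F$ directly from $\omega_N$ and compute the transform as a naive matrix-vector product; this is exactly the $O(N^2)$ computation we will speed up in Section 31.2.7:

```python
import cmath

def fourier_matrix(n):
    """The N × N matrix F with entries F[k][j] = ω^(jk), where ω = e^(2πi/N)."""
    w = cmath.exp(2j * cmath.pi / n)
    return [[w ** (j * k) for j in range(n)] for k in range(n)]

def dft(x):
    """Naive O(N²) discrete Fourier transform: the matrix-vector product F x."""
    F = fourier_matrix(len(x))
    return [sum(row[j] * x[j] for j in range(len(x))) for row in F]

# A "spike" transforms into a perfectly flat spectrum: every output has size 1.
print([round(abs(v), 6) for v in dft([1, 0, 0, 0])])  # [1.0, 1.0, 1.0, 1.0]
```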
31.2.5 …and back again
It turns out that we can go back the other way using the matrix
$$F^{-1} = \frac{1}{N} \begin{pmatrix} 1 & 1 & \cdots & 1 \\ 1 & \omega_N^{-1} & \cdots & \omega_N^{-(N-1)} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & \omega_N^{-(N-1)} & \cdots & \omega_N^{-(N-1)(N-1)} \end{pmatrix},$$
i.e. the matrix with entries $(F^{-1})_{kj} = \frac{1}{N}\omega_N^{-jk}$, and we can compute $c = F^{-1}y$ in $O(N^2)$. This is actually a very slow algorithm, and in Section 31.2.7 it turns out that it is possible to cleverly compute this in $O(N \log N)$ time (which is very fast and essentially almost linear).
Theorem 31.2.1
$F^{-1}$ as given above is the inverse of $F$.
Proof: let $a, b$ be natural numbers with $0 \le a, b < N$, then
$$(F^{-1}F)_{ab} = \sum_{j=0}^{N-1} (F^{-1})_{aj} F_{jb} \tag{31.91}$$
$$= \sum_{j=0}^{N-1} \frac{1}{N}\, \omega_N^{-aj}\, \omega_N^{jb} \tag{31.92}$$
$$= \frac{1}{N} \sum_{j=0}^{N-1} \omega_N^{j(b-a)} \tag{31.93}$$
$$= \frac{1}{N} \sum_{j=0}^{N-1} \left( \omega_N^{b-a} \right)^j. \tag{31.94}$$
In the case where $a = b$ we therefore find that $\omega_N^{b-a} = \omega_N^0 = 1$ and thus
$$(F^{-1}F)_{ab} = \frac{1}{N} \sum_{j=0}^{N-1} 1 = 1. \tag{31.95}$$
Now we turn to the case where $a \neq b$; in this case
$$(F^{-1}F)_{ab} = \frac{1}{N} \sum_{j=0}^{N-1} \left( \omega_N^{b-a} \right)^j \tag{31.96}$$
$$= \frac{1}{N} \left( 1 + \omega_N^{b-a} + \omega_N^{2(b-a)} + \dots + \omega_N^{(N-1)(b-a)} \right) \tag{31.97}$$
$$= 0. \tag{31.98}$$
Note that the sum of the $N$th roots of unity is $0$, so we obtain $0$ here. A proof of this can be found in Theorem 14.8.1. Another nice proof uses the sum for the geometric series
$$\sum_{j=0}^{N-1} r^j = \frac{r^N - 1}{r - 1} \qquad (r \neq 1),$$
which we can apply to $r = \omega_N^{b-a}$ to get
$$\sum_{j=0}^{N-1} \left( \omega_N^{b-a} \right)^j = \frac{\left( \omega_N^{b-a} \right)^N - 1}{\omega_N^{b-a} - 1} = \frac{1 - 1}{\omega_N^{b-a} - 1} = 0.$$
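A quick numerical sanity check of Theorem 31.2.1, as a self-contained sketch (the size $N = 8$ and the tolerance are arbitrary choices of mine):

```python
import cmath

n = 8
w = cmath.exp(2j * cmath.pi / n)
F = [[w ** (j * k) for j in range(n)] for k in range(n)]
Finv = [[w ** (-j * k) / n for j in range(n)] for k in range(n)]

# (F⁻¹F)_{ab} should be 1 when a = b and 0 otherwise, up to floating-point error.
product = [[sum(Finv[a][j] * F[j][b] for j in range(n)) for b in range(n)]
           for a in range(n)]
assert all(abs(product[a][b] - (1 if a == b else 0)) < 1e-9
           for a in range(n) for b in range(n))
print("F⁻¹F = I, as claimed")
```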
31.2.6 The convolution theorem
Theorem 31.2.2
The convolution theorem
The convolution theorem tells us that in the “frequency domain” the (periodic) convolution becomes an element-wise multiplication, namely
$$F(a \circledast b) = (Fa) \odot (Fb),$$
where $\odot$ is the element-wise multiplication operator.
Proof
This is a slightly tedious proof, but in Section 31.2.10 there is an alternative proof using “circulant” matrices. Let us consider the $k$th entry of $F(a \circledast b)$, which we can write as
$$(F(a \circledast b))_k = \sum_{m=0}^{N-1} (a \circledast b)_m\, \omega_N^{mk} \tag{31.102}$$
$$= \sum_{m=0}^{N-1} \left( \sum_{i=0}^{N-1} a_i b_{(m-i) \bmod N} \right) \omega_N^{mk} \tag{31.103}$$
$$= \sum_{m=0}^{N-1} \sum_{i=0}^{N-1} a_i b_{(m-i) \bmod N}\, \omega_N^{mk}, \tag{31.104}$$
and now we can re-order the sums giving us
$$(F(a \circledast b))_k = \sum_{i=0}^{N-1} \sum_{m=0}^{N-1} a_i b_{(m-i) \bmod N}\, \omega_N^{mk} \tag{31.106}$$
$$= \sum_{i=0}^{N-1} a_i\, \omega_N^{ik} \sum_{m=0}^{N-1} b_{(m-i) \bmod N}\, \omega_N^{(m-i)k} \tag{31.107}$$
$$= \sum_{i=0}^{N-1} a_i\, \omega_N^{ik} \sum_{j=0}^{N-1} b_j\, \omega_N^{jk} \tag{31.108}$$
$$= (Fa)_k\, (Fb)_k \tag{31.109}$$
$$= ((Fa) \odot (Fb))_k, \tag{31.110}$$
where in (31.108) we substituted $j = (m - i) \bmod N$, using the fact that $\omega_N^{Nk} = 1$. This is not fun, but if you enjoy this kind of thing bully for you!
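Here is a numerical check of the theorem as a self-contained sketch (the helper names and example vectors are mine, restating the naive DFT and periodic convolution from earlier):

```python
import cmath

def periodic_convolve(a, b):
    """Periodic convolution: (a ⊛ b)_k = Σ_i a_i b_((k - i) mod N)."""
    n = len(a)
    return [sum(a[i] * b[(k - i) % n] for i in range(n)) for k in range(n)]

def dft(x):
    """Naive DFT: (Fx)_k = Σ_j x_j ω^(jk)."""
    n = len(x)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(x[j] * w ** (j * k) for j in range(n)) for k in range(n)]

a, b = [1, 2, 3, 4], [5, 6, 7, 8]
lhs = dft(periodic_convolve(a, b))                 # F(a ⊛ b)
rhs = [fa * fb for fa, fb in zip(dft(a), dft(b))]  # (Fa) ⊙ (Fb)
assert all(abs(x - y) < 1e-9 for x, y in zip(lhs, rhs))
print("F(a ⊛ b) = (Fa) ⊙ (Fb), as claimed")
```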
31.2.7 “Geht es besser?” (The Fast Fourier Transform)
This is a great German phrase which seems to be the basis (pun unintended) for a lot of discoveries in algorithms. It means “can we do better?”, and the question is usually asked rhetorically when introducing a faster solution to an algorithmic problem.
It turns out that we are carrying out a lot of extra work when computing the Fourier transform. Currently we are carrying out a matrix-vector multiplication to compute it, which we can do in $O(N^2)$ time (very slow). This section outlines the Cooley–Tukey FFT (Fast Fourier Transform) algorithm which achieves a much better time complexity of $O(N \log N)$.
Let us consider the $N$-dimensional case, where $N$ is a power of two. Here we want to compute $\hat{x} = Fx$ such that
$$\hat{x}_k = \sum_{j=0}^{N-1} x_j\, \omega_N^{jk}.$$
The matrix $F$ is written in terms of the principal $N$th root of unity $\omega_N$, which has the important attribute that for even $N$
$$\omega_N^2 = \omega_{N/2},$$
that is, squaring the principal $N$th root of unity gives the principal $(N/2)$th root of unity. We will use this when computing
$$\hat{x}_k = \sum_{j=0}^{N-1} x_j\, \omega_N^{jk}$$
by splitting the sum into its even-indexed and odd-indexed terms
$$\hat{x}_k = \sum_{j=0}^{N/2-1} x_{2j}\, \omega_N^{2jk} + \sum_{j=0}^{N/2-1} x_{2j+1}\, \omega_N^{(2j+1)k}.$$
The first part of the sum satisfies
$$\sum_{j=0}^{N/2-1} x_{2j}\, \omega_N^{2jk} = \sum_{j=0}^{N/2-1} x_{2j}\, \omega_{N/2}^{jk} = E_{k \bmod N/2},$$
where $E$ is the ($N/2$-dimensional) DFT of the even-indexed entries, and the second part of the sum satisfies
$$\sum_{j=0}^{N/2-1} x_{2j+1}\, \omega_N^{(2j+1)k} = \omega_N^k \sum_{j=0}^{N/2-1} x_{2j+1}\, \omega_{N/2}^{jk} = \omega_N^k\, O_{k \bmod N/2},$$
where $O$ is the DFT of the odd-indexed entries. Therefore we can write $\hat{x}_k$ in terms of two half-size DFTs:
$$\hat{x}_k = E_{k \bmod N/2} + \omega_N^k\, O_{k \bmod N/2}.$$
This algebra takes a moment to digest, so I would read it a few times to get the idea.
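The splitting above translates almost line for line into a recursive implementation. Here is a sketch (unoptimised, assuming the input length is a power of two; the names are mine), checked against the naive definition:

```python
import cmath

def fft(x):
    """Cooley–Tukey FFT. Computes x̂_k = E_(k mod N/2) + ω_N^k · O_(k mod N/2),
    where E and O are the half-size DFTs of the even- and odd-indexed entries.
    Assumes len(x) is a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])  # E: DFT of x_0, x_2, x_4, ...
    odd = fft(x[1::2])   # O: DFT of x_1, x_3, x_5, ...
    w = cmath.exp(2j * cmath.pi / n)
    return [even[k % (n // 2)] + w ** k * odd[k % (n // 2)] for k in range(n)]

# Check against the naive definition x̂_k = Σ_j x_j ω^(jk).
x = [1, 2, 3, 4, 5, 6, 7, 8]
n = len(x)
w = cmath.exp(2j * cmath.pi / n)
naive = [sum(x[j] * w ** (j * k) for j in range(n)) for k in range(n)]
assert all(abs(u - v) < 1e-9 for u, v in zip(fft(x), naive))
print("fft matches the naive O(N²) definition")
```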
We can now analyse the time complexity of this algorithmic approach. Let $T(N)$ denote the number of operations to compute the DFT for dimension $N$; then at each stage we

• compute the DFT of all the odd coefficients, which takes $T(N/2)$ time
• compute the DFT of all the even coefficients, which also takes $T(N/2)$ time
• combine the results by adding them together and doing a bit of multiplication, which we can do in $O(N)$ time,
therefore in aggregate our recurrence looks like
$$T(N) = \begin{cases} O(1) & N \le N_0 \\ 2\,T(N/2) + cN & N > N_0, \end{cases}$$
where $N_0$ is some cutoff and $c$ is a constant. Iterating the recurrence, we find that
$$T(N) = 2\,T(N/2) + cN \tag{31.120}$$
$$= 2\left( 2\,T(N/4) + c\frac{N}{2} \right) + cN \tag{31.121}$$
$$= 4\,T(N/4) + 2cN \tag{31.122}$$
$$= 8\,T(N/8) + 3cN \tag{31.123}$$
$$\;\;\vdots \tag{31.124}$$
$$= 2^k\, T(N/2^k) + kcN \tag{31.125}$$
$$= O(N \log N) \qquad (\text{taking } k = \log_2 N), \tag{31.126}$$
which is a very fast algorithm. Hence it is called the “Fast Fourier Transform”.
31.2.8 Filtering with Fourier
TODO
31.2.9 Circulant matrices
A circulant matrix is a matrix which has this structure
$$C = \begin{pmatrix} c_0 & c_{N-1} & c_{N-2} & \cdots & c_1 \\ c_1 & c_0 & c_{N-1} & \cdots & c_2 \\ c_2 & c_1 & c_0 & \cdots & c_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ c_{N-1} & c_{N-2} & c_{N-3} & \cdots & c_0 \end{pmatrix};$$
in words, we “rotate” the first row by one to obtain each of the subsequent rows.
Let us consider the eigenvalues of this special matrix. When we compute eigenvalues we want quite a lot of things to stay the same; we know that every row contains the same values (if in a different order) and addition is commutative, so if we just take all the values (i.e. use the vector $\mathbf{1} = (1, 1, \dots, 1)^T$) we will find that
$$C\mathbf{1} = \begin{pmatrix} c_0 + c_1 + \dots + c_{N-1} \\ c_0 + c_1 + \dots + c_{N-1} \\ \vdots \\ c_0 + c_1 + \dots + c_{N-1} \end{pmatrix};$$
if we let $\lambda = c_0 + c_1 + \dots + c_{N-1}$ then we can write
$$C\mathbf{1} = \lambda \mathbf{1},$$
and therefore $\lambda$ is an eigenvalue.
We can find the rest of the eigenvalues without too much trouble by thinking about rotations. We know that in general the $N$th roots of unity have the rather amazing property that for any $j$ we essentially rotate them when multiplying through by $\omega_N^j$
$$\omega_N^j \cdot 1 = \omega_N^j \tag{31.130}$$
$$\omega_N^j \cdot \omega_N = \omega_N^{j+1} \tag{31.131}$$
$$\omega_N^j \cdot \omega_N^2 = \omega_N^{j+2} \tag{31.132}$$
$$\;\;\vdots \tag{31.133}$$
$$\omega_N^j \cdot \omega_N^{N-1} = \omega_N^{j+N-1} = \omega_N^{j-1}; \tag{31.134}$$
this rotation hopefully feels quite a bit like a circulant matrix. In fact consider what happens when we multiply a circulant matrix by the vector $v_j = (1, \omega_N^j, \omega_N^{2j}, \dots, \omega_N^{(N-1)j})^T$ composed of the $N$th roots of unity;
$$C v_j = \begin{pmatrix} c_0 + c_{N-1}\omega_N^j + c_{N-2}\omega_N^{2j} + \dots + c_1\omega_N^{(N-1)j} \\ c_1 + c_0\omega_N^j + c_{N-1}\omega_N^{2j} + \dots + c_2\omega_N^{(N-1)j} \\ \vdots \\ c_{N-1} + c_{N-2}\omega_N^j + c_{N-3}\omega_N^{2j} + \dots + c_0\omega_N^{(N-1)j} \end{pmatrix}, \tag{31.145}$$
that is, the $k$th entry is
$$(C v_j)_k = \sum_{m=0}^{N-1} c_{(k-m) \bmod N}\, \omega_N^{mj}. \tag{31.151}$$
Although it looks like we are stuck in algebra soup, we are on the hunt for eigenvalues, so we must try to pull out our vector $v_j$. It turns out that because the roots of unity rotate so nicely, we are in luck and we find that
$$C v_j = \lambda_j v_j, \qquad \text{where } \lambda_j = \sum_{m=0}^{N-1} c_m\, \omega_N^{-mj},$$
provided that $0 \le j < N$. I am getting tired of typing so if you would like to verify this it is possible using Equation 31.130 and the above.
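If you would rather let a computer do the verification, here is a self-contained sketch (with the matrix laid out as in the structure above, so that $C_{km} = c_{(k-m) \bmod N}$; the example values are mine):

```python
import cmath

def circulant(c):
    """Circulant matrix with first column c: C[k][m] = c[(k - m) mod N],
    so each row is the row above rotated one step to the right."""
    n = len(c)
    return [[c[(k - m) % n] for m in range(n)] for k in range(n)]

n = 5
c = [2, 7, 1, 8, 3]          # arbitrary example values
C = circulant(c)
w = cmath.exp(2j * cmath.pi / n)

for j in range(n):
    v = [w ** (j * k) for k in range(n)]               # vector of roots of unity
    lam = sum(c[m] * w ** (-m * j) for m in range(n))  # claimed eigenvalue λ_j
    Cv = [sum(C[k][m] * v[m] for m in range(n)) for k in range(n)]
    assert all(abs(Cv[k] - lam * v[k]) < 1e-9 for k in range(n))
print("each v_j is an eigenvector of C with eigenvalue λ_j = Σ_m c_m ω^(-mj)")
```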