(e-mail address removed) (James Kendall) wrote in message
Well, "Fourier Transform" means nothing to me yet, but there were a
couple of times when I'd have found pattern removal handy (instead, I
went with median/average filtering). Plus, I'm curious out of general
principle. Any references on the subject?
Quite literally, thousands if not millions. Just type "fourier" or
"fourier transform" into Google. Unfortunately, being a highly
mathematical process most of these references get into quite complex
mathematics very quickly, which I find unfortunate because it puts many
people off learning about the approach and its strengths and uses as
well as gaining an insight into what must rank as one of the most
beautifully simple mathematical techniques yet devised. This was a
problem I had when I first encountered them more years ago than I care
to remember and they were just complex mathematics with no obvious
purpose. So I learned the mathematics "by rote" well enough to get
through examinations. I always remember an optics lecturer telling us
that we would know we understood FTs when we could switch between
thinking of a problem in the normal domain and the fourier domain at
will - at the time I thought he must be mad! It was several years
before I recognised their value, one example being the situation above,
and in doing so realised that all the complex mathematics just obscured
the core operation. Soon after I found myself doing exactly what my
optics professor had predicted - and I hadn't realised I was doing it
until someone asked me how I had resolved a certain problem without
using the detailed maths at all! It really is very simple and elegant
after all the detailed mathematics has been stripped away, so here is a
perspective with the mathematics kept to a minimum.
The approach is named after Jean Baptiste Fourier, an incredibly adept
Frenchman. Fourier originally trained for the priesthood, but never
took his vows. He served in Napoleon's army as a scientific advisor
during its invasion of Egypt and became stranded there, along with the
rest of Napoleon's army, after the British under the command of Nelson
annihilated the French fleet in the Battle of the Nile. Returning to
France a year after Napoleon skulked back, Fourier settled to teach
mathematics and study the physics of heat flow. Amongst his many
achievements, Fourier also designed and supervised the construction of
the main highway between Grenoble and Turin - so his mathematics were
focussed on practical applications, which is often ignored when his
technique is taught these days.
The story all started 50 years before Fourier's debacle in Egypt when
the Swiss mathematician Daniel Bernoulli showed that the waveforms in a
vibrating string could be described as the sum of a series of sine and
cosine waves. Two years later another Swiss, Leonhard Euler - probably
the greatest mathematician of all time - built on Bernoulli's results
and proposed that *any* waveform could be represented as a sum of sine
waves. Nobody questioned Euler's theory, but nobody used it either
until 1807, when Fourier came along with a mathematical proof that it
was correct. However, the mathematics he used to do this was
revolutionary - so much so that the greatest mathematician of the time,
Lagrange, denounced the work of the young upstart as "impossible!".
He and Laplace, the second greatest mathematician of the day, even
conspired to prevent Fourier's paper being published! This wasn't just
professional jealousy - there is ample evidence that both of these
mathematical giants of their time simply did not understand the
completely original approach that Fourier had used, so different and
unique was it. These days though, it is a lot easier to
grasp because much of the language and concepts developed by Fourier
have found their way into common usage.
So there you have it, Fourier proved that any waveform could be
represented by the sum of a series of sine waves in the correct
proportion and phase to each other. *Every* waveform, whether it was a
simple mathematical function, the sound wave produced by a musical
instrument, waves on the surface of the ocean, radio waves - even
photographic images, which are simply two dimensional waveforms of
light. Fourier proved that all of these can be reduced to the sum of a
series of sine waves in their respective media. The process of
identifying these sine waves, their amplitudes and phases is called
Fourier Analysis, whilst the data of the amplitudes and phase for each
frequency in a given waveform is called a Fourier Transform. The plot
of such data as a function of frequency is just a frequency spectrum.
A slight confusion here is that the operation of computing the discrete
frequency components in sampled repetitive data, as opposed to analysing
the exact components in continuous free waveforms, is called a discrete
fourier transform. So the term Fourier Transform is often applied to
the operation as well as the result. There are special tricks that
permit the discrete fourier transform to run much faster on computers -
storing the results of functions that are repeated many times in lookup
tables and ordering the process in particular ways - and these are
called Fast Fourier Transforms, or FFTs. The most famous of these is
probably the Cooley-Tukey algorithm, devised by researchers at IBM and
Princeton and first published in 1965, although it turned out to be the
rediscovery of a long-lost procedure used by the German mathematician
and physicist Gauss about 160 years earlier. Computers are nothing new!
;-)
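If you want to see that the FFT is just a quick route to the same
answer, here is a rough sketch in Python with numpy - the direct sum
below is my own illustration rather than any particular textbook's, and
the record length is an arbitrary choice:

    import numpy as np

    def naive_dft(x):
        # Direct O(N^2) evaluation of the discrete fourier transform:
        # each output bin k is the sum of x[n] * exp(-2*pi*i*k*n/N).
        N = len(x)
        n = np.arange(N)
        return np.array([np.sum(x * np.exp(-2j * np.pi * k * n / N))
                         for k in range(N)])

    x = np.random.rand(256)                            # any old waveform
    assert np.allclose(naive_dft(x), np.fft.fft(x))    # same result

The FFT produces exactly the same numbers, it just gets there in of the
order of N.log(N) operations rather than N squared.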
To begin with though, it is easier to think of the operation of fourier
analysis in one dimension, such as an audio signal. So, for example,
take a pure audio sine wave. If you plot out the audio waveform in
terms of amplitude versus time, you have a sine wave that extends as far
as you can see to the left and right. If you plot the frequency
spectrum then you get a single spike at the frequency of the sine wave
and nothing else - no other frequencies are present. That single spike
spectrum is just the Fourier Transform of the sine wave. Easy or what?
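For anyone who wants to try it, here is a minimal sketch in Python with
numpy - the 1000 samples per second and the 50 Hz tone are arbitrary
choices purely for illustration:

    import numpy as np

    fs = 1000                            # samples per second (arbitrary)
    t = np.arange(fs) / fs               # one second of samples
    x = np.sin(2 * np.pi * 50 * t)       # a pure 50 Hz sine wave

    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1/fs)
    print(freqs[np.argmax(spectrum)])    # 50.0 - one spike, nothing else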
So when you look at the power spectrum of your audio amplifier or the
speakers in your hi-fi system, you are just looking at the fourier
transform of the waveforms as the hi-fi reproduces them. You can see
whether the hi-fi boosts the low frequencies of the bass more than the
high frequencies of the treble region or vice versa - very difficult to
see by looking at how the hi-fi affects the audio waveform itself, but
*much* more meaningful to how you hear the music!
I have glossed over a slight complication here, because the fourier
transform of the sine wave actually contains two spikes, one at a
positive frequency and one at a *negative* frequency, which might be a
little confusing at first. However, a neat property of the FT is that
any real waveform (ie. a waveform which can be described by real
numbers alone, not complex or imaginary numbers involving sqrt(-1)) has
an FT whose amplitude is perfectly symmetric about 0 on the frequency
axis. So all those strange negative frequencies are just reflections
of the positive frequencies and can often be ignored - but remember
they exist because
they are important in some cases, such as AM radio demodulation and the
analysis of aliasing.
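You can check that symmetry on any real waveform you like - a quick
sketch, again in Python with numpy, using a random waveform as a
stand-in for real data:

    import numpy as np

    x = np.random.rand(64)               # any purely real waveform
    X = np.fft.fft(x)

    # For a real waveform, each negative-frequency value is the complex
    # conjugate of the matching positive-frequency value, so the
    # amplitude spectrum is perfectly symmetric about zero frequency.
    assert np.allclose(X[1:], np.conj(X[1:][::-1]))
    assert np.allclose(np.abs(X[1:]), np.abs(X[1:][::-1]))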
Another fairly simple example would be a highly distorted audio signal,
a square wave. Fourier analysis shows that the square wave can be
broken down into the sum of a fundamental frequency and all of the odd
harmonics - that is, frequencies 3, 5, 7, 9... times the fundamental,
each added with an amplitude in inverse proportion to its harmonic
number. It is more difficult to describe in words than to write down,
so in mathematical terms (sorry, but this is the only equation!) the
square wave is proportional to:
sin(f) + sin(3f)/3 + sin(5f)/5 + sin(7f)/7 + .... and so on.
If you plot this spectrum then you get a spike at the fundamental
frequency, f, with a spike at 3f which is 1/3rd of the size of the
fundamental, then a spike at 5f which is 1/5th of the size of the
fundamental and so on, with zero at all the other frequencies in the
plot. This data is just the fourier transform of the square wave.
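Again, this is easy to check numerically - a small sketch with an
arbitrary 5-cycle square wave (the half-sample offset is just there to
avoid sampling exactly on the transitions):

    import numpy as np

    N = 1000
    t = (np.arange(N) + 0.5) / N
    square = np.sign(np.sin(2 * np.pi * 5 * t))   # a 5-cycle square wave

    amps = np.abs(np.fft.rfft(square)) * 2 / N    # one-sided amplitudes
    for k in (5, 10, 15, 25, 35):                 # bins at f, 2f, 3f, 5f, 7f
        print(k // 5, round(amps[k], 3))
    # the even harmonics come out essentially zero,
    # while the odd ones fall off as 1/n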
An audio signal is just a waveform in one axis, time. However the same
technique can be extended to two and more dimensions. An image is just
a two dimensional waveform of light, and has a similar two dimensional
fourier transform. I noticed yesterday's news of the death of Francis
Crick, joint Nobel prizewinner with James Watson for discovering the
structure of DNA, which was achieved by x-ray crystallography - a three
dimensional fourier analysis technique. So it is a very powerful
process.
It is important to realise that the Fourier Transform is not different
from the original waveform, it is just the same information sorted and
viewed in a different way. I like to compare the Fourier Transform to
those famous silhouettes of two faces looking towards each other. In
one view, you see white faces on a black background but blink and take a
different view and you see a black vase on a white background. It is
just the same information but you are prioritising the information and
sorting it differently to give you an alternate view of what is present
in the image. The waveform and its FT are just like that - the same
information viewed in a different way. You can change a waveform into
its frequency components by the Fourier Transform and you can change the
frequency components into the waveform by an almost identical operation
called an inverse fourier transform.
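In numpy terms the round trip is a one-liner - a trivial sketch, but it
makes the point that nothing is lost in either direction:

    import numpy as np

    x = np.random.rand(128)        # any waveform
    X = np.fft.fft(x)              # the same information, sorted by frequency
    x_back = np.fft.ifft(X).real   # the inverse transform gives it back
    assert np.allclose(x, x_back)  # nothing lost either way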
Consequently, it is no surprise that there are lots of relationships
that link the waveform and its FT together - change one and the other
must also change, perform a certain operation on one and a matching
operation is performed on the other. Almost all of these relationships
are symmetric - which is one of the beautiful things about fourier
transforms. So if an operation applied to the waveform results in a
matching operation applied to the FT then the application of that same
operation to the FT results in the same matching operation being applied
to the waveform.
I have already mentioned one of these relationships - the symmetry rule.
If a waveform has no imaginary components (ie. it is a real waveform)
then the amplitude spectrum is symmetric around zero frequency. The
corollary of this is that if the spectrum is purely real then the
waveform is symmetric about zero in time (or space). Symmetry within
symmetry!
It follows that an asymmetric waveform (and how many of our images are
perfect mirror reflections along the vertical and horizontal axes?) has
a spectrum which must contain complex numbers. These complex numbers
are simply a mathematical description of the amplitude and phase of the
sine wave at any frequency. The amplitude is just the absolute value of
the complex number (not the real part) and the phase is the argument,
which is the arc-tangent of the ratio of the imaginary to real part. So
moving a waveform, or image, in any direction simply changes the
proportion of real and imaginary parts in the spectrum, the phase of the
frequencies, whilst maintaining their amplitude.
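This is easy to demonstrate as well - a short sketch with an arbitrary
waveform and an arbitrary shift of 10 samples:

    import numpy as np

    N = 128
    x = np.random.rand(N)
    shifted = np.roll(x, 10)       # the same waveform moved along its axis

    X, S = np.fft.fft(x), np.fft.fft(shifted)

    # Moving the waveform leaves the amplitude of every frequency
    # untouched...
    assert np.allclose(np.abs(X), np.abs(S))
    # ...but rotates each phase in proportion to its frequency and
    # the size of the shift.
    k = np.arange(N)
    assert np.allclose(S, X * np.exp(-2j * np.pi * k * 10 / N))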
Another relationship, which is quite simple to grasp, is the principle
of truncation. Remember the audio sine wave that was plotted out
against time and extended as far as you could see to the left and right?
That means it extended to infinity - the sound existed for all time.
However the spectrum was quite finite - just a single frequency was
present. This is a general principle - a waveform of finite extent has
a fourier transform that extends to infinitely high frequencies.
Similarly, a finite spectrum corresponds to a waveform that exists over
infinite time (or space). This can cause problems, and it certainly
determines the size of FT necessary to approximate the correct result
efficiently, but generally speaking waveforms that are infinite are
repetitive, or we don't much care if they aren't, so they can be
replaced by finite waveforms that are assumed to repeat - which is the
underlying principle of the discrete fourier transform.
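The difference between a waveform that repeats cleanly within the
record and one that is simply chopped off shows up immediately in a
sketch like this (the 50 and 50.5 cycles are arbitrary choices):

    import numpy as np

    N = 1000
    t = np.arange(N) / N

    # A sine with a whole number of cycles repeats cleanly, so its DFT
    # is a single spike; chop a sine off mid-cycle and the finite
    # waveform smears energy into every frequency bin.
    clean = np.abs(np.fft.rfft(np.sin(2 * np.pi * 50 * t)))
    chopped = np.abs(np.fft.rfft(np.sin(2 * np.pi * 50.5 * t)))

    print(np.count_nonzero(clean > 1e-6), np.count_nonzero(chopped > 1e-6))
    # roughly: 1 bin versus all of them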
The general rule is that very fine details in the waveform correspond
to very high frequencies in the fourier transform. That is why some
operations are more easily implemented in one domain than the other -
some functions that are large in the original domain have compact
fourier transforms, whilst others with large fourier transforms are
compact in the original waveform. Whether it is worth paying the
computation time of the FT is simply a balance between processing a
small function in one domain and a large one in the other.
This is important when it comes to sampling waveforms, including images.
The image is finite, so its spatial frequency spectrum is, by
consequence, infinite. However, another relationship is the Convolution
Theorem that I mentioned a couple of posts back. This basically says
that if you multiply two waveforms together point by point, then the
result has the same fourier transform as you would get by convolving
the fourier transforms of the two waveforms. The converse is also true.
Convolving is just a mathematical term for blurring one waveform by
another: at every possible position of one waveform relative to the
other, you multiply each point in one waveform by the corresponding
point in the other and add up all of those products to get the result
for that position. It sounds complex, but it is just a running sum of
products as one waveform slides past the other.
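The convolution theorem itself is easy to verify numerically - a sketch
with two arbitrary random waveforms, zero-padded so that the circular
wrap-around of the DFT doesn't get in the way:

    import numpy as np

    a = np.random.rand(100)
    b = np.random.rand(100)

    # Sliding one waveform past the other and summing the products...
    direct = np.convolve(a, b)

    # ...matches multiplying the fourier transforms point by point,
    # provided both are padded out to the length of the full result.
    n = len(a) + len(b) - 1
    via_ft = np.fft.ifft(np.fft.fft(a, n) * np.fft.fft(b, n)).real

    assert np.allclose(direct, via_ft)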
When you sample the image with a CCD in a scanner or a digital camera
you implement a couple of processes. There are lots of different ways
of viewing this process, but they all reduce to the same thing, and the
following is the easiest to visualise in my opinion. The first process
is that you average (blur) the image at each point by the area of the
CCD element. This produces what I call a pixel response map. Then you
actually sample the pixel response map by producing an output signal for
just certain specific points where the CCD elements are actually
centred.
That first process, the averaging or blurring by the pixel area is
convolution - so the effect on the spatial frequency spectrum of the
image is... to multiply it with the fourier transform of the pixel's
response. Since this is nominally uniform across the area of the pixel,
and the pixel is finite, the fourier transform extends to infinity;
however, the high frequency components are very much smaller than the
low frequency ones. The outcome is that the spatial frequency spectrum of
the image detected by each pixel contains very much less high frequency
content than the original image might have. There are standard FT
results that help to visualise what is happening here. For example, if
the pixel is perfectly square with a uniform response across its area
then it turns out that the FT is a sinc function, which is
sin(pi.a.f)/(pi.a.f), where a is the width of the pixel and f is the
spatial frequency. A bigger pixel means less high frequency information
is reproduced.
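You can see that sinc shape drop straight out of the numbers - a one
dimensional sketch with an arbitrary pixel 64 samples wide on a 4096
sample line standing in for the real sensor:

    import numpy as np

    N, a = 4096, 64                # line length and pixel width (arbitrary)
    pixel = np.zeros(N)
    pixel[:a] = 1.0 / a            # uniform response across the pixel area

    spectrum = np.abs(np.fft.rfft(pixel))
    f = np.fft.rfftfreq(N)         # spatial frequency, cycles per sample

    # The magnitude follows |sin(pi*a*f)/(pi*a*f)| - the sinc function -
    # with its first zero at f = 1/a and ever smaller lobes extending
    # out to the highest frequencies.
    assert np.allclose(spectrum, np.abs(np.sinc(a * f)), atol=1e-2)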
The second step in the sampling process is just to output the signal for
each pixel position. This is the same as multiplying the blurred image
of the pixel response map with an array of infinitely narrow spikes,
known mathematically as delta functions. The effect of this
multiplication on the spectrum is of course... convolution. A special
property of the array of delta functions is that they have an FT which
is another array of delta functions. Whilst the spikes are separated in
the original CCD by the pixel pitch, they are separated in the FT by the
sampling frequency. As with everything else relating waveforms and
transforms, a smaller pitch means a bigger sampling frequency.
So now we have a convolution of the fourier transform of the pixel
response map with a series of spikes which are separated by the sampling
frequency. That effectively means the FT of the pixel map is replicated
at every delta function with the zero frequency point mapped to the
spatial frequency of the spike. This is quite a complex thing to
visualise if you have never tried it before, so it might help to
consider just two of these spikes, the one at the origin of the spatial
frequency scale and one at the sampling frequency. So we have two pixel
response map spectra being overlaid on top of each other, offset by the
sampling frequency.
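If you prefer to see that replication happen rather than picture it,
here is a sketch - the smoothed random waveform stands in for the pixel
response map, and the factor of 4 between sample positions is an
arbitrary choice:

    import numpy as np

    N, M = 1024, 4                 # record length and sample spacing
    x = np.convolve(np.random.rand(N), np.ones(32) / 32, mode='same')

    comb = np.zeros(N)
    comb[::M] = 1.0                # delta functions at every M-th position
    sampled = x * comb             # sampling = multiplying by the comb

    X, S = np.fft.fft(x), np.fft.fft(sampled)

    # The spectrum of the sampled waveform is the original spectrum
    # replicated at every multiple of the sampling frequency
    # (N/M bins apart), scaled by 1/M.
    replicas = sum(np.roll(X, m * N // M) for m in range(M)) / M
    assert np.allclose(S, replicas)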
Now, remember that first relationship of FTs, the symmetry rule? Well,
the original image was real, and the CCD elements were real, so the
pixel response map is also real - which means that the spatial frequency
spectrum of the pixel response map is... symmetric! Remember those
negative frequencies that I mentioned we had to consider - this is one
case where their existence is important.
So, now what you have is two symmetric spectra, overlaid and offset
relative to each other by the sampling frequency. That means that if
the spectrum of the pixel response map contains any spatial frequencies
which are greater than half of the sampling frequency then they will mix
with the symmetric negative spatial frequencies of the spectrum that has
been offset by the sampling frequency. Similarly, those large negative
spatial frequencies will mix with the positive spatial frequencies of
the spectrum at the origin. And this mixing occurs identically at every
delta function in the fourier transform. Furthermore, the higher the
spatial frequency, the lower the frequency at which it appears in the
adjacent spectrum. This
is the classical proof of Nyquist's Theorem that the sampled system can
unambiguously resolve frequencies up to half of the sampling frequency.
Fairly obviously, half the sampling frequency is the point at which each
spectrum starts to overlap the adjacent one. It also shows that even
the ideal CCD array with a flat response across each pixel area and no
dead area between adjacent pixels, will always alias an otherwise
unfiltered image - because the spatial frequency response of the pixel
is still significant beyond half the sampling frequency.
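The mixing itself can be shown in a couple of lines - a sketch with an
arbitrary sampling rate of 100 samples per unit length:

    import numpy as np

    fs = 100                       # samples per unit length (arbitrary)
    t = np.arange(fs) / fs

    # A 30-cycle pattern and a 70-cycle pattern give *identical* samples,
    # because 70 is above half the sampling frequency and folds back
    # down to 100 - 70 = 30 cycles.
    low = np.cos(2 * np.pi * 30 * t)
    high = np.cos(2 * np.pi * 70 * t)
    assert np.allclose(low, high)
    # once sampled, nothing can tell the aliased detail from real content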
Finally, coming back onto the topic, it also shows that if the dots in a
halftone screen are not sampled with sufficient resolution to have at
least two samples for each dot pitch then they will give rise to
frequencies in the spatial spectrum of the pixel image map which are
greater than half the sampling frequency and will consequently mix
inseparably with frequencies of the image content - something we all
know and recognise as aliasing and moiré. It is obvious, once you
follow the fourier processes, that there is absolutely no way to remove
this defect without also removing significant image content. It is also
clear, however, that if the dots are sampled with sufficient density
(note they need not be resolved in the strict sense) then they can
simply be removed by masking out the high spatial frequencies in the
fourier transform of the image - those corresponding to greater than
half the spatial frequency of the dots themselves, which are simply
another sampling step in the half tone process. This uniform mask in
the spatial frequency spectrum is a multiplication, so the same process
can be achieved by convolving (ie. blurring) the original image with an
appropriate filter kernel, and in most cases an adequately sized blur
will do the job just as well.
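As a final illustration, here is a one dimensional sketch of that
descreening idea - the "image" is a couple of low spatial frequencies,
the "halftone screen" an arbitrary 200-cycle square pattern, and the
whole thing is sampled densely enough that the screen is not badly
aliased:

    import numpy as np

    N = 2048
    x = np.arange(N) / N
    content = np.sin(2 * np.pi * 3 * x) + 0.5 * np.sin(2 * np.pi * 11 * x)
    screen = 0.3 * np.sign(np.sin(2 * np.pi * 200 * x))   # the "dots"
    scan = content + screen        # dense sampling, dots not badly aliased

    # Mask out everything above half the screen frequency in the fourier
    # transform and invert it...
    S = np.fft.rfft(scan)
    f = np.fft.rfftfreq(N, d=1/N)  # cycles across the record
    S[f > 100] = 0
    descreened = np.fft.irfft(S, N)

    # ...and the dots all but vanish while the image content survives.
    print(np.max(np.abs(descreened - content)))
    # a small residual compared with the 0.3 amplitude of the screen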