Robert Feinman said:
Multisampling has become popular with the new generation of scanners.
This is supposed to increase dynamic range and lower noise in the
densest part of the film. The mathematics of this, however, would
seem to indicate a very modest effect.
Let's assume we are scanning with a 16 bit output to maintain
"best" quality (another disputed point).
If we sample each image point twice we can effect the lowest bit
in the image. That is we might change it from a 0 to a 1 or the
reverse. If we oversample 4x we can effect the lowest 2 bits.
<pedant>
BTW - it is "affect" not "effect". The change you effect affects the
result. ;-)
Let's assume that we have calibrated our scanner so the darkest
values are around 30 and the lightest around 250. This leaves us
a little room for alteration in the image editor without clipping.
So changing the darkest 1 or 2 bits can potentially alter the 30 value
in a range of 28-32 or so. I doubt anyone will notice this in a final
print or online presentation.
If that was what multisampling did then you would be right, but it is
not what happens because your mathematics assume noise free scanning,
and noise is exactly what multisampling addresses.
There are many sources of noise on a scanner, including noise on the
supply voltages to the sensor and the analogue amplification stages,
noise on the reference voltages of the analogue to digital convertor,
noise on the supply driving the illumination source which causes the
brightness of the source to be noisy, noise on the timing circuit which
controls the exposure time of each CCD sample, and so on. So, when you
consider your scanner system scanning at 16-bit output you cannot assume
that the noise is only the quantisation noise, appearing in the least
significant bit, as you appear to have done above.
One of the dominant sources of noise, particularly for scanning negative
sources for reasons which will become obvious in a few moments, is
random noise in the arrival of photons from the light source, through
the emulsion and onto the CCD. Since the arrival of each photon is
totally unrelated to the arrival of any other (i.e. the photons do not
affect each other) then it turns out that the noise on the number of
photons arriving in a given time interval is just the square root of the
total number of photons. This is just the same statistical phenomenon
that governs flipping a coin. On average if you flip a coin 10 times
then you will get 5 heads and 5 tails, but as everyone knows it rarely
turns out as perfect as that. Sometimes you get 10 heads and no tails,
sometimes 3 heads and 7 tails. If you run 100 experiments to flip a
coin ten times and record the number of heads you get in each experiment
then the average will be pretty close to 5. You will also be able to
work out the average deviation from that value over all 100 experiments.
The "standard" method of computing this variation is to square the
difference between the number of heads in each experiment and the
average (which gives equal weighting to positive and negative
differences), sum these values and divide by one less than the number of
experiments (to account for the mean itself) and then calculate the
square root of the result (which returns the number to the same scale as
the original heads and tails). What you will then find is that this
noise scales with the square root of the number of coins being flipped
in each experiment: for a fair coin it is half the square root, so with
10 coins the average deviation from the mean is about 1.58. Because this
is the standard method of computing the average deviation from the mean,
it is known as the "standard deviation". So, from 100 experiments
flipping 10 coins, you can reasonably expect an average of 5 heads
+/- 1.58. (For rare, independent events such as photon arrivals the
factor of a half disappears and the noise is the full square root of
the count.)
Of course, there is a variability in this noise, so that if you now run
100 experiments each of 100 tests of flipping 10 coins then you can
determine how much that standard deviation varies in practice. However,
the more experiments you run, the closer to an average of 5 you will
get, and the closer the standard deviation, or noise, will be to 1.58.
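If you would rather not flip coins by hand, the experiment is easy to
simulate. What follows is only a minimal sketch in Python; the function
names and the fixed seed are my own choices for illustration. For a fair
coin the standard deviation of the number of heads is half the square
root of the number of coins (about 1.58 for 10 coins), so quadrupling
the number of coins should double the noise.

```python
import random
import statistics


def count_heads(n_coins, rng):
    """Flip n_coins fair coins and return how many came up heads."""
    return sum(rng.random() < 0.5 for _ in range(n_coins))


def run_experiments(n_coins, n_experiments, seed=1):
    """Return (mean, sample standard deviation) of the head counts."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = [count_heads(n_coins, rng) for _ in range(n_experiments)]
    # statistics.stdev divides by (n_experiments - 1) before taking the
    # square root, which is the "standard" method described in the text.
    return statistics.mean(heads), statistics.stdev(heads)


if __name__ == "__main__":
    for n_coins in (10, 40, 160):
        mean, sd = run_experiments(n_coins, 10_000)
        print(f"{n_coins:4d} coins: mean {mean:6.2f}, sd {sd:5.2f}, "
              f"sqrt(n)/2 = {n_coins ** 0.5 / 2:5.2f}")
```

The measured standard deviation should track the last column, and the
more experiments you run the closer it gets.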
So what has this all to do with scanners and multisampling?
Well, the photons which arrive at the CCD cell, which create the voltage
that the ADC turns into a data value for that pixel, have the same
statistics. Consequently, although the average number of photons
arriving at the CCD during each exposure period will be roughly the
same, it can vary by the square root of the total number of photons.
That means that if you have two adjacent pixels with exactly the same
density on the film, or you measure the same pixel twice using
multi-sampling, you can expect the number of photons detected by the CCD
to vary between those two measurements by, on average, the square root
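of the total number of photons measured in each sample. In addition,
since the signal is the number of photons and the noise is its square
root, the signal to noise ratio is also the square root of the number
of photons.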
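The same square root behaviour can be checked for photon counting with a
short simulation. This is a sketch under stated assumptions, not a model
of any real scanner: photon arrivals are taken to be Poisson distributed,
the counts are kept small so that a simple textbook sampling method
(Knuth's) works, and all names are hypothetical.

```python
import math
import random
import statistics


def poisson_sample(mean, rng):
    """Draw one Poisson-distributed count using Knuth's method.

    Only suitable for modest means, since exp(-mean) must not underflow.
    """
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def shot_noise_stats(mean_photons, n_samples=5000, seed=1):
    """Return (signal, noise, SNR) measured over repeated exposures."""
    rng = random.Random(seed)
    counts = [poisson_sample(mean_photons, rng) for _ in range(n_samples)]
    signal = statistics.mean(counts)
    noise = statistics.stdev(counts)
    return signal, noise, signal / noise


if __name__ == "__main__":
    for mean_photons in (25, 100, 400):
        signal, noise, snr = shot_noise_stats(mean_photons)
        print(f"mean {signal:6.1f} photons: noise {noise:5.2f}, "
              f"SNR {snr:5.2f}, sqrt(mean) = {math.sqrt(mean_photons):5.2f}")
```

Both the noise and the signal to noise ratio should come out close to
the square root of the mean count in each case.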
So, if you can detect more photons in the exposure period then you can
increase the signal to noise. However, this is where you run into a
problem. The CCD converts each photon to an electron, which it stores
on a little capacitor in the cell (actually the capacitor *is* the
photodetector in a CCD) and the more electrons stored on the capacitor,
the larger the voltage that is produced - until that voltage reaches a
limiting bias voltage when each additional electron is just spilled out
onto the adjacent cell or onto a special "anti-blooming" track on the
device. You can think of this as the CCD cell simply being a bucket
which is being filled with water from a stream. You can measure the
flow rate of the stream by allowing it to flow into the bucket for a
period of time, but once the bucket is full, it just overflows and you
cannot measure any more.
Typically, a linear CCD used in a commercial grade scanner will have a
storage capacity at each cell of around 100,000 electrons before it
saturates. Ignoring the fact that the quantum efficiency (how many
electrons are produced by each photon) is typically much less than one,
this corresponds to around 100,000 photons which can be detected in the
exposure period before the cell saturates. This in turn means that the
*best* signal to noise ratio that the CCD can produce is roughly the
square root of 100,000, which is around 316:1. You will note
immediately that this is *much* less than the dynamic range of the
16-bit data range in your scanned image - in fact, the maximum signal to
noise is equivalent to about 8.5 bits. That is because this is the
noise on the maximum signal - the highlights in a slide or the shadows
on a negative image.
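The arithmetic is short enough to check directly. In this sketch the
100,000 electron capacity is the figure quoted above and the function
names are mine; taking the base-2 logarithm of 316 gives a little over
8 bits, the same region as the rough 8.5-bit figure.

```python
import math


def full_well_snr(full_well_electrons):
    """Best-case (shot-noise-limited) SNR for a cell filled to capacity."""
    return math.sqrt(full_well_electrons)


def snr_in_bits(snr):
    """Express a signal to noise ratio as an equivalent number of bits."""
    return math.log2(snr)


if __name__ == "__main__":
    snr = full_well_snr(100_000)
    print(f"SNR {snr:.0f}:1, equivalent to about {snr_in_bits(snr):.1f} bits")
```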
Of course the CCD will detect far fewer photons from denser parts of
the emulsion and the noise will again be the square root of the number
of detected photons. If you are using 16-bit data then the scanner can
respond to photon arrival rates which are 1/65536th of the saturation
level, producing around 2 electrons in the cell, with an average noise
of around 1.4. Don't think this is too strange, having 1.4 electrons of
noise - it is no different from having a fractional number of heads of noise in the
experiment with the coins, even though each coin only has one head and
one tail. Of course, for such low levels of signal, other noise sources
such as some of those mentioned in the first paragraph above will
dominate this, and it is not unusual to have CCD readout noises of 10-20
electrons, which are added to the other sources. Again, this makes the
variation on a single scan pass much greater than the quantisation of
the 16-bit data range you are using, and this is what effectively
determines the Dmax of the scanner - not the number of bits in the ADC as
the marketing department of some manufacturers would have you believe.
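To put rough numbers on that, the sketch below compares the noise at the
bottom of the range with the size of one 16-bit step. It assumes that
the noise sources are independent, so that they add in quadrature, and
that the full well maps linearly onto the 65536 output levels; both are
simplifying assumptions of mine, and the names are hypothetical.

```python
import math

FULL_WELL = 100_000  # electrons at saturation, the figure used in the text
LEVELS = 2 ** 16     # number of steps in 16-bit output


def shadow_noise(signal_electrons, read_noise_electrons):
    """Return (shot noise, total noise) in electrons.

    Independent noise sources are assumed to add in quadrature.
    """
    shot = math.sqrt(signal_electrons)
    total = math.hypot(shot, read_noise_electrons)
    return shot, total


if __name__ == "__main__":
    one_step = FULL_WELL / LEVELS  # electrons per 16-bit step
    print(f"one 16-bit step is about {one_step:.2f} electrons")
    for read_noise in (10, 20):
        shot, total = shadow_noise(one_step, read_noise)
        print(f"read noise {read_noise} e-: shot {shot:.2f} e-, "
              f"total {total:.1f} e- = {total / one_step:.1f} steps")
```

With 10-20 electrons of readout noise the total comes out at several
16-bit steps, which is why the ADC bit depth on its own says little
about Dmax.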
Returning to the best signal to noise in a single sample though, this is
clearly limited by the storage capacity of the CCD. What you need are
bigger CCDs - and you will notice that professional digicameras do have
much bigger CCDs than consumer cameras, for just this same reason. As
already shown, for a single sample, the best signal to noise you can
expect is equivalent to around 8.5-bits. If, however, you take two
samples and add the data together then you have produced the equivalent
of a CCD with twice the storage capacity - and improved the signal to
noise ratio by around a factor of 1.4x. If you take four samples of the
same pixel then you have quadrupled the storage capacity and thus
effectively doubled the signal to noise - you now have the equivalent of
around 9.5-bits. 16 samples gives you the equivalent of around 2-bits
of noise reduction, or an SNR equivalent to 10.5-bits.
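The square root improvement is easy to demonstrate numerically. The
sketch below is illustrative only: it approximates the shot noise at
full well by a Gaussian of standard deviation sqrt(100,000), assumes the
noise is uncorrelated from sample to sample, and averages the samples
rather than adding them (which rescales signal and noise alike, so the
ratio is the same). The function names are hypothetical.

```python
import math
import random
import statistics


def multisample(true_value, noise_sd, n_samples, rng):
    """Average n_samples noisy readings of the same pixel."""
    readings = [rng.gauss(true_value, noise_sd) for _ in range(n_samples)]
    return sum(readings) / n_samples


def measured_noise(n_samples, true_value=100_000.0, trials=4000, seed=1):
    """Standard deviation of the multisampled result over many trials."""
    noise_sd = math.sqrt(true_value)  # shot noise at this signal level
    rng = random.Random(seed)
    results = [multisample(true_value, noise_sd, n_samples, rng)
               for _ in range(trials)]
    return statistics.stdev(results)


if __name__ == "__main__":
    base = measured_noise(1)
    for n in (1, 2, 4, 16):
        gain = base / measured_noise(n)
        print(f"{n:2d} samples: SNR gain {gain:4.2f}x "
              f"(sqrt(n) = {math.sqrt(n):4.2f}), "
              f"about {math.log2(gain):4.2f} extra bits")
```

The gain should follow sqrt(n) closely: roughly 1.4x for two samples,
2x for four and 4x, or two extra bits, for sixteen.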
Now, when you are scanning a slide the noise on the highlights isn't
really a problem - you naturally expect more noise there, because your
eyes do exactly the same thing as the scanner, and so you do not
perceive it. Adjacent, unsaturated, pixels from exactly the same
density of emulsion will vary by this noise, and it can be seen by
appropriate adjustment of the levels to pull detail out of the
highlights. More perceptible, however, is noise in the shadows because
your eyes expect a lower noise floor as the photon flux decreases.
However, as we have seen, in the shadows the signal to noise is worse
because of additional noise sources. Hopefully, these noise sources are
random and uncorrelated from sample to sample and, if they are, then
they will also be reduced in effect by the square root of the number of
multisamples used.
For scanning a negative, however, the shadows in the image are the
highlights in the negative. So the best SNR in the image is in the
shadows and is only 8.5-bits on a single scan - so multisampling really
makes a significant difference scanning negatives, both improving the
signal to noise and the dynamic range of the image by the square root of
the number of samples taken.
As you can see from the above, the effect is a lot more than just a few
bits in the entire range that your assessment suggested. Sampling the
same pixel twice does not limit the change to only the least significant
bit - that is just bad mathematics. Each sample is like flipping a coin
a number of times (about 100,000 times). On average, the difference
between samples is of the order of the square root of the number of flips, not just
+/-1, and that is essentially where your assessment went wrong.