A "slanted edge" analysis program

  • Thread starter: Lorenzo J. Lucchini

Lorenzo J. Lucchini

Well, my Imatest trial is up, and I'm not going to buy it (though it is
a very nice program that I'd recommend!), but I'd still like to play
with ESF, PSF, MTF and all that.
Specifically, I'm still very interested in finding out what the "best
sharpening" for my scanner is by analysing the edge spread function.

There is another interesting "slanted edge tester" at
http://www.i3a.org/downloads_iso_tools.html (if I'm not mistaken,
Imatest is actually based on its code), but it doesn't give nearly as
much information as Imatest does... even though perhaps the MTF alone
could be turned into the correct deconvolution for my scanner in the
hands of "some of the people here"! ;-)


But. In the last few days I've been busy writing my own "slanted edge"
program. It now sort of works, meaning that it doesn't (often) crash,
and that it gives results that are similar to what Imatest or SFRWin give.
Similar, not really identical... that's why I would be glad if Bart,
Kennedy, or anybody who's into this kind of thing could help me a little.


The program can be found at
http://ljl.150m.com/slantededge-alpha1.tar.gz

and I've created a SourceForge project at
http://sourceforge.net/projects/slantededge/

with the source code in CVS.


I'll explain briefly what the program does and in what order it does it,
as the source is not very well commented, er, at the moment.

1) Read in an RGB image
2) We work with vertical edges, so rotate the image if needed
3) Besides the "red", "blue" and "green" channels, create a "gray"
channel that is the average of the other three
4) Normalize the image, so that 0.5% of the pixels clip down, and 0.5%
clip up
5) For each line in the image, find the point with the max adjacent
pixel difference (should be the edge)
6) By doing a least squares regression on those points, find a straight
line that ought to best represent the edge
7) Associate with each pixel in the image its distance from that line. The
function PixelValue(Distance) approximates the edge spread function.
8) Use "some kind" of local regression to create a uniformly-spaced
version of the ESF, from the data described above.
9) Derive the line spread function from the edge spread function:
LSF(i)=ESF(i+1)-ESF(i)
10) Apply a Hanning window to the LSF
11) Take the discrete Fourier transform of the resulting data (a C sketch
of steps 9 and 10 follows below)
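
In rough C, steps 9 and 10 come down to something like this (a simplified
sketch, not the actual code from the tarball; esf[] and n stand for whatever
the resampling step produces):

#include <math.h>

/* Sketch of steps 9 and 10: derive the LSF from the uniformly resampled
   ESF and apply a Hanning window to it. lsf[] must hold n-1 values. */
void esf_to_windowed_lsf(const double *esf, int n, double *lsf)
{
    int i;
    for (i = 0; i < n - 1; i++) {
        double w = 0.5 * (1.0 - cos(2.0 * 3.14159265358979 * i / (n - 2)));
        lsf[i] = w * (esf[i + 1] - esf[i]);  /* step 9 (derivative) + step 10 (window) */
    }
}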


Note that, at the moment, the input image must be an RGB, 8-bit ASCII
("plain") PPM file. These can be created using "*topnm" and
"pnmtoplainpnm" from the NetPBM suite, or by using the GIMP.
Type --help for some (not very comprehensive) help.


I have a lot of doubts and questions to ask, but first I'd like to get
an overall look at the program by someone competent, to find out what I
have done *badly* wrong (something awful is bound to be there).

Please keep in mind that I knew nothing about regressions, spread
functions or Fourier transforms two weeks ago -- and I suppose I don't
know that much more now.
I just read some Internet sources and implemented what I thought they meant.


by LjL
(e-mail address removed)
 
Lorenzo J. Lucchini said:
Lorenzo said:

Since http://ljl.150m.com appears to be down,

I was wondering about that, so it wasn't my setup after all ;-)

Although the source may be helpful, I'm running on a Windows XP
platform. I don't mind looking into the project, but I have too little
time to compile it myself (I know compilation itself doesn't take
long, but I'd have to install a compiler first). A pre-compiled Win32
version might help.
SourceForge's "web" CVS will also take some hours to update, even
though the files are really there.

Didn't find them yesterday, they are available now.

Bart
 
Bart said:
Lorenzo said:

Since http://ljl.150m.com appears to be down,

I was wondering about that, so it wasn't my setup after all ;-)

You know, it's free hosting... I'm starting to worry, though, it's been
down for quite a while now.

Hope this one stays up! Anyway, I guess I'll make a release on
SourceForge after I've fixed the really bad bugs.
Although the source may be helpful, I'm running on a Windows XP platform.
I don't mind looking into the project, but I have too little time to
compile it myself (I know compilation itself doesn't take long, but I'd
have to install a compiler first). A pre-compiled Win32 version might help.

I know. The main problem is that I'm using the FFTW library for taking
Fourier transforms, and while there seems to be a Windows version
available (several, actually), well... the site is down. The web doesn't
like me apparently.

I'll see what I can do, perhaps I can just write a slow, simple DFT
myself (I have no idea how difficult it is, I'll have to read a bit) as
a compile alternative to FFTW.


Anyway, do you have a SourceForge account? With one, I can just add you
to the project, and then you'll be able to access SF's shell servers.
This way you could easily run - and even easily compile - the program
remotely, without installing anything special on your computer.


Now some of the most important questions I have:

- What is Imatest's "10%" and "90%"? Initially, I took these as the
minimum and maximum pixel values that can constitute the "edge". But it
appears clear that the ESF shown also contains lower and higher values;
besides, it always seems to go from -6 pixels to +8 pixels from the edge
center. Is there a reason for this?
So, now I suppose "10%" and "90%" are simply used to compute (guess
what) the 10%-90% rise.
Which in turn calls for: should I normalize the image before doing
anything else? I currently normalize so that 0.5% of pixels clip to
black and 0.5% clip to white.

- Is it ok to take a (single) Fourier transform of the Hanning-windowed
LSF? Without windowing, I get weird results, but with windowing, I'm
afraid I'm affecting the data. My MTFs always look "smoother" than
Imatest's and SFRWin's, and too high in the lower frequencies.

- How many samples should my ESF/LSF have? I understand that it only
depends on how high you want your frequencies to be -- i.e., if I want
to show the MTF up to 4xNyquist, I should have 4x more samples than
there are real pixels. Is this correct?

- How do I reduce the frequency spacing in the DFT? If I just transform
the LSF (or a Hanning'ed LSF), I get ridiculously coarse frequency
resolution. What I'm doing now to overcome this is... adding a lot of
zeroes at the end of the original LSF. But, somehow, I think this is
kinda stupid.

- The method I'm currently using for getting a smooth,
uniformly-spaced sampled ESF from the points I have is naive and very
slow. The sources I've read suggest using "LOESS curve fitting" for
this. I've had some trouble finding good references about this, and it
seems very complicated anyway. The question is: is something simpler
good enough?


by LjL
(e-mail address removed)
 
Lorenzo J. Lucchini said:
Well, my Imatest trial is up, and I'm not going to buy it (though it
is a very nice program that I'd recommend!), but I'd still like to
play with ESF, PSF, MTF and all that.

Yes, once you really grasp what you are actually looking at, it will
help with improving the results, e.g. by allowing you to optimize capture
sharpening.
Specifically, I'm still very interested in finding out what the
"best sharpening" for my scanner is by analysing the edge spread
function.

Exactly. There are several possibilities to use the ESF to check for
that (e.g. edge halo), and the PSF can be used as input in several
deconvolution programs. The MTF allows systems to be compared in an
unambiguous way (assuming test conditions are under control).
There is another interesting "slanted edge tester" at
http://www.i3a.org/downloads_iso_tools.html (if I'm not mistaken,
Imatest is actually based on its code),

Not really based on it, but it helped as an inspiration and discussion
tool with, amongst others, the programmers of the original ISO demo
sourcecode. Imatest uses the basic idea of the slanted edge target, as
it is described in the relevant ISO standards 16067-1&2 for scanners,
and 12233 for digital still cameras.

but it doesn't give nearly as much information as Imatest does...
even though perhaps the MTF alone could be turned into the correct
deconvolution for my scanner in the hands of "some of the people
here"! ;-)

The MTF is more suited for judging the effect of such a sharpening on
the various spatial frequencies, and the associated risk of aliasing
artifacts when down-sampling afterwards. The PSF is needed for
deconvolution (or for the creation of a high-pass convolution kernel).
But. In the last few days I've been busy writing my own "slanted
edge" program. It now sort of works, meaning that it doesn't (often)
crash, and that it gives results that are similar to what Imatest or
SFRWin give.
Similar, not really identical... that's why I would be glad if Bart,

Reporting for duty ;-)
Kennedy, or anybody who's into this kind of thing could help me a
little.

I'll add some program related comments in a separate message.

Bart
 
Lorenzo said:
Bart said:

I know. The main problem is that I'm using the FFTW library for taking
Fourier transforms, and while there seems to be a Windows version
available (several, actually), well... the site is down. The web doesn't
like me apparently.

Fortunately, I see that CygWin comes with FFTW. I guess it will be easy
enough then.

by LjL
(e-mail address removed)
 
SNIP
and I've created a SourceForge project at
http://sourceforge.net/projects/slantededge/

with the source code in CVS.


I'll explain briefly what the program does and in what order it does
it, as the source is not very well commented, er, at the moment.


I've added a few comments, hopefully it will help to better understand
the reasons behind the steps.
1) Read in an RGB image
2) We work with vertical edges, so rotate the image if needed

This is only done to simplify the calculations. All that's required is
to get the image data in a uniform orientation, so all subsequent
subroutines will know what to expect (which avoids rechecking
assumptions).
3) Besides the "red", "blue" and "green" channels, create a "gray"
channel that is the average of the other three

The image code values should be linearized at this stage, so
film/sensor non-linearity and gamma adjustments can't influence the
calculations.

It is customary to use a different weighting than the (R+G+B)/3
average. The ISO suggests the calculation of a luminance channel, so a
ratio of approx. 3R:6G:1B is closer to what we experience as
luminance.
4) Normalize the image, so that 0.5% of the pixels clip down, and
0.5% clip up

I think that, especially on non-linear image codes, this will
influence the MTF results, because the contrast is expanded. On a
perfectly symmetrical brightness distribution its effect will be
small, but the possibility of clipping in later stages should be
avoided. Also a check for at least 20% edge modulation should be made,
in order to avoid too low an input S/N ratio.

It is however perfectly normal to normalize the ESF output to a range
between 0.0 and 1.0, and later to normalize the SFR/MTF to 1.0 (100%)
at zero spatial frequency.
5) For each line in the image, find the point with the max adjacent
pixel difference (should be the edge)

Not necessarily, that is just the maximum gradient and that need not
be the same as the edge.
The ISO suggests combining this with your step 9, and determining
the centroid of the LSF (by calculating the discrete derivative of the
ESF). The centroids can be used for regression.
The derivative suggested by the ISO is:
"for each line of pixels perpendicular to the edge, the edge is
differentiated using the discrete derivative "-0,5 ; +0,5", meaning
that the derivative value for pixel "X" is equal to -1/2 times the
value of the pixel immediately to the left, plus 1/2 times the value
of the pixel to the right".
They then specify something different in their centroid formula, but
perhaps they changed that in the official standard.
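
In code that amounts to roughly the following (my sketch of the centroid
calculation as I read it, not the standard's own code):

/* Sketch: centroid of one row's LSF, with the LSF taken as the ISO
   "-0.5 ; +0.5" discrete derivative of that row's ESF. row[] holds the
   pixel values of a single line crossing the edge, w is its width. */
double row_centroid(const double *row, int w)
{
    double sum = 0.0, weighted = 0.0;
    int x;

    for (x = 1; x < w - 1; x++) {
        double d = -0.5 * row[x - 1] + 0.5 * row[x + 1];  /* LSF sample at x */
        sum      += d;
        weighted += d * x;
    }
    return (sum != 0.0) ? weighted / sum : 0.5 * w;       /* centre of gravity */
}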

There is another calculation method possible. That calculation is done
in the orthogonal direction, so almost along the edge instead of
across the edge.
6) By doing a least squares regression on those points, find a
straight line that ought to best represent the edge
7) Associate a distance from that line to each pixel in the image.

The ISO method shifts each row of the ESF by the calculated amount
from the regression, but uses quarter pixel bins. This produces a 4x
oversampling per pixel position.
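
Schematically the binning works out to something like this (just a sketch;
edge_x(y), img[][] and the array sizes are placeholders, not code from
either program):

#include <math.h>

#define BINS_PER_PIXEL 4

/* Sketch of the ISO-style 4x oversampled ESF: each pixel goes into a bin
   according to its distance from the fitted edge position of its own row,
   four bins per pixel. edge_x(y) stands for the regression result. */
void build_esf(double **img, int width, int height, double (*edge_x)(int),
               double *esf, long *count, int nbins)
{
    int x, y;

    for (y = 0; y < height; y++)
        for (x = 0; x < width; x++) {
            double dist = x - edge_x(y);               /* distance from the edge */
            int bin = (int)floor(dist * BINS_PER_PIXEL) + nbins / 2;
            if (bin >= 0 && bin < nbins) {
                esf[bin]   += img[y][x];
                count[bin] += 1;
            }
        }

    for (x = 0; x < nbins; x++)                        /* average each bin */
        if (count[x] > 0)
            esf[x] /= count[x];
}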
The function PixelValue(Distance) approximates the edge spread
function.
8) Use "some kind" of local regression to create a uniformly-spaced
version of the ESF, from the data described above.
9) Derive the line spread function from the edge spread function:
LSF(i)=ESF(i+1)-ESF(i)

See earlier remark, and provisions need to be made to detect multiple
maxima (caused by noise/graininess).
10) Apply a Hanning window to the LSF

That is needed to reduce noise and the discontinuity at the borders of
the LSF.
11) Take the discrete Fourier transform of the resulting data

And take the Modulus, and normalize.
Note that, at the moment, the input image must be an RGB, 8-bit
ASCII ("plain") PPM file. These can be created using "*topnm" and
"pnmtoplainpnm" from the NetPBM suite, or by using the GIMP.
Type --help for some (not very comprehensive) help.

;-) Room for improvement ...
I have a lot of doubts and questions to ask, but first I'd like to
get an overall look at the program by someone competent, to find out
what I have done *badly* wrong (something awful is bound to be
there).

I'll have to study the source code before I can comment.
Please keep in mind that I knew nothing about regressions, spread
functions or Fourier transforms two weeks ago -- and I suppose I
don't know that much more now.

Isn't that the fun of programming, it forces you to describe the
principles in detail. Learning new stuff is inevitable.
I just read some Internet source and implemented what I thought they
meant.

With the program (to be) available to the public, the chance of
helping hands increases. The source code you have probably read will
give a good start. As always, the devil is in the details, but that
can be overcome.

Bart
 
SNIP
I'll see what I can do, perhaps I can just write a slow, simple DFT
myself (I have no idea how difficult it is, I'll have to read a bit)
as a compile alternative to FFTW.

http://astronomy.swin.edu.au/~pbourke/other/dft/ and
http://www.library.cornell.edu/nr/bookcpdf/c12-2.pdf have C code for a
DFT function.
http://www.library.cornell.edu/nr/cbookcpdf.html chapter 12 has more
background on Fourier transforms, and chapter 13.1 has routines and
background on Deconvolution (although there are better functions for
image restoration).
Anyway, do you have a SourceForge account? With one, I can just add
you to the project, and then you'll be able to access SF's shell
servers. This way you could easily run - and even easily compile -
the program remotely, without installing anything special on your
computer.

No, I don't have an SF account.
Now some of the most important questions I have:

- What is Imatest's "10%" and "90%"? Initially, I took these as the
minimum and maximum pixel values that can constitute the "edge".

No, there are several ways to quantify an "infinite" function in a
single number. On an Edge Response Function it is a common procedure
to choose the width between the 10% and 90% response points. On a
Gaussian type of function, a Full Width at Half of Maximum is often
used.
But it appears clear that the ESF shown also contains lower and
higher values; besides, it always seems to go from -6 pixels to +8
pixels from the edge center. Is there a reason for this?

I'm not sure about the exact reason for the choice, but I assume it
has to do with the shape of some ESFs that Norman Koren encountered
when he developed the program. The actual data in Imatest is recorded
from -6 to +10 at 0.25 intervals.
So, now I suppose "10%" and "90%" are simply used to compute (guess
what) the 10%-90% rise.

Actually the width in pixels between the two response points.
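
Something along these lines, roughly (a sketch; it just looks for the 10%
and 90% crossings of a rising, normalized ESF and interpolates between
samples):

/* Sketch: width (in pixels) between the 10% and 90% response points of a
   rising ESF normalized to 0..1, sampled at 'samples_per_pixel' per pixel. */
double rise_10_90(const double *esf, int n, double samples_per_pixel)
{
    double x10 = 0.0, x90 = 0.0;
    int i;

    for (i = 1; i < n; i++) {
        if (esf[i - 1] < 0.1 && esf[i] >= 0.1)
            x10 = (i - 1) + (0.1 - esf[i - 1]) / (esf[i] - esf[i - 1]);
        if (esf[i - 1] < 0.9 && esf[i] >= 0.9) {
            x90 = (i - 1) + (0.9 - esf[i - 1]) / (esf[i] - esf[i - 1]);
            break;
        }
    }
    return (x90 - x10) / samples_per_pixel;
}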
Which in turn calls for: should I normalize the image before doing
anything else? I currently normalize so that 0.5% of pixels clip to
black and 0.5% clip to white.

No, it is better to only normalize the output curves but leave the
original data (which is where the truth is hidden) as it is. By
manipulating the image data itself, you run the risk of changing the
data (in case of non-linear response data), and of introducing
quantization errors (by rounding up/down half a bit).
- Is it ok to take a (single) Fourier transform of the Hanning-
windowed LSF?

Yes, it's not that much data, so the DFT is fast.
Without windowing, I get weird results, but with windowing, I'm
afraid I'm affecting the data.

You'll have to window because of the abrupt edges. That's reality in
Fourier transforms, we deal with a small subset of the real data which
reaches out to infinity.
My MTFs always look "smoother" than Imatest's and SFRWin's,
and too high in the lower frequencies.

We'll have to see, but make sure you compute the magnitude of the
transform (take the "Modulus" of the DFT).
- How many samples should my ESF/LSF have? I understand that it only
depends on how high you want your frequencies to be -- i.e., if I
want to show the MTF up to 4xNyquist, I should have 4x more samples
than there are real pixels. Is this correct?

No. In the ISO method you would calculate an ESF for each line (row)
of pixels that crosses the edge. The average of all those ESFs is
produced after shifting each row in proportion with the centroid
regression. It is at that point, the shifting, that you bin the pixels
in an array that's 4x wider than the edge crop. That allows you to bin
the centroid with a 4x higher (=quarter pixel) resolution. After that
it's just statistics, larger numbers of ESFs make a more likely
approximation of the actual ESF.

The ISO takes one additional precaution, they take an integer number
of phase rotations. That means that if you e.g. calculated a slope of
1 pixel for every ten rows, then they take an integer multiple of ten
rows, starting at the top and truncating the image data at the bottom.
- How do I reduce frequencies spacing in the DFT?

I'm not sure what you mean, but it may have to do with the previous
quarter pixel binning.
SNIP
- The method I'm currently using for getting a smooth,
uniformly-spaced sampled ESF from the points I have is naive and
very slow. The sources I've read suggest using "LOESS curve fitting"
for this. I've had some trouble finding good references about this,
and it seems very complicated anyway.

It apparently is a kind of locally weighted regression with reduced
sensitivity to outliers.
The question is: is something simpler good enough?

Part of it may have to do with the quarter pixel sampling/binning.
If you just want to fit a monotone curve through regularly sampled
points, a simple interpolation (Cubic or Hermite) seems good enough to
me:
http://astronomy.swin.edu.au/~pbourke/other/interpolation/
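
For what it's worth, the 4-point cubic from that page is only a few lines
(a sketch; y0..y3 are four consecutive samples and mu is the fractional
position between y1 and y2):

/* Sketch: 4-point cubic interpolation, as on the page linked above. */
double cubic_interpolate(double y0, double y1, double y2, double y3, double mu)
{
    double mu2 = mu * mu;
    double a0 = y3 - y2 - y0 + y1;
    double a1 = y0 - y1 - a0;
    double a2 = y2 - y0;
    double a3 = y1;

    return a0 * mu * mu2 + a1 * mu2 + a2 * mu + a3;
}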

Bart
 
Bart said:
SNIP

[snip]

I've added a few comments, hopefully it will help to better understand
the reasons behind the steps.
1) Read in an RGB image
2) We work with vertical edges, so rotate the image if needed

This is only done to simplify the calculations. All that's required is
to get the image data in a uniform orientation, so all subsequent
subroutines will know what to expect (which avoids rechecking assumptions).

Sure. I only specified this because I think I used terms like "lines",
"rows" etc. later on.
The image code values should be linearized at this stage, so film/sensor
non-linearity and gamma adjustments can't influence the calculations.

Yes, I am currently ignoring gamma, as my test images are gamma=1.0 anyway.
If I'm not mistaken, though, this all boils down to
"Pixel=InputPixel^Gamma" instead of just "Pixel=InputPixel", so it'll
be very easy to add.
It is customary to use a different weighting than the (R+G+B)/3 average.
The ISO suggests the calculation of a luminance channel, so a ratio of
approx. 3R:6G:1B is closer to what we experience as luminance.

I suspected this. This is also easy to do, so I'll fix it right now.
I think that, especially on non-linear image codes, this will influence
the MTF results, because the contrast is expanded. On a perfectly
symmetrical brightness distribution its effect will be small, but the
possibility of clipping in later stages should be avoided.

I'm not sure I understand why it can affect the MTF, but I'll take your
word for it.

I've included this normalization step mainly because of the way I decide
how many pixels in each line will be considered part of the "edge", i.e.
contribute to the ESF: what I did was consider every pixel above a
certain threshold and below another (e.g. 10% and 90%, or more recently
5% and 95%, since the former didn't work out well) to be part of the edge.

But (also judging from your other message) it seems I was completely off
track about this. I suppose I can just do like Imatest does, and say
that 10 pixels on the right and 6 on the left of the edge center will be
"part of the edge".

This way, the normalization process becomes unnecessary.
Also a check
for at least 20% edge modulation should be made, in order to avoid too
low an input S/N ratio.

I'm taking note, but I think I'll leave such checks for later when the
program is somewhat stable.
It is however perfectly normal to normalize the ESF output to a range
between 0.0 and 1.0, and later to normalize the SFR/MTF to 1.0 (100%) at
zero spatial frequency.

Currently, I'm normalizing the ESF, the LSF and the MTF to between 0.0
and 1.0.
Not necessarily, that is just the maximum gradient and that need not be
the same as the edge.
The ISO suggests combining this with your step 9, and determining
the centroid of the LSF (by calculating the discrete derivative of the
ESF). The centroids can be used for regression.

The derivative suggested by the ISO is:
"for each line of pixels perpendicular to the edge, the edge is
differentiated using the discrete derivative "-0,5 ; +0,5", meaning that
the derivative value for pixel "X" is equal to -1/2 times the value of
the pixel immediately to the left, plus 1/2 times the value of the pixel
to the right".

Sorry if I'm thick, but mathematics isn't my best friend...
You're implying that, for each line of pixels, the edge center(oid?)
will be the absolute maximum of the above derivative, aren't you?

But isn't the absolute maximum of the derivative precisely the maximum
gradient?

(Though the formula I use is currently simpler than the one you cite:
simply y'[i] = y[i+1] - y[i])
[snip]
6) By doing a least squares regression on those points, find a
straight line that ought to best represent the edge
7) Associate a distance from that line to each pixel in the image.

The ISO method shifts each row of the ESF by the calculated amount from
the regression, but uses quarter pixel bins. This produces a 4x
oversampling per pixel position.

This doesn't sound like a bad idea at all. It'd probably simplify things
a lot, especially with my "local regression" problems...
See earlier remark, and provisions need to be made to detect multiple
maxima (caused by noise/graininess).

What kind of provisions?
[snip]
11) Take the discrete Fourier transform of the resulting data

And take the Modulus, and normalize.

Yes, I forgot to mention these steps, but they're done by the program.
;-) Room for improvement ...

I know :-) But I was concentrating more on the mathematical aspects
now... after all, I *am* able to write code that loads an image file --
well, I can take some time, but I can -- while to be sure I can manage
to compute an MTF or things like that, I have to try first...
[snip]
Please keep in mind that I knew nothing about regressions, spread
functions or Fourier transforms two weeks ago -- and I suppose I don't
know that much more now.

Isn't that the fun of programming, it forces you to describe the
principles in detail. Learning new stuff is inevitable.

Sure, one of the reasons why I didn't just upload some edges and ask
you to do my homework on them with Imatest! ;-)

Even though the SourceForge description currently says little more than
"calculates the MTF from a slanted edge", ultimately I'd like this
program to do automatic deconvolution (or whatever is best) of images
based on the edge results.

Like, "to sharpen an image, first scan a cutter or razor blade using the
'--edge' option, then run the program on the image with '--sharpen'".
Would be nice.
With the program (to be) available to the public, the chance of helping
hands increases.

Hope so!
The source code you have probably read will give a good
start. As always, the devil is in the details, but that can be overcome.

To be honest, I've read very little source, if any (well, except the
FFTW tutorial).

My main resource has been
http://www.isprs.org/istanbul2004/comm1/papers/2.pdf

from which I took the evil alternative to the "4x bins" that I'm
currently using, with all the regression nightmares it brings ;-)
But it was an interesting document, anyway.


by LjL
(e-mail address removed)
 
Lorenzo said:
Lorenzo said:
Bart said:

I know. The main problem is that I'm using the FFTW library for taking
Fourier transforms, and while there seems to be a Windows version
available (several, actually), well... the site is down. The web
doesn't like me apparently.

Fortunately, I see that CygWin comes with FFTW. I guess it will be easy
enough then.

The new archive at
http://ljl.741.com/slantededge-alpha2.tar.gz
or
http://ljl.150m.com/slantededge-alpha2.tar.gz
now includes a Windows executable, as well as a test image.
Also, the program now uses luminance instead of RGB average, can be told
to gamma-correct, and doesn't normalize the image anymore.


At
http://ljl.741.com/comparison.gif
there is a graph showing the MTF calculated both by my program and by
SFRWin, from the test image included in the archive.


by LjL
(e-mail address removed)
 
I suspected this. This is also easy to do, so I'll fix it right now.

I've only been skimming this thread and not paying too much attention
because the MTF of my scanner is what it is and there's nothing I can
do about it... So, with that in mind...

In general, if you're doing measurements it makes much more sense to
use the average. You are *not* measuring your *subjective*
*perception* of the pixels but their *objective values*.

Applying perceptual weighting is counterproductive in this case and
will only skew the results. Or, as I like to say, "corrupt" them! ;o)

Actually, what I would do is measure each channel *separately*! That's
the only measurement that makes sense especially if you have a scanner
with 3 separate light sources. But even if you don't, different
filters can also potentially cause trouble. In either case, even
averaging would skew the results.

Don.
 
Don said:
I've only been skimming this thread and not paying too much attention
because the MTF of my scanner is what it is and there's nothing I can
do about it... So, with that in mind...

Well, but I don't plan to stop here. What I'd mostly like to obtain from
all this is a way to "sharpen" my scans using a function that is
tailored to my scanner's specifics (rather than just "unsharp mask as I
see fit").

So, you see, there is a practical application of the measurements I can
obtain, it's not just about knowing how poor the scanner's resolution is.
In general, if you're doing measurements it makes much more sense to
use the average. You are *not* measuring your *subjective*
*perception* of the pixels but their *objective values*.

Applying perceptual weighting is counterproductive in this case and
will only skew the results. Or, as I like to say, "corrupt" them! ;o)

I'm not sure. Besides Bart, I think I've read somewhere else that
luminance should be used in this process. Perhaps it's even in the ISO
recommendations.

And why do you say I'm measuring the "objective values" of the pixels
instead of their "perceptual values"? I'm mostly trying to measure
resolution, in the form of the MTF. People usually cite the MTF50 and
the MTF10, because these are points where it *makes perceptual sense* to
measure: MTF10 is about the point where the human eye cannot discern
contrast anymore, Bart said.

So you see that I'm *already* doing measurements that are inherently
"perceptual". So why not be coherent and keep this in mind throughout
the process?

In any case, it makes sense to conform to what other programs of this
kind do, so that the results can be easily compared.

Perhaps I can put an option to use plain RGB average instead of
luminance. But anyway...
Actually, what I would do is measure each channel *separately*!

... I'm doing this already.
The "gray" channel is measured *in addition* to the other three
channels, and is merely a convenience.

by LjL
(e-mail address removed)
 
Bart said:
SNIP


http://astronomy.swin.edu.au/~pbourke/other/dft/ and
http://www.library.cornell.edu/nr/bookcpdf/c12-2.pdf have C code for a
DFT function.
http://www.library.cornell.edu/nr/cbookcpdf.html chapter 12 has more
background on Fourier transforms, and chapter 13.1 has routines and
background on Deconvolution (although there are better functions for
image restoration).

Thanks, good pointers!
Now I can compile for Windows without using a custom DFT, since there is
FFTW in Cygwin, but I'll probably still include a custom transform
procedure in case someone wants to compile without FFTW installed.

It seems that I can't just copy&paste the code (at least from the first
site), though, as it's not public domain or GPL. I guess I'll have to
change some variable names ;-)
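
Anyway, the fallback wouldn't have to be anything fancier than the textbook
O(N^2) sum -- something like this sketch (magnitude only):

#include <math.h>

/* Sketch of a naive O(N^2) DFT as a fallback when FFTW isn't available:
   computes the magnitude of each frequency bin of in[0..n-1]. */
void naive_dft_magnitude(const double *in, int n, double *mag)
{
    int k, i;

    for (k = 0; k <= n / 2; k++) {
        double re = 0.0, im = 0.0;
        for (i = 0; i < n; i++) {
            double angle = -2.0 * 3.14159265358979 * k * i / n;
            re += in[i] * cos(angle);
            im += in[i] * sin(angle);
        }
        mag[k] = sqrt(re * re + im * im);
    }
}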

What do you propose for image restoration, instead of deconvolution?
("image restoration" obviously being restricted to the things you can do
when you have an ESF or PSF)
[snip]
But it appears clear that the ESF shown also contains lower and
higher values; besides, it always seems to go from -6 pixels to +8
pixels from the edge center. Is there a reason for this?

I'm not sure about the exact reason for the choice, but I assume it has
to do with the shape of some ESFs that Norman Koren encountered when he
developed the program. The actual data in Imatest is recorded from -6 to
+10 at 0.25 intervals.
So, now I suppose "10%" and "90%" are simply used to compute (guess
what) the 10%-90% rise.

Actually the width in pixels between the two response points.

So in your opinion it's ok if I just consider an arbitrary number of
pixels (like Imatest does) as constituting "the edge", without going to
great lengths trying to have the program make an educated guess?
No, it is better to only normalize the output curves but leave the
original data (which is where the truth is hidden) as it is. By
manipulating the image data itself, you run the risk of changing the
data (in case of non-linear response data), and of introducing
quantization errors (by rounding up/down half a bit).

I shouldn't incur quantization errors, as I'm converting the image to
floating point as soon as it's loaded (*while* loading it, actually).
But anyway, I've removed this step from the current build.
Yes, it's not that much data, so the DFT is fast.


You'll have to window because of the abrupt edges. That's reality in
Fourier transforms, we deal with a small subset of the real data which
reaches out to infinity.

Yes, but there are two other options I've considered:

1) taking *many* DFTs of small (Hanning-windowed) pieces of the LSF, and
then averaging them together. Wouldn't this avoid the change of LSF shape
that using a single, big window may cause?

2) not using any window, but "cropping" the LSF so that the edges are
(very near) zero. Would this have any chance of working? It would
completely avoid changing the LSF's shape.
We'll have to see, but make sure you compute the magnitude of the
transform (take the "Modulus" of the DFT).

Yes, I do this.
MTF=SquareRoot(ImaginaryPart^2+RealPart^2)

Well, but it ends up like this anyway, doesn't it? Anyway, the method
I'm using comes from the document I pointed to in the other article, so
it shouldn't be *too* stupid.

Still, the method you describe sounds much simpler to implement, so I
guess I'll go for it.
In the ISO method you would calculate an ESF for each line (row) of
pixels that crosses the edge. The average of all those ESFs is produced
after shifting each row in proportion with the centroid regression. It
is at that point, the shifting, that you bin the pixels in an array
that's 4x wider than the edge crop. That allows you to bin the centroid
with a 4x higher (=quarter pixel) resolution. After that it's just
statistics, larger numbers of ESFs make a more likely approximation of
the actual ESF.

Let us see if I've understood this.

What you mean is, for each line "y", for each pixel "x", do

ESF[ (x*4) + (Distance(Pixel[x][y], Centroid[y]) % 4) ] +=
Pixel[x][y];

and then, at the end, divide ESF = ESF/y, to get the average
(well, or just normalize the ESF I have).

By the way, "%" is "modulo", and "a += b" means "a = a + b".
No offense meant ;-) but I don't know how fluent you are in C (and
others who read might not be, anyway).
The ISO takes one additional precaution, they take an integer number of
phase rotations. That means that if you e.g. calculated a slope of 1
pixel for every ten rows, then they take an integer multiple of ten
rows, starting at the top and trunkating the image data at the bottom.


I'm not sure what you mean, but it may have to do with the previous
quarter pixel binning.

Not sure, no.

What I mean is this: imagine your line spread function contains 128
values. The resulting DFT will contain 64 values, which is very few: the
"frequency resolution", or "frequency spacing" - or whatever it should
be called - of this MTF will be much worse than that provided by Imatest
or SFRWin.

Oversampling the LSF doesn't help: even if it's made of 1024 values, for
example, the DFT will now simply contain higher frequencies, but the
spacing between frequencies will remain the same.
(and in fact, you say that Imatest only oversamples 4x)

Now, I have read that the DFT really already "contains" everything it
can contain.
Still, very little is visible.

For example, look again at
http://ljl.741.com/scans/fig-blade2.gif
The MTF looks very detailed, as there are "jaggies" for example between
x=0..20 and y=0.2..0.6 .

The only viable way I have found to obtain this kind of precision is to
zero-pad the input LSF like crazy. But it doesn't seem a very smart or
elegant solution!
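
(Concretely, all my zero-padding does is this -- a sketch of what I'm doing
now, with 'factor' picked more or less arbitrarily:)

#include <stdlib.h>
#include <string.h>

/* Sketch: copy the LSF into a much longer zero-filled buffer before the
   transform, so the DFT bins are spaced more closely. 'factor' is how many
   times longer the padded buffer is; *padded_n returns its length. */
double *zero_pad(const double *lsf, int n, int factor, int *padded_n)
{
    double *padded;

    *padded_n = n * factor;
    padded = calloc(*padded_n, sizeof *padded);   /* zeroed */
    if (padded != NULL)
        memcpy(padded, lsf, n * sizeof *lsf);
    return padded;
}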
[snip LOESS]
[snip]

The question is: is something simpler good enough?

Part of it may have to do with the quarter pixel sampling/binning.
If you just want to fit a monotone curve through regularly sampled
points, a simple interpolation (Cubic or Hermite) seems good enough to me:
http://astronomy.swin.edu.au/~pbourke/other/interpolation/

I'm afraid interpolation isn't really going to make it. If I
scatter-plot my edge data, I don't see many outliers, but I do see a
very "thick" "nebula" of points -- I'm not sure I'm explaining myself.

But anyway, I'd better try the "4x bins" method before venturing into
this. If that one just works out of the box, then who cares about local
regressions anymore!


by LjL
(e-mail address removed)
 
Lorenzo said:
[snip]

Let us see if I've understood this.

What you mean is, for each line "y", for each pixel "x", do

ESF[ (x*4) + (Distance(Pixel[x][y], Centroid[y]) % 4) ] +=
Pixel[x][y];

and then, at the end, divide ESF = ESF/y, to get the average
(well, or just normalize the ESF I have).


Sorry, probably I should have specified better what I meant with
Centroid[y]: not the position on line y where the edge "appears to be",
but the edge position on line y taken *from the regression*.

That is, first I do
for (y) {
    ApproximatedCentroid[y] = FindCentroid(y);
}

Then,
for (y) {
    RealCentroid[y] = PointInRegression(ApproximatedCentroid, y);
}

and then use the RealCentroid array in the "bins" code. I suppose using
the ApproximatedCentroid array would make no sense.
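
(For completeness, the regression itself is nothing more than ordinary
least squares over the per-row centroids -- roughly this sketch:)

/* Sketch: fit x = a*y + b through the approximated per-row centroids, so
   that RealCentroid[y] = a*y + b can then be used by the "bins" code. */
void fit_edge_line(const double *centroid, int rows, double *a, double *b)
{
    double sy = 0.0, sx = 0.0, syy = 0.0, sxy = 0.0;
    int y;

    for (y = 0; y < rows; y++) {
        sy  += y;
        sx  += centroid[y];
        syy += (double)y * y;
        sxy += (double)y * centroid[y];
    }
    *a = (rows * sxy - sy * sx) / (rows * syy - sy * sy);
    *b = (sx - *a * sy) / rows;
}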


by LjL
(e-mail address removed)
 
Lorenzo said:
ESF[ (x*4) + (Distance(Pixel[x][y], Centroid[y]) % 4) ] +=
Pixel[x][y];

Sorry again... this should be

ESF[ (x*4) + ((Distance(Pixel[x][y], Centroid[y]) * 4) % 1) ] += ...

i.e. I should multiply the distance from the edge center by 4, and then
take only the fractional part.

This way, if the pixel is an integer number of pixels away from the edge
center, it will end up in the "first bin";
if it's an integer number of pixels away, plus 0.25, it will end up in
the "second bin", and so on.


I hope this time I've got it right :-)

by LjL
(e-mail address removed)
 
SNIP
I'm not sure. Besides Bart, I think I've read somewhere else that
luminance should be used in this process. Perhaps it's even in the
ISO recommendations.

The ISO allows testing whatever one wants, single channels or weighted
combinations of more than one, whatever suits the purpose, but a
formal test of a device should include R+G+B measurements. However, to
quote them "If desired, a luminance resolution measurement may be made
on a luminance signal formed from an appropriate combination of the
colour records".
Since the purpose is sharpening (which should be done in the Luminance
channel if you want to avoid colored artifacts), it only makes sense
to use a weighting that simulates the luminance sensitivity of the
human eye.

Imatest also calculates a Y channel for luminance (and that was not an
uninformed choice), as it is the most significant for the sensation we
call 'sharpness'. With human eyes, color resolution is much lower than
Luminance resolution.
And why do you say I'm measuring the "objective values" of the
pixels instead of their "perceptual values"? I'm mostly trying to
measure resolution, in the form of the MTF. People usually cite the
MTF50 and the MTF10, because these are points where it *makes
perceptual sense* to measure: MTF10 is about the point where the
human eye cannot discern contrast anymore, Bart said.

You don't have to take my word, but this is what the ISO says:
"A very good correlation between limiting visual resolution and the
spatial frequency associated with a 0,10 SFR response has been found
experimentally. Should this frequency exceed the half-sampling
frequency, the limiting visual resolution shall be the spatial
frequency associated with the half-sampling frequency".

SNIP
In any case, it makes sense to conform to what other programs of
this kind do, so that the results can be easily compared.

There are many different opinions on what the mix should be.
If you want to exactly match Imatest, you could use
Y=0.3*R+0.59*G+0.11*B
(http://www.imatest.com/docs/sfr_instructions2.html almost
halfway down the page under Channel).

Other researchers use L=0.299R+0.587G+0.114B .
And Luminance weighting according to ITU-R BT.709 is:
Y=0.2125*R+0.7154*G+0.0721*B
which comes close(r) to:
<http://hyperphysics.phy-astr.gsu.edu/hbase/vision/efficacy.html#c1>

Whatever the choice (ultimately user selectable would be best for
flexibility, but makes comparisons more hazardous), I think Human
perception should carry some weight when the goal is to optimize
sharpening.

Bart
 
Bart said:
[snip]

There are many different opinions on what the mix should be.
If you want to exactly match Imatest, you could use
Y=0.3*R+0.59*G+0.11*B
(http://www.imatest.com/docs/sfr_instructions2.html almost
halfway down the page under Channel).

Other researchers use L=0.299R+0.587G+0.114B .
And Luminance weighting according to ITU-R BT.709 is:
Y=0.2125*R+0.7154*G+0.0721*B
which comes close(r) to:
<http://hyperphysics.phy-astr.gsu.edu/hbase/vision/efficacy.html#c1>

Whatever the choice (ultimately user selectable would be best for
flexibility, but makes comparisons more hazardous), I think Human
perception should carry some weight when the goal is to optimize
sharpening.

I think I'll go for user-selectable, with a default that's recommended
for comparing results with others.
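
Roughly like this, I imagine (just a sketch of how the selectable weighting
might look; the coefficient sets are the ones Bart listed):

/* Sketch: selectable gray-channel weighting; the coefficients are the ones
   quoted earlier in the thread, plus the plain average for comparison. */
typedef enum { WEIGHT_AVERAGE, WEIGHT_NTSC, WEIGHT_BT709 } weighting_t;

double gray_value(double r, double g, double b, weighting_t w)
{
    switch (w) {
    case WEIGHT_NTSC:  return 0.299  * r + 0.587  * g + 0.114  * b;
    case WEIGHT_BT709: return 0.2125 * r + 0.7154 * g + 0.0721 * b;
    default:           return (r + g + b) / 3.0;
    }
}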

But all this made me wonder about something else: would it make any
sense to compare the edge *position* of each (red, green and blue)
channel with the edge position in the luminance channel?

I mean, SFRWin gives "red", "blue" and "green" color offsets (for
measuring "color fringing"), but the "green" offset is always zero, as
the other two channels are compared to green.

Would comparing the three channels to luminance, instead, have any
advantage over SFRWin's approach? I don't remember what Imatest does here.


by LjL
(e-mail address removed)
 
Lorenzo J. Lucchini said:
Bart van der Wolf wrote: SNIP

Yes, I am currently ignoring gamma, as my test images
are gamma=1.0 anyway.

For really accurate results, that remains to be verified ...
Testing a (transparent) step wedge may/will reveal 'interesting'
features of hardware *and* of scanner driver software.
If I'm not mistaken, though, this all boils down to
"Pixel=InputPixel^Gamma" instead of just "Pixel=InputPixel",
so it'll be very easy to add.

Yes, for straight Gamma only (sRGB profiled images use a
'slope-limited' Gamma). Also beware that Gamma adjustment
may mean 1/Gamma, depending on what is being adjusted where.
This does assume that Gamma is the only non-linearity.
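
As an illustration, the sRGB case looks roughly like this (a sketch with
the usual sRGB constants; the small linear segment near black is the
'slope-limited' part):

#include <math.h>

/* Sketch: linearize an sRGB-encoded value v in the 0..1 range. A plain
   power law would just be pow(v, 2.2); sRGB adds the linear toe. */
double srgb_to_linear(double v)
{
    return (v <= 0.04045) ? v / 12.92 : pow((v + 0.055) / 1.055, 2.4);
}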

SNIP
I'm not sure I understand why it can affect the MTF, but I'll take
your word for it.

Assume all it takes is a lowering of the highlight clipping point,
which essentially is the same as multiplying all luminance levels by a
fixed factor. That would work out differently for shadows/highlights
if the response was non-linear.

SNIP
I'm taking note, but I think I'll leave such checks for later when
the program is somewhat stable.

Obviously, I know how actual programming works (tackle large issues
first, and get a working alpha version before working on the 'icing on
the cake'), but just don't forget some obvious boundary checks in the
end.
Currently, I'm normalizing the ESF, the LSF and the MTF to between
0.0 and 1.0.

Just note that actual MTFs can exceed 1.0, assuming correct
normalization to 1.0 at zero cycles. Edge sharpening halo can achieve
that easily.

SNIP
The ISO suggests to [...] determine the centroid of the LSF (by
calculating the discrete derivative of the ESF). The centroids can
be used for regression.

The derivative suggested by the ISO is:
"for each line of pixels perpendicular to the edge, the edge is
differentiated using the discrete derivative "-0,5 ; +0,5", meaning
that the derivative value for pixel "X" is equal to -1/2 times the
value of the pixel immediately to the left, plus 1/2 times the
value of the pixel to the right".

Sorry if I'm thick, but mathematics isn't my best friend...

Hey, I also know my limitations in that field ... ;-)
You're implying that, for each line of pixels, the edge center(oid?)
will be the absolute maximum of the above derivative, aren't you?
Yep.

But isn't the absolute maximum of the derivative precisely the
maximum gradient?

Rereading it, yes, but it actually is where the increasing contrast
turns into decreasing contrast (the first derivative being the slope
of the curve).
(Though the formula I use is currently simpler than the one you
cite: simply y'[i] = y[i+1] - y[i])


Yes, and it'll produce a difference, but actual nodes will on average
be displaced by half a pixel. Nevertheless, the sample code from
the ISO seems to do what you did, so I'd suggest leaving it that way.

SNIP
What kind of provisions?

With noisy images, there can be multiple LSF maxima from a single ESF.
One should decide which maximum to take. I dug up some old Word
document with C code for the SFR calculation. It takes the average
between the leftmost and rightmost maxima.
If your email address in your signature is valid, I can send you that
document.
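
In essence the provision is something like this (my sketch of the idea, not
the document's actual code):

/* Sketch: with a noisy LSF several samples can share the maximum value;
   take the average position of the leftmost and rightmost of them. */
double lsf_peak_position(const double *lsf, int n)
{
    int i, left = 0, right = 0;

    for (i = 1; i < n; i++) {
        if (lsf[i] > lsf[left])
            left = right = i;          /* a new, higher maximum */
        else if (lsf[i] == lsf[left])
            right = i;                 /* another sample just as high */
    }
    return 0.5 * (left + right);
}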

SNIP
Even though the SourceForge description currently says little more
than "calculates the MTF from a slanted edge", ultimately I'd like
this program to do automatic deconvolution (or whatever is best) of
images based on the edge results.

Yes, that's a good goal, although it will take more than a single
slanted edge to get a two-dimensional asymmetric PSF. What's worse,
the PSF can (and does) change throughout the image, but a symmetrical
PSF will already allow image quality to be improved.
Some hurdles will need to be cleared, but the goal is exactly what I am
looking for.

SNIP
My main resource has been
http://www.isprs.org/istanbul2004/comm1/papers/2.pdf

from which I took the evil alternative to the "4x bins" that I'm
currently using, with all the regression nightmares it brings ;-)
But it was an interesting document, anyway.

Yes, we're not the only ones still looking for the holy grail, it
seems.

I'm working on a method that will produce a PSF, based on the ESF
derived from a slanted edge. That PSF can be used in various
deconvolution methods, and it can be used to create a High-pass filter
kernel. Could be useful to incorporate in the final program.

Bart
 
SNIP
The new archive at
http://ljl.741.com/slantededge-alpha2.tar.gz
or
http://ljl.150m.com/slantededge-alpha2.tar.gz
now includes a Windows executable, as well as a test image.

Thanks. Unfortunately I get an error box with:
"This application has failed to start because cygwin1.dll was not
found".

SNIP
At
http://ljl.741.com/comparison.gif
there is a a graph showing the MTF calculated both by my program and
SFRWin, from the test image included in the archive.

The test image looks a bit odd. It seems like the edge was resized
vertically with a nearest neighbor interpolation. The edge pixels look
like they are 2 pixels high and 1 pixel wide. The noise in the image
is single pixel noise (and has a bit of hardware calibration
striping).

It looks strange, and thus produces strange sharpening artifacts if
pushed to the limits (deconvolution sharpening with a derived PSF, and
a Richardson-Lucy restoration with the same PSF). Nevertheless, the
Imatest SFR analysis looks identical to the SFRWin result.

Bart
 
Bart said:
SNIP



Thanks. Unfortunately I get an error box with:
"This application has failed to start because cygwin1.dll was not found".

Yeah, it's the Unix emulation layer that Cygwin compiled programs
apparently need.
I've uploaded it at
http://ljl.741.com/cygwin1.dll.gz
http://ljl.150m.com/cygwin1.dll.gz

Putting it in the same directory as the program should work, as should
putting it in \Windows\System32.
SNIP



The test image looks a bit odd. It seems like the edge was resized
vertically with a nearest neighbor interpolation. The edge pixels look
like they are 2 pixels high and 1 pixel wide.

It could be: I've scanned some edges at 2400x4800 and resized them down
to see how this affected the MTF (remember the thread "Multi-sampling
and 2400x4800 scanners").

Obviously, I'm stupid enough to give very meaningful names to my files
such as "edge1.tif", "edge2.tif", "edgehg.tif", so it's entirely
possible that I took the wrong edge =)

I guess the next tarball will contain a freshly scanned edge, to avoid
this confusion.
The noise in the image is
single pixel noise (and has a bit of hardware calibration striping).

What is single pixel noise -- or rather, "single pixel" as opposed to what?

by LjL
(e-mail address removed)
 