Best scanning manager program?

  • Thread starter: T. Wise
Also see "alt.comp.periphs.cdr". This group, along with what I cited
earlier, has posts that are two to five years old. There may be a group
devoted to DVDs as suitable for image archiving. It's late & I'm not
following up on that. One primary point in many posts is: Your Original
Negatives, Slides, Transparencies, and (lacking the former) Prints are

Prints are not usually considered anywhere near as long-lived as
negatives and slides. OTOH, they usually receive far rougher treatment
than slides or negatives.
the only safeguard against time and technological change. In other
words, yes, scan & archive to digital those analog images, but for
god's sake, store those analog images you value highly as safely and
archivally as possible,
Amen!


for they will always be accessible to any

Well... in general, but those negatives and slides don't have any
guarantees either. I have a number of Kodachrome and Ektachrome
slides I scanned and restored, and the originals were on their last
legs. Most, but not all, of these were well taken care of, so the only
reason for the color shifts and fading that I can think of would be the
quality of the processing.

I do think it's safe to say *most* slides and negatives will be
available in a form we can use far longer than any current digital
medium, with those few exceptions where the image is fading due to
either poor processing or abuse in storage.

Roger Halstead (K8RI & ARRL life member)
(N833R, S# CD-2 Worlds oldest Debonair)
www.rogerhalstead.com
 
Go to rec.photo.digital and use the term "archiving" to search.

I thought you were referring to a new news group. I have most of the
posts in the previous thread/discussion on RPD.

I also have one of the reference pages
http://www.rogerhalstead.com/scanning.htm
Somewhere around here I have an e-mail from a college that is using
this page in one of their photo courses.

It does need updating, but not a lot has changed.

Roger Halstead (K8RI & ARRL life member)
(N833R, S# CD-2 Worlds oldest Debonair)
www.rogerhalstead.com
 
Well, I'm not just applying noise randomly (otherwise the operation
wouldn't be reversible anyway): I *am* taking the data into account, by
taking the histogram into account.

Yes, but for the wrong reason. You're taking data into account for the
sole purpose of making the process reversible without concern what it
does to the image.
Why are you concerned with reversibility?

I'm not really so concerned with reversibility. In fact, I have no
problem believing that many *irreversible* transforms would show much
better results than my simplistic algorithm.

It's just that what you said here...

--- CUT ---

[myself]
That's because this noise addition is lossless, and the "original"
also has its histogram.

[you]
The problem is that's just physically impossible. There's only a fixed
number of pixels in an image, and the histogram shows them all.

--- CUT ---

... is not really true.

Whether this fact has any useful practical applications, I'll leave to
more skillful eyes to judge. But it's not impossible.

No, no... You misunderstood. It's not reversibility that's impossible.
It's the lossless part that's impossible.

What I was saying there is that to fill in the gaps you have to
"insert" pixels with colors absent in the histogram. And you can't do
that because all pixels are already "taken". The only way, therefore,
is to "borrow" pixels from somewhere else. And if you do that the
operation can never be lossless. You are redistributing pixels and
that, by definition, loses information.

If you look at your test images, every place where you insert a pixel
deletes the pixel which used to be there. Therefore, you can't remove
the gaps without "destroying" some of the existing pixels. That was my
point.
That's certainly a reasonable thing to do. Still, if one could find a
posterization-hiding algorithm that is *both* nice-looking and
reversible (it might not exist of course), there would be no need at all
to keep a copy.

Actually there would because removing pixelation is only the first
step. After that you still have to edit the image. And that editing
will be lossy. Even if you keep all the curves etc. it doesn't help.
For one, you can never fully recover the image by applying the reverse
curve due to rounding errors. But, more importantly, editing involves
making small changes all over the image and that's impossible to
reverse unless you record each action as a macro. And pretty soon all
of them combined will probably end up larger than the original image.

While reversibility is attractive, it's not really practical for this
purpose. Archiving the image is much more meaningful.
Well, as long as that's the *only* transformation applied to the
original image... so I guess this is all more theoretical than anything.

Yes, that's exactly what I mean!

Don.
 
You've not answered my basic question, which is how do you "see" noise
or posterization or other flaws with "objective" (actually
quantifiable) tools? What do noise or scanner or IR artifacts look
like in a histogram? What is the "ruler" for scan quality, and how do
you use it?

I have answered it. Repeatedly.

Re-read the paragraph on "good" and "bad" histogram gaps.
If there is a resource that explains this I am open to reading about
it. Your previous posts that I've read don't go into any detail but
just suppose the existence of such tools.
My response has been that the tools I have in Photoshop 7 do not tell
me much about these flaws. This leads me to conclude that the best form
of input is therefore visual data, i.e. observation. Observation in a
controlled setting is the starting point for the "scientific method",
which then leads you to form a hypothesis and challenges to the
observation through experiment.

The problem is not the tools but your basic misunderstanding of even
the most *elementary* concepts!

There's no point discussing tools when you don't understand the
results or their implications. We need go no further than the above
example. The histogram is a simple tool, but you can't even grasp that!

As I said last time, please go back to the archives and re-read
everything, but *think*, don't just emotionally overreact.

There are also numerous resources on the net, again, posted in this
and other groups many times.
Don wrote:
"First of all, to do any meaningful testing you have to *disable* all
image processing!!!"

Can you explain why you would do this in a scanner driver when you will
be using the processed output for manipulation in Photoshop? (Why would
I do this? I don't work on gamma 1.0 raw scans in Photoshop.) What are
the advantages of this, and why is it more meaningful?

It's like asking: "Why do I have to remove my blindfold to see?".

Asking such questions shows how little you understand, Roger. This is
just common sense stuff now, nothing to do with the subject matter.

I've explained it all *many* times. Please *read* the thread.
Don wrote:
"But *read* and *think*, don't just emotionally react! It's all there.
Explained *multiple* times!"

Once again, insulting people by claiming they are emotional,
irrational, or illogical. Thanks again, I appreciate it. Your
"explanation" could not possibly be at fault.

It's not an insult. The fact that you think it is shows how little you
understand or bother to think things through. If it were an insult,
would I have patiently explained things over and over and over again?

But I can only do so much. You have to put in some work too.

Don.
 
Don said:
LjL wrote:

[snip]

No, no... You misunderstood. It's not reversibility that's impossible.
It's the lossless part that's impossible.

What I was saying there is that to fill in the gaps you have to
"insert" pixels with colors absent in the histogram. And you can't do
that because all pixels are already "taken". The only way, therefore,
is to "borrow" pixels from somewhere else. And if you do that the
operation can never be lossless. You are redistributing pixels and
that, by definition, loses information.

I think we each understand what the other is saying; it's just a matter
of terminology.

I'm not so terribly strong in information theory, but as far as I
understand, "lossless" and "reversible" are synonyms in this context: a
"lossless transformation" is one where the original input can be exactly
reconstructed from the output; and that's the same as a "reversible
transformation".

Just think of how these terms are used about compression: LZW
compression is lossless, as it is reversible. JPEG compression is lossy,
as it is irreversible.
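That equivalence can be checked directly with a lossless codec from the Python standard library (zlib's DEFLATE here, standing in for LZW, which the stdlib doesn't expose): the inverse operation reconstructs the input exactly, byte for byte.

```python
import zlib

# A stand-in for image data: long runs, like a posterized scan.
data = bytes([0, 128] * 5000)

# Lossless == reversible: decompress(compress(x)) returns x exactly.
compressed = zlib.compress(data)
restored = zlib.decompress(compressed)

assert restored == data             # exact reconstruction
assert len(compressed) < len(data)  # and it really did compress
```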
If you look at your test images, every place where you insert a pixel
deletes the pixel which used to be there. Therefore, you can't remove
the gaps without "destroying" some of the existing pixels. That was my
point.

Yes, I see your point.

But! The transformation is still lossless, for even if you apply LZW
compression (to continue the example above), you destroy pixels: you
even actually *remove* some of them (which is the whole purpose of
compressing).

It's just that your image viewer automatically applies the inverse
transformation to get the original input back to you.

It seems that, in the end, your definition of "lossless" is "don't touch
the image".

Think of it, even if you could somehow *add* pixels (so that your "all
pixels are taken" wouldn't hold true anymore), it still wouldn't be
lossless under your definition: how would you know which pixels are
"original" and which ones were added?


Also, you talk about "destroying some of the existing pixels".
But wait a moment: how could I add new data (that is, random noise)
while maintaining reversibility without taking up more bytes than the
original image?

The fact that I can shows that some of the stuff that was in the image
*wasn't information to begin with*.
In fact, the "existing pixels" in the original image conveyed much less
information than they would be able to.

Why? Precisely because of posterization. For each given pixel whose
position in the histogram is in the middle of a gap, you can't really
say what its real value would have been: for some reason, the image we
have is posterized, and thus one single pixel value can map to a *range*
of values in the original picture.

If you take each of these pixels and give it any random value *in that
range*, you aren't really taking away anything from the image, or
"corrupting" it.
That's because each of the pixels *really could* have had any of those
values (in the original picture), and we have no way to know which.

This is precisely what my program does.
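LjL's program itself isn't posted in the thread, but the mechanism as described can be sketched: fill each pixel with a random value drawn from the histogram gap just above its level, and invert by snapping every value back down to the nearest original level. The function names are hypothetical; this is a minimal illustration, not the actual program.

```python
import random
from bisect import bisect_right

def add_gap_noise(pixels, seed=0):
    """Give each pixel a random value inside the histogram gap just
    above its level. Reversible, given the original set of levels."""
    levels = sorted(set(pixels))              # values actually present
    top = {v: levels[i + 1] - 1 if i + 1 < len(levels) else v
           for i, v in enumerate(levels)}     # top of each gap
    rng = random.Random(seed)
    noised = [rng.randint(v, top[v]) for v in pixels]
    return noised, levels

def remove_gap_noise(noised, levels):
    """Inverse: snap each value back to the greatest original level
    at or below it."""
    return [levels[bisect_right(levels, x) - 1] for x in noised]

posterized = [0, 0, 128, 128, 0, 128, 255, 0]
noised, levels = add_gap_noise(posterized)
assert remove_gap_noise(noised, levels) == posterized  # round-trips
```

As the thread notes, the original levels (i.e. the histogram) have to travel with the noised image for the inverse to work.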


Take a posterized image. Throw it to my program. Take a lossless
compression algorithm, and compress both the original image and the
"noised" one.
The original image will compress much better, and this is mostly because
it *did not* contain the information that my program supposedly
"destroyed" -- which, in fact, it didn't.
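That compressibility claim is easy to test: a two-level posterized signal carries far less information per byte than one with its gaps filled by noise, so a lossless compressor shrinks it far more. A seeded sketch (illustrative data, not the actual test images):

```python
import random
import zlib

rng = random.Random(42)

# Posterized: only the levels 0 and 128 occur.
posterized = bytes(rng.choice((0, 128)) for _ in range(20000))
# Gap-filled: each level spread over its whole 128-value gap.
noised = bytes(v + rng.randrange(128) for v in posterized)

# The posterized original compresses much better than the noised copy.
assert len(zlib.compress(posterized)) < len(zlib.compress(noised))
```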
Actually there would because removing pixelation is only the first
step. After that you still have to edit the image. And that editing
will be lossy. Even if you keep all the curves etc. it doesn't help.
For one, you can never fully recover the image by applying the reverse
curve due to rounding errors.

See, now you're using what I take to be the correct definition of
"lossless"!

You say, "curves editing isn't lossless because applying the inverse
curve won't recover the image [because of rounding errors, given a small
enough number of bits per channel]".

This is correct.
On the other hand, it would obviously *not* be correct to say that
"curves editing isn't lossless because it changes the original pixel
values" -- duh, of course it does!
But this looks like precisely the objection you make to my rant about
adding noise.
But, more importantly, editing involves
making small changes all over the image and that's impossible to
reverse unless you record each action as a macro.

No, not more importantly, but much less importantly.

In practice, sure, this will be an important factor.
But in theory, it's implicit in the terms "lossless" and "reversible"
that you have to know the full algorithm (that is, in this case, the
various editing steps) that transformed the data.
And pretty soon all
of them combined will probably end up larger than the original image.

Come on, no, not really for images larger than a hundred pixels
square or so!
Unless Photoshop macros are *really* bloated. But even then, there are
other ways than Photoshop macros.
While reversibility is attractive, it's not really practical for this
purpose. Archiving the image is much more meaningful.

I'm sure that this is true in practice.
Yes, that's exactly what I mean!

Yes, agreed.

But it wasn't so much to suggest that "my method" could be used instead
of good old archival; it was more to convey the idea that my "noised"
images will contain the same information (and more, i.e. the noise, but
this doesn't matter to us) as the originals.


Last example: take an image that is all black on the left and all white
on the right, with no grays - again like your own example.
If you know that this is not a faithful reproduction of the original
picture, but rather a result of posterization/quantization, then you may
suppose that the original picture could have been a continuous shade of
gray... or something else, you can't really know.

So what happens if you add noise to the black/white image with my
program? My program will fill the black part with random values 0..127,
and the white part with values 128..255.

What have you lost in the process? Nothing, as you didn't really know
that the "black" part was black (instead of varying from 0 to about 127)
and that the "white" part was white (instead of varying from about 128
to 255).


by LjL
(e-mail address removed)
 
I'm not so terribly strong in information theory, but as far as I
understand, "lossless" and "reversible" are synonyms in this context: a
"lossless transformation" is one where the original input can be exactly
reconstructed from the output; and that's the same as a "reversible
transformation".

No, because "lossless" refers to the fact that no information is lost.
Therefore, there is nothing to "reverse".

What you're getting at is that if you have a *lossy* operation which
is reversible then the *end result* is lossless, but that's a stretch.
But! The transformation is still lossless, for even if you apply LZW
compression (to continue the example above), you destroy pixels: you
even actually *remove* some of them (which is the whole purpose of
compressing).

You don't destroy pixels when you apply LZW otherwise you would not be
able to get them back. You simply encode or compress them, but no
information is lost or destroyed.

You're focusing too narrowly and literally on what bytes are
actually in the file, but that's not important in this context. It's
the concept which is the key.
It seems that, in the end, your definition of "lossless" is "don't touch
the image".

No, it means no information is lost. It's all really semantics...
Think of it, even if you could somehow *add* pixels (so that your "all
pixels are taken" wouldn't hold true anymore), it still wouldn't be
lossless under your definition: how would you know which pixels are
"original" and which ones were added?

The problem is that's impossible because the number of pixels in an
image is fixed.
Also, you talk about "destroying some of the existing pixels".
But wait a moment: how could I add new data (that is, random noise)
while maintaining reversibility without taking up more bytes than the
original image?

You can't! That was exactly my point.
The fact that I can shows that some of the stuff that was in the image
*wasn't information to begin with*.
In fact, the "existing pixels" in the original image conveyed much less
information than they would be able to.

That's different. Now you're getting into what JPG or MP3 do, i.e.
rank information and remove marginal data. But that is lossy by
definition.
Why? Precisely because of posterization. For each given pixel whose
position in the histogram is in the middle of a gap, you can't really
say what its real value would have been: for some reason, the image we
have is posterized, and thus one single pixel value can map to a *range*
of values in the original picture.

As I mentioned before the histogram is a very valuable tool (and I
*am* a histogram worshipper!) but you also have to be careful how you
interpret it.

For one, the gaps are not automatically bad in regard to image quality
(only potentially so). If the posterization is not perceptible, then
there is no point in worrying about them.

I know, we like things to be perfect so we are bothered by those gaps,
but the gaps have to be taken in context.

Now, if there is perceptible posterization, then again don't get
fixated by the gaps (or the histogram for that matter) but focus on
how best to remove this posterization. Given the right process the
histogram will fix itself. You use the histogram, of course, to check
the process but don't let the histogram drive your process if the
result of that is neglecting the goal.

One example of this is the application of random noise which, as you
yourself say, looks "terrible" but it "fixes" the histogram. Well,
that's a case of "throwing out the baby with the bath water" as we
say. Or "The operation was a success but the patient died"! ;o)
If you take each of these pixels and give it any random value *in that
range*, you aren't really taking away anything from the image, or
"corrupting" it.

The reason I said "corrupt" is because the data you are trying to
recover not only exists but you can obtain it easily (simply use
16-bit depth and then reduce).

However, you chose not to do that (for whatever reasons) and are now
trying to recover this missing data from an 8-bit image. That data
will never be as genuine as the original data in a 16-bit file.
Therefore, anything you do to the 8-bit file is "corruption" in regard
to the actual 16-bit data you're trying to recreate.
That's because each of the pixels *really could* have had any of those
values (in the original picture), and we have no way to know which.

But we do! Simply use 16-bit depth and reduce. That way you will get
exactly the data you are missing in the 8-bit domain.
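Don's point here is that reducing a 16-bit sample to 8 bits keeps only the high byte, so the "missing" low byte genuinely exists in the 16-bit file and nowhere else. In toy form:

```python
# 16-bit samples, as a scanner driver might deliver them.
samples_16 = [0, 130, 32767, 32768, 65535]

# Reducing to 8-bit keeps only the high byte.
samples_8 = [v >> 8 for v in samples_16]
assert samples_8 == [0, 0, 127, 128, 255]

# Going the other way is guesswork: many 16-bit values collapse
# onto one 8-bit value, which is exactly the posterization gap.
assert (130 >> 8) == (37 >> 8) == 0
```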
No, not more importantly, but much less importantly.

"More importantly" in the sense of not being able to reverse it
easily.
In practice, sure, this will be an important factor.

In that sense too.
But in theory, it's implicit in the terms "lossless" and "reversible"
that you have to know the full algorithm (that is, in this case, the
various editing steps) that transformed the data.


Come on, no, not really for images larger than a hundred pixels
square or so!

It depends on the amount of editing.

For example, if you can't use ICE (e.g. Kodachromes or B&W
silver-based film) and have to edit all dust and scratches manually,
plus remove all the "pepper spots", no method is likely to be more
efficient (not to mention simpler) than simply archiving the
original image.
Last example: take an image that is all black on the left and all white
on the right, with no grays - again like your own example.
If you know that this is not a faithful reproduction of the original
picture, but rather a result of posterization/quantization, then you may
suppose that the original picture could have been a continuous shade of
gray... or something else, you can't really know.

Actually you do know that there is a threshold which made one side
black and the other side white. What you don't know is how solid each
side is and how wide the transition.
So what happens if you add noise to the black/white image with my
program? My program will fill the black part with random values 0..127,
and the white part with values 128..255.

What have you lost in the process? Nothing, as you didn't really know
that the "black" part was black (instead of varying from 0 to about 127)
and that the "white" part was white (instead of varying from about 128
to 255).

But you haven't gained anything either. You ended up adding "grain" to
an otherwise "clean" image.

The only meaningful thing that can be done is to apply a small amount
of Gaussian Blur to the border in order to smooth it out. Personally,
that's too much like "anti-aliasing" for my taste, and I would choose a
different approach.

In general, I would always focus on solving the root problem rather
than trying to fix the consequences afterwards. In my experience, when
you try to fix the consequences it only generates more problems than it
solves. Usually you then have to "fix the fix", and so on. It just
never ends...

Don.
 
Don said:
No, because "lossless" refers to the fact that no information is lost.
Therefore, there is nothing to "reverse".

The only way not to lose information (in the sense you use this term!)
would be not to change the input at all.
What you're getting at is that if you have a *lossy* operation which
is reversible then the *end result* is lossless, but that's a stretch.

A lossy operation can't be reversible.

Can you name a "lossless" operation (i.e. where "no information is
lost") which doesn't need to be reversed to get back the input?

The only one I can think of is the null operation.
You don't destroy pixels when you apply LZW otherwise you would not be
able to get them back.

Cool. So, I don't destroy pixels when I apply my noise addition,
otherwise I would not be able to get them back.

And I do get them back. Try and see! Where *is* the difference between
this and LZW? They both get back *exactly the same input image with no
changes at all*, so why is my noise addition "destroying data" while LZW
is not?
You simply encode or compress them, but no
information is lost or destroyed.

Nor is information destroyed with my noise addition.
You're focusing too narrowly and literally on what bytes are
actually in the file, but that's not important in this context. It's
the concept which is the key.
Uh?!


No, it means no information is lost. It's all really semantics...

Well, no information is lost.

You have a set A of data, and you transform it to a set B of data by
applying an operation O. If there exists an operation O' that can take B
as input and give back A, then O was lossless/reversible.

I still can't understand *where* you disagree with this concept.

(My noise addition operation is obviously only lossless when the
histogram of the original image "A" is considered part of the output
"B", but I've made that clear multiple times. This is the only caveat I
can see, though.)
The problem is that's impossible because the number of pixels in an
image is fixed.

That's why I said "even if you could". What I wanted to say is that the
point is moot, since under your definition it wouldn't be lossless *even
if* this could be done.
You can't! That was exactly my point.

But I can. Just run the program. You'll see that you'll get back your
original input every time, barring bugs (i.e. "maintain reversibility"),
and yet noise will have been added in the output.
That's different. Now you're getting into what JPG or MP3 do, i.e.
rank information and remove marginal data. But that is lossy by
definition.

Yes, it is lossy by definition.
But no, that's not what I'm doing.

I'm not removing marginal data; I'm removing *no* data.

Answer this please: if you take an image.tif, then compress it to an
image.jpeg, then convert image.jpeg to image2.tif, will
image.tif=image2.tif ever hold true?

I bet it never will. Precisely because JPEG is lossy.

On the other hand, if you take an image.tif, apply noise with my program
and get an image_noise.tif, then use my program to remove the noise and
get an image_denoised.tif, it will be true that
image.tif=image_denoised.tif (though, again, you must feed my program
the histogram of image.tif as well).
[snip]

Now, if there is perceptible posterization, then again don't get
fixated by the gaps (or the histogram for that matter) but focus on
how best to remove this posterization. Given the right process the
histogram will fix itself. You use the histogram, of course, to check
the process but don't let the histogram drive your process if the
result of that is neglecting the goal.

This may be sound advice, but it's not the point.
I'm not trying to demonstrate that my program is the best way to remove
posterization, but only that it performs a lossless operation.
One example of this is the application of random noise which, as you
yourself say, looks "terrible" but it "fixes" the histogram. Well,
that's a case of "throwing out the baby with the bath water" as we
say. Or "The operation was a success but the patient died"! ;o)

I think the program can be improved to make the image look less
terrible, hopefully to a point where the result will look better than
the original posterization.

But, for the moment, I realize perfectly that the patient died; I still
think I can show that the operation was a success.
The reason I said "corrupt" is because the data you are trying to
recover not only exists but you can obtain it easily (simply use
16-bit depth and then reduce).

The data does not exist *in the input*! Obviously, I'm talking about the
input to my noise adder, that is the original scan.

If the scan was made in 8-bit, then the data does not exist in it.

If you scan at 16-bit, it's all different. And, granted, scanning at
16-bit is the best way to go in many cases.
However, you chose not to do that (for whatever reasons) and are now
trying to recover this missing data from an 8-bit image. That data
will never be as genuine as the original data in a 16-bit file.

Of course! But as you say, "for whatever reasons" we have an 8-bit file.
Clearly, all the arguments about reversibility and lossless operations
must be about *that* file.

Otherwise, it's like saying, "no, LZW isn't lossless because you could
have gotten a more genuine image using a better scanner [or scanning in
16-bit]". Half of this sentence is true, but the other half doesn't make
any sense. Guess which is which?
Therefore, anything you do to the 8-bit file is "corruption" in regard
to the actual 16-bit data you're trying to recreate.

"Recreate"? I've never claimed I can "recreate" useful 16-bit channels
from a single scan of 8-bit channels.

(That is, of course, unless the scanner *really* only outputs 8
meaningful bits; but you said that even a low-end scanner has meaningful
data beyond the 8th bit, and I'll take your word for it)

In regard to the actual 16-bit data, *the* 8-bit file itself is
corruption! But that wasn't the point of the current discussion, which
was about "information", "corruption" and "lossless/reversible
operations" relative to *the data you have*.
If you can get better data by sampling the target picture again, that's
another matter.
But we do! Simply use 16-bit depth and reduce. That way you will get
exactly the data you are missing in the 8-bit domain.

Sure. Please don't get the impression that I'm rejecting this possibility.
I currently scan at 8-bit mainly because of file size and bus speed, but
I'm absolutely not trying to claim that scanning at 16-bit is worthless.

The case is simply: you have a posterized image, and you can't or won't
scan it again at a higher bit depth. What can be said about such an
image? What reversible operations can be applied on it? Is artificial
noise perceptually better than posterization? etc.
[snip]
Last example: take an image that is all black on the left and all white
on the right, with no grays - again like your own example.
If you know that this is not a faithful reproduction of the original
picture, but rather a result of posterization/quantization, then you may
suppose that the original picture could have been a continuous shade of
gray... or something else, you can't really know.

Actually you do know that there is a threshold which made one side
black and the other side white. What you don't know is how solid each
side is and how wide the transition.

Yes. By "something else" I didn't mean "anything else"... but there is a
wide range of possibilities; all involve a transition, sure.
But you haven't gained anything either. You ended up adding "grain" to
an otherwise "clean" image.

Exactly! I've gained nothing, as far as information theory is concerned.
But, let me stress once again that I've also lost nothing.

So, what's the purpose in adding grain to a clean (but posterized) image?

Well, if there is a purpose, it is perceptual. I think the human eye
simply likes noise more than posterization; after all, the concept of
quantization is probably foreign to our brain, while noise, on the
other hand, is ever-present.

So, if we have an excessively low-bit-depth image (which is essentially
what a posterized image is), I think our eyes prefer it displayed with
noise in the (otherwise empty) lower bits than with the lower bits at
zero.

This is essentially the whole point of my test program. It fills the
lower bits with random data, when these lower bits would otherwise be
consistently at zero.
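For the simple case of uniformly spaced gaps, "filling the lower bits" is literal bit masking, and the inverse is just masking the noise back out. A toy sketch with hypothetical helper names:

```python
import random

LOW = 5                                  # number of empty low bits

def fill_low_bits(v, rng):
    """Fill the empty low bits of a posterized sample with noise."""
    assert v & ((1 << LOW) - 1) == 0     # low bits really are zero
    return v | rng.randrange(1 << LOW)

def clear_low_bits(v):
    """Inverse: mask the noise back out."""
    return v & ~((1 << LOW) - 1)

rng = random.Random(1)
posterized = [0, 32, 64, 224]            # 8-bit values, 3 significant bits
noised = [fill_low_bits(v, rng) for v in posterized]
assert [clear_low_bits(v) for v in noised] == posterized
```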

(In reality, it's a little more complicated than just "filling in the
lower bits", as the gaps in the histogram might not be uniformly
spaced, in which case the situation can't be precisely compared with a
low bit depth condition; but the overall concept keeps working.)
[snip]

In general, I would always focus on solving the root problem rather
than trying to fix the consequences afterwards. In my experience, when
you try to fix the consequences it only generates more problems than it
solves. Usually you then have to "fix the fix", and so on. It just
never ends...

Amen!
But this doesn't always make it useless to discuss "fixing the
consequences". If you can, you solve the root problem, but if for some
reason you can't, it can be helpful.
You know, you can't just solve everything with "go take a better scan".
The software tricks are useful sometimes.

by LjL
(e-mail address removed)
 