line-mode like scanning for e.g. 4 colors

  • Thread starter: Jakob Kellner

Jakob Kellner

Hi!

I use a Fujitsu Fi 4220C2 document scanner and I am generally very happy
with it. I mainly scan handwritten pages or handwritten corrections on
printouts. Lineart is very nice/fast, and tiff g4 results in small
(100KByte/page) and clear images.

Problem: The handwritten corrections on the black printouts are usually
red, and it would be nice if they could be scanned red as well (then the
marks are easier to spot). However, I am not happy with a standard color
scan for three reasons:

1. lineart mode seems to give better ("sharper") results than color or
greyscale.
2. tiff g4 is small and has good quality (lossless?). If I use e.g. jpg
I get blurred images, if I do not compress the images are ridiculously
huge.
3. the scan is much slower.

I suppose there is nothing I can do about 3, but I want to ask:

1. Is there a software/a commonly used algorithm (preferably running on
GNU/Linux, e.g. gimp, convert or nconvert) that reduces the colors to
e.g. 4 or 8, but _without_ "simulating shades of colors" by mixing
colors? It should just pick a palette of 4 or 8 colors including black
and white and assign to every pixel the most appropriate color. So once
again: if the palette consists of red, green, black and white, and a
part of the image is dark red, then it should not result in a red/black
pattern that tries to simulate dark red, but instead in either plain red
or plain black (according to darkness). Sorry for the clumsy
formulation. In the ideal case this program would, for 2 colors,
transform a greyscale scan exactly into the corresponding lineart scan.
(Or does the scanner in lineart mode optimize the data in a way that
simply cannot be reconstructed any more from color or greyscale data?)

2. Which format would be best for this purpose? (Something like tiff g4
with 8 colors would be perfect. It would also be nice if it is
widespread, so that there is hope it can still be used in 30 years.)

Thanks for the help,
Jakob
 
In general, indexed color tools work like this:

1. You choose a color palette of 16 or 256 colors. This choice might
be a set of "standard" (equally spaced) colors (the same standard color
palette is used for every image, regardless of image content).
Or the choice might be an "associative" palette (some programs call it
an adaptive or optimized palette): custom colors automatically selected
to closely match the colors in your specific image. For example, if your
image were nothing but a gradient of red, the entire image would contain
only shades of red, pink or white. An associative 16 color palette will
be automatically constructed to contain the 16 most representative
colors in your specific image, all red or pink or white in this case,
and no other colors. But any standard palette will contain some red,
some green and some blue colors too, suitable for any general image, but
not really best for any.

So in this example case (the red gradient), the associative palette has
maybe 12 shades of red, plus some white. But the standard 16 color
palette may have say 4 shades of red (and 4 of green and 4 of blue and 4
for white/black, etc). The standard palette is poor in this case (the
red gradient), but generally better for a random general image
containing more colors (still unknown to us at this point).

This standard/associative choice is commonly offered in editor programs.
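
To make the two palette choices concrete, here is a minimal Python
sketch. It is only an illustration: the function names are made up for
this example, and "most frequent colors" is a crude stand-in for what
real editors do when they build an image-specific palette (they usually
use something smarter).

```python
from collections import Counter


def uniform_palette(levels):
    """Fixed "standard" palette: `levels` equally spaced steps per channel.

    The result does not depend on any image; levels=2 gives 8 colors.
    """
    values = [round(i * 255 / (levels - 1)) for i in range(levels)]
    return [(r, g, b) for r in values for g in values for b in values]


def popularity_palette(pixels, size):
    """Image-specific palette: the `size` most frequent colors in `pixels`.

    `pixels` is a list of (r, g, b) tuples. This is a simple popularity
    heuristic, chosen for brevity, not what any particular editor uses.
    """
    counts = Counter(pixels)
    return [color for color, _ in counts.most_common(size)]


# A tiny "image" that is almost entirely shades of red.
pixels = ([(255, 0, 0)] * 5 + [(200, 0, 0)] * 3
          + [(255, 255, 255)] * 2 + [(0, 0, 255)])

print(popularity_palette(pixels, 2))  # only colors that occur in the image
print(len(uniform_palette(2)))        # same 8 colors for every image
```

For the red gradient case, the image-specific palette spends all its
entries on reds, while the fixed one spends most of them on colors the
image never uses.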

2. You also choose a method to represent colors not included in the
selected palette. One choice is "dithering", which combines randomized
dots of a couple of existing palette colors to attempt to better
simulate a color not actually in this limited palette (16 or 256
colors). Or you select "no dithering" (also often called Nearest Color),
where the single nearest color actually in the palette is used. This
nearest color may not be very close or accurate, but it won't have the
speckled dots of dithering, which often matters more than color
accuracy, especially for graphics.

This "dithering or not" choice is commonly offered in editor programs.
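
The "no dithering" / nearest color choice can be sketched in a few
lines of Python. Treat this as an illustration under assumptions: the
distance used here is plain squared distance in RGB, real programs may
measure "nearest" differently, and the palette and function names are
invented for the example.

```python
def nearest_color(pixel, palette):
    """Return the palette entry with the smallest squared RGB distance."""
    return min(
        palette,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(pixel, c)),
    )


def remap(pixels, palette):
    """Map every pixel to its nearest palette color, with no dithering."""
    return [nearest_color(p, palette) for p in pixels]


# Example palette from the question: black, white, red, green.
PALETTE = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 255, 0)]

# Two dark reds, a dirty white and a dark grey.
scan = [(139, 0, 0), (100, 0, 0), (240, 235, 230), (60, 60, 60)]
print(remap(scan, PALETTE))
# [(255, 0, 0), (0, 0, 0), (255, 255, 255), (0, 0, 0)]
```

Every output pixel is exactly one palette color, so a dark red becomes
either plain red or plain black depending on how dark it is. With a
palette of only black and white this amounts to a fixed 50% threshold
on a greyscale scan; the scanner's own lineart mode may well use a
different threshold or extra processing, so the two results need not
match exactly.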

I can't speak for Gimp, but its manual does not describe much in the way
of choice for indexed color. It does say it uses dithering for colors
not in the palette, without mentioning any option to turn it off.

Your limited case of only white, black and red probably can use a
standard or associative palette without great difference (any will have
white and black and red), but you are saying that you don't want to use
dithering (and I agree with you).

Your actual scan probably contains some darker red and lighter red, some
darker black and less dark gray, and some pure white and some dirtier
white, and you really don't want to use dithering dots to simulate those
lighter shades. You'd surely prefer to use the nearest color in the
palette, without dithering. Much cleaner appearance, even if the actual
shade of red is a bit off in color accuracy (who will ever know later?)

These are common choices, so it is just a matter of continuing to look
for editor software that supports what you want. The above choices are
very common (in programs with much support for indexed color). Windows
programs like Elements or Paint Shop Pro are pretty good in this
respect.

File size:

Line art is 1 bit color, which is 8 image pixels per byte.
16 color indexed is 4 bits per pixel (4 times larger than line art).
256 color indexed is 8 bits per pixel (8 times larger than line art).
24 bit RGB is 24 bits per pixel (24 times larger than line art).
It just is, that is how large the data is.
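
The arithmetic above can be checked with a short Python sketch. The
page size is an assumption made only for this example (an A4 page at
300 dpi, taken as 2480 x 3508 pixels), and the function name is made
up; rows are assumed to be padded to whole bytes.

```python
def uncompressed_bytes(width, height, bits_per_pixel):
    """Raw image data size in bytes, with each row padded to a whole byte."""
    bytes_per_row = (width * bits_per_pixel + 7) // 8
    return bytes_per_row * height


# Assumed page size: A4 at 300 dpi (not taken from the thread).
WIDTH, HEIGHT = 2480, 3508

line_art = uncompressed_bytes(WIDTH, HEIGHT, 1)
for name, bits in [("line art", 1), ("16 color", 4),
                   ("256 color", 8), ("24 bit RGB", 24)]:
    size = uncompressed_bytes(WIDTH, HEIGHT, bits)
    print(f"{name:10s} {size:>11,d} bytes  ({size // line_art}x line art)")
```

For this page size the line art data comes to 1,087,480 bytes before
compression and the 24 bit RGB data to 26,099,520 bytes, which is the
24x ratio listed above. A 4 color image would in principle need only 2
bits per pixel, i.e. twice the line art size.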

File compression:

G4 is extremely efficient (a G4 file is typically smaller than 1/10 the
size of the uncompressed data), but G3 and G4 are only used for line art.

Indexed color will typically use LZW compression, which is very good,
but less efficient. TIF with LZW is appropriate for indexed color, and
GIF automatically uses LZW too. LZW won't be anywhere near the 1/10 size
of G4, but it will be a lot smaller than 100% size.
 
In general, indexed color tools work like this [...]
File compression: [...]

Thanks a lot for the very detailed answer!

(Btw, gimp actually has a "no dithering" option, as Joal Heagney in the
comp.graphics.apps.gimp kindly pointed out. So I am going to experiment
with that.)
 
Jakob said:
1. Is there a software/a commonly used algorithm (preferably running on
GNU/Linux, e.g. gimp, convert or nconvert) that reduces the colors to
e.g. 4 or 8 (but _without_ "simulating shades of colors" by mixing
colors) [...]

It's been a while since I used it, and I don't know if it runs under
Linux, but Paint Shop Pro had a colour reducing algorithm built in which
supported a 4 colour palette (as well as 2, 16 and 256). The option
included a "closest colour" or "dither" selection, with the algorithm
building the colours in the palette from the colours used in the image.

Might be worth a look.
 
Jakob said:
1. Is there a software/a commonly used algorithm (preferably running on
GNU/Linux, e.g. gimp, convert or nconvert) that reduces the colors to
e.g. 4 or 8 (but _without_ "simulating shades of colors" by mixing
colors) [...]

See "pnmremap" in the netpbm toolkit.

John
 