how to scan + convert to text+pictures (bitmap+greyscale)

  • Thread starter: Gerd

Gerd

I'm looking for a solution to the following problem: I want to scan
big documents that contain text and some pictures and save the result
as one .pdf file.

Until now I have scanned everything at 600 dpi in bitmap mode, saved
each page as one .tif file and afterwards printed all .tif files to
Acrobat Distiller. The result is one relatively small .pdf file with
only 30 to 50 kB per scanned page. Scanning in greyscale mode at a
lower resolution leads to much bigger .pdf files with no gain in
quality, as long as the document contains only text.
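
A rough calculation shows why greyscale scans come out bigger even at
a lower resolution. The sketch below assumes A4 pages and 8-bit
greyscale (neither is stated above) and compares uncompressed sizes
only; the helper name raw_page_bytes is made up for illustration.

```python
# Raw (uncompressed) size of one scanned page.
A4_INCHES = (8.27, 11.69)  # assumption: the documents are A4


def raw_page_bytes(dpi: int, bits_per_pixel: int, size_in=A4_INCHES) -> int:
    """Uncompressed size in bytes of one page scanned at the given dpi."""
    width_px = round(size_in[0] * dpi)
    height_px = round(size_in[1] * dpi)
    return width_px * height_px * bits_per_pixel // 8


bitmap_600 = raw_page_bytes(600, 1)  # 1 bit per pixel, black and white
grey_300 = raw_page_bytes(300, 8)    # assumption: 8 bits per pixel greyscale

print(f"600 dpi bitmap:    {bitmap_600 / 1e6:.1f} MB raw")
print(f"300 dpi greyscale: {grey_300 / 1e6:.1f} MB raw")
print(f"ratio: {grey_300 / bitmap_600:.1f}x")
```

Under these assumptions the greyscale page is about twice as large as
the bitmap page before any compression, at half the resolution. Pure
black-and-white text also tends to compress far better than greyscale,
which fits the 30 to 50 kB per page mentioned above.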

The problem with this is that the quality of the pictures in the
document is very low. Very often they are too dark, but if I increase
the brightness during the scan, the quality of the text suffers.

There are scan programs, like Precision Pro from HP, that can
distinguish between text and pictures during the scan process. They
can save the scanned pages directly to a .pdf-file and render text
areas as bitmap with high resolution and pictures as greyscale with
low resolution during one scan process (and they can do OCR, too).
Unfortunately the driver for my old high-speed Fujitsu scanner has no
such function.

All I can do is scan every page entirely in greyscale. But then I
need software that can convert the resulting .tif files to a
.pdf file while automatically converting text areas to high-resolution
bitmap and picture areas to low-resolution greyscale (otherwise the
resulting .pdf file would be huge). I tried Finereader 6 and Omnipage
Pro 10 (although I don't need OCR), but both tried to recognize text
that is part of the pictures, and thus destroyed the pictures. To
prevent this I would have to select the picture areas manually, which
is too much work for big documents.

So I'm looking for software that can reliably distinguish between
text and (obviously darker) picture areas and convert both into one
.pdf file.
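
To make the requirement concrete, here is a minimal sketch of the kind
of per-area decision such software would have to make. It is not how
any of the programs named here work. It assumes the page has already
been cut into rectangular tiles of greyscale values, and it uses a
simple heuristic: text tiles are mostly near-white and near-black,
while picture tiles contain many midtones. All function names and
threshold values are hypothetical.

```python
from typing import List, Tuple

Tile = List[List[int]]  # rows of greyscale values, 0 = black, 255 = white


def midtone_fraction(tile: Tile, low: int = 64, high: int = 192) -> float:
    """Share of pixels that are neither near-black nor near-white."""
    pixels = [p for row in tile for p in row]
    return sum(low <= p <= high for p in pixels) / len(pixels)


def classify(tile: Tile, picture_threshold: float = 0.25) -> str:
    """Label a tile 'picture' if it has many midtones, otherwise 'text'."""
    return "picture" if midtone_fraction(tile) > picture_threshold else "text"


def to_bitmap(tile: Tile, cutoff: int = 128) -> Tile:
    """Text tile: keep full resolution, reduce each pixel to 0 or 1."""
    return [[1 if p >= cutoff else 0 for p in row] for row in tile]


def halve_resolution(tile: Tile) -> Tile:
    """Picture tile: stay greyscale, average each 2x2 block into one pixel."""
    out = []
    for y in range(0, len(tile) - 1, 2):
        row = []
        for x in range(0, len(tile[y]) - 1, 2):
            block = (tile[y][x] + tile[y][x + 1]
                     + tile[y + 1][x] + tile[y + 1][x + 1])
            row.append(block // 4)
        out.append(row)
    return out


def convert(tile: Tile) -> Tuple[str, Tile]:
    """Classify a tile and return it in the matching reduced form."""
    kind = classify(tile)
    reduced = halve_resolution(tile) if kind == "picture" else to_bitmap(tile)
    return kind, reduced


# Tiny synthetic tiles, for illustration only.
text_tile = [[255, 255, 0, 255],
             [255, 0, 0, 255],
             [255, 255, 0, 255],
             [255, 255, 255, 255]]
picture_tile = [[90, 100, 110, 120],
                [95, 105, 115, 125],
                [100, 110, 120, 130],
                [105, 115, 125, 135]]

print(convert(text_tile)[0])
print(convert(picture_tile)[0])
```

On real scans the tile size and both thresholds would need tuning, and
a tile that straddles text and picture will be misjudged one way or
the other. That is why reliable automatic zoning is the hard part.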

Any ideas? Thank you!

Greetings
Chris
 

I don't think you will find any shortcuts. You are going to have to "bite
the bullet" and either scan each picture as a separate image and recombine
later, or use Omnipage.

You have to manually mark the pictures as graphics zones and the text to
be OCR'd as text zones.

Omnipage Pro 14 is improved, but I do not know how well its automatic
selection process works.

Omnipage Pro 14
http://www.scansoft.com/omnipage/
The best price for OCR software is found at:
http://www.scantips.com/

ABBYY PDF Transformer or ABBYY Fine Reader.
http://www.abbyy.com/
 