a few questions from a newbie


theKman

Hi all,

I was wondering if it's possible to do the following:

OCR an image file (e.g. jpg, pdf) with English text and many small
images & foreign letters (e.g. Chinese)? What will happen to the
document? Will it OCR the English text and leave the images/foreign
letters as they are in place, like the original document?

If this is possible, what software & hardware do I need?

Thanks in advance.


Kas
 
You can OCR an image file if the image is at around 300-600 DPI. OCR does not
work well below 300 DPI. Image types that typically can be read by OCR
programs are BMP, TIFF, JPG, PCX, PNG and GIF.
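
As a rough way to check a file before feeding it to an OCR program, here is a minimal sketch in Python. The function names are made up for illustration, the format list is just the one given above, and the 300 DPI floor is the rule of thumb from this post rather than a limit of any particular product. PDF is left out because it is handled separately (see the OmniPage help text below).

```python
from pathlib import Path

# Formats named in the post above; a given OCR package may accept more or fewer.
OCR_FORMATS = {".bmp", ".tif", ".tiff", ".jpg", ".jpeg", ".pcx", ".png", ".gif"}
MIN_DPI = 300  # rule-of-thumb floor from the post, not a hard product limit


def effective_dpi(width_px: int, page_width_in: float) -> float:
    """Resolution implied by the image's pixel width and the paper's width in inches."""
    return width_px / page_width_in


def ocr_ready(filename: str, width_px: int, page_width_in: float) -> bool:
    """True if the extension is in the list and the scan is at least MIN_DPI."""
    suffix = Path(filename).suffix.lower()
    return suffix in OCR_FORMATS and effective_dpi(width_px, page_width_in) >= MIN_DPI
```

For example, a letter-size page (8.5 inches wide) scanned to an image 2550 pixels wide works out to 300 DPI and passes, while the same page at 1275 pixels wide is only 150 DPI and does not.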

From OmniPage 12 help:
Portable Document Format (PDF): A document format widely used in web pages
and for displaying documents. OmniPage Pro can open PDF files and create an
editable version of their displayed texts. It can also save recognition
results to five variants of PDF files: viewable only, viewable with image
replacements of uncertain characters, viewable and searchable, viewable and
editable, and edited.

For the language, OCR usually handles one language at a time. Anything that
cannot be converted to text can be saved as a graphic, and yes, those
graphics will be saved in place.
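
To picture what "saved in place" means, here is a small Python sketch of the idea, not how OmniPage or any other product actually works internally. The page is treated as a list of zones, each marked as text or graphic; only the text zones go through recognition and the graphic zones pass through untouched at the same position. `Zone`, `OutputItem`, `process_page` and `fake_recognize` are all hypothetical names, and `fake_recognize` is only a stand-in for a real OCR engine.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # left, top, right, bottom in pixels


@dataclass
class Zone:
    box: Box
    kind: str  # "text" or "graphic"


@dataclass
class OutputItem:
    box: Box
    kind: str
    text: str = ""  # stays empty for graphics


def process_page(zones: List[Zone], recognize: Callable[[Box], str]) -> List[OutputItem]:
    """Recognize text zones only; keep graphic zones as-is at the same position."""
    items = []
    for zone in zones:
        if zone.kind == "text":
            items.append(OutputItem(zone.box, "text", recognize(zone.box)))
        else:
            items.append(OutputItem(zone.box, "graphic"))
    return items


def fake_recognize(box: Box) -> str:
    # Stand-in for a real OCR engine call; it just returns a placeholder string.
    return f"<text from {box}>"


page = [
    Zone((0, 0, 2000, 400), "text"),       # English paragraph
    Zone((0, 450, 600, 900), "graphic"),   # small image or Chinese characters
    Zone((650, 450, 2000, 900), "text"),   # more English text
]
result = process_page(page, fake_recognize)
```

The middle zone comes out as a graphic with the same box it went in with, which is the "left in place" behaviour asked about in the original question.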

For equipment, any scanner (flatbed) that can scan the document at 300-600
DPI in color, grayscale or Black and White (bitmap) will work fine.

OCR software varies from poor to very good.
Among the best are OmniPage 14 and Abbyy Fine Reader.

OmniPage:
http://www.scansoft.com/omnipage/

Abbyy Fine Reader:
http://www.abbyy.com/

The best price for OmniPage, along with the best information on scanning on
the net, is found here:
http://www.scantips.com
 

Just a small point: the "fully automatic" mode of the OCR software
may not work very well, but if you spend a bit of time designating
which blocks are text and which are graphics, it will indeed do a fine
job.
Charlie Hoffpauir
http://freepages.genealogy.rootsweb.com/~charlieh/
 