Which OCR package for text scanning is the best?

  • Thread starter: McGrath

McGrath

I'm looking for a good OCR scanning package, either bundled with a
scanner or stand-alone. I'd appreciate any recommendations or hints on
which one I should choose. I'm familiar with OmniPage 14 basic OCR, but
someone said that Readiris is better.
TIA for any info.
 
I'm looking for a good OCR scanning package, either bundled with a
scanner or stand-alone. I'd appreciate any recommendations or hints on
which one I should choose. I'm familiar with OmniPage 14 basic OCR, but
someone said that Readiris is better.
TIA for any info.

Omnipage basic is just not that good. I had Omnipage 12 (the full ver)
and later upgraded to Omnipage 14. I decided not to upgrade that
version since it seems to do a good job with whatever I throw at it.
I've also heard that ABBYY FineReader is very good, but I don't see a
need to own more than one, so I haven't tried it. But I only use OCR
occasionally.
 
Charlie Hoffpauir said:
Omnipage basic is just not that good. I had Omnipage 12 (the full ver)
and later upgraded to Omnipage 14. I decided not to upgrade that
version since it seems to do a good job with whatever I throw at it.
I've also heard that ABBYY FineReader is very good, but I don't see a
need to own more than one, so I haven't tried it. But I only use OCR
occasionally.

Thank you for sharing your experience.
I agree as far as OmniPage is concerned: it's not good at all.
I'll search for any information on Readiris. I used a very
early version of it years ago, and it was quite good and
accurate then.

Thanks!
 
I've used both OmniPage and ABBYY. ABBYY is simpler and has done
everything I've thrown at it so far.

I also like the Nuance PDF app which will convert scans to Word, etc.
 
I have used Omnipage Pro and ABBYY's FineReader, and neither one
of them is worth a damn. Maybe it's because I rarely use either one
of them, and I'm doing something wrong, but every time I have tried to
scan a document, it never scans and recognizes it properly. I have to
correct so many errors in the text, that I might as well have typed it
myself.
If it has charts or pictures with the text, forget it, I end up
with a total mess. I finally just gave up on OCR software, and no
longer use it.
Oh, and it's not the scanner either, since I have used both OCR
programs with three different scanners over the years....my latest
scanner is a Canon 9950F.

Talker


Hello, Talker.

I have used Omnipage Pro 15 with good success. A good original, scanned
in grey scale at 300 dpi, works best for text. (I would only use the Pro
version; the bundled version is not up to par.)

Forget about scanning in black and white; getting the threshold right is
more trouble than it is worth.


If you have pictures in the copy, that is a different story. Although it
is not that hard to do. You set your zones in Omnipage to tell Omnipage
what is picture and what is text.

No OCR is 100% accurate. 90% is doable.
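
To illustrate why the black-and-white threshold mentioned above is so touchy, here is a minimal sketch in Python. It is not how OmniPage or any scanner driver works internally; the `binarize` helper and the pixel values are made up for illustration. Each greyscale pixel (0 = black, 255 = white) becomes either ink or paper depending on one cutoff value, so faded ink is kept or lost depending on where the cutoff sits.

```python
def binarize(gray_rows, threshold):
    """Turn greyscale pixels into 1 (ink) or 0 (paper) using a single cutoff."""
    return [[1 if px < threshold else 0 for px in row] for row in gray_rows]


# Hypothetical strip of six pixels: dark ink, faded ink, then paper.
strip = [[30, 90, 140, 170, 210, 240]]

for t in (100, 150, 200):
    # A low cutoff drops the faded ink; a high cutoff starts keeping smudges.
    print(t, binarize(strip, t)[0])
```

A greyscale scan keeps all of these values and leaves the decision to the OCR program, which is one reason it tends to be the safer choice.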
 
As to ABBYY vs. OmniPage: I have used both and prefer ABBYY. Why? ABBYY is
simpler to use and is accurate enough for me. When I import into
MS-Word, misspelled words are underlined and I can fix them.
 
[snip]
No OCR is 100% accurate. 90% is doable.

Just curious what exactly you mean by 90% ?

1) 90% of the pages are flawless? (and 10% contain one or more mistakes?)
2) 90% of each line is correct? (and 10% (appr. 5 to 8 characters) PER LINE
are wrong?)
3) something different?

The first option might be workable, but the second for sure is not.

Cheers,

Edward
 
Talker said:
I have used Omnipage Pro and ABBYY's FineReader, and neither one
of them is worth a damn. Maybe it's because I rarely use either one
of them, and I'm doing something wrong, but every time I have tried to
scan a document, it never scans and recognizes it properly. I have to
correct so many errors in the text, that I might as well have typed it
myself.
If it has charts or pictures with the text, forget it, I end up
with a total mess. I finally just gave up on OCR software, and no
longer use it.
Oh, and it's not the scanner either, since I have used both OCR
programs with three different scanners over the years....my latest
scanner is a Canon 9950F.

Talker

Does Canon 9950F have Windows 7 Drivers ready?
And does any OCR come with Canon 9950F ?
 
[snip]
No OCR is 100% accurate. 90% is doable.

Just curious what exactly you mean by 90% ?

1) 90% of the pages are flawless? (and 10% contain one or more mistakes?)
2) 90% of each line is correct? (and 10% (appr. 5 to 8 characters) PER LINE
are wrong?)
3) something different?

The first option might be workable, but the second for sure is not.

Cheers,

Edward

I mean that at least 90% of the text will be correct. 10% or less will
require spell checking and some words may be completely garbled.

That is based on a one-page OCR conversion of mixed pictures and text.

Works best if you do a manual zoning, to be sure you have the text zoned
and the graphics zoned.

PDF Image on text works pretty well: you get the image that is the exact
copy of your sheet, and the OCR result for searching. You only miss the
words that don't OCR well.

I have seen a 100% OCR result on a really clear original copy of just
text.
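
One way to pin down what a figure like 90% means is to count wrong characters against a hand-checked copy of the page. The sketch below is only an illustration in Python, not something any of the OCR packages discussed here report; the function names and the sample strings are made up. It uses the Levenshtein edit distance (the fewest single-character insertions, deletions, or substitutions) divided by the length of the correct text.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_accuracy(truth: str, ocr: str) -> float:
    """Fraction of the correct text that survived OCR, by character."""
    if not truth:
        return 1.0 if not ocr else 0.0
    return max(0.0, 1.0 - edit_distance(truth, ocr) / len(truth))


# Hypothetical example: three typical OCR confusions in a 24-character line.
truth = "No OCR is 100% accurate."
ocr = "No 0CR is l00% accurale."
print(f"{char_accuracy(truth, ocr):.1%}")  # prints 87.5%
```

By this measure, 90% character accuracy on a 60-character line means about 6 wrong characters per line, which is the second reading in Edward's question.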
 
Interesting ... no, I have found no spelling errors in the dictionary.
How would you correct them if you knew an obvious error or accidentally
added a misspelled word to the dictionary?
 
Interesting ... no, I have found no spelling errors in the dictionary.
How would you correct them if you knew an obvious error or
accidentally added a misspelled word to the dictionary?

MS word has a custom Dictionary which you can edit in MS word.
It is found at Tools > Options, on the Spelling & Grammar tab.
Click the Dictionaries button to get to Edit.
 
(e-mail address removed) wrote in
[snip]
No OCR is 100% accurate. 90% is doable.

Just curious what exactly you mean by 90% ?

1) 90% of the pages are flawless? (and 10% contain one or more mistakes?)
2) 90% of each line is correct? (and 10% (appr. 5 to 8 characters) PER LINE
are wrong?)
3) something different?

The first option might be workable, but the second for sure is not.

Cheers,

Edward

Different people define OCR accuracy in different ways. I prefer to go
by characters misrecognised.

I am amazed that anyone quotes 90%. They must have a truly useless
scanner working on something they found in a muddy puddle.

I used 90% because OCR is not 99.9% accurate 100% of the time.
90% is doable all the time. 99.9% is often true.

My scanner is a Canon Canoscan 8400F, it is a very good scanner.
It did not come in a muddy puddle.

If you have used Omnipage Pro 12, you know what I mean about the 90%
accuracy.

Omnipage Pro 15 is far more accurate than Omnipage Pro 12. The current
Omnipage is now Pro 17.
 
To digress ...

One of my real-life misspell examples is the word "Cancelled" which
you will see spelled "Canceled" with only one L. Most often found when
we have bad weather and TV shows flights that have been "Cancelled" by
one airline and "Canceled" by another. Frankly, I'm confused and should
consult my Merriam-Webster to get an answer, but the suspense is more
fun than knowing which is correct.
 
To digress ...

One of my real-life misspell examples is the word "Cancelled" which
you will see spelled "Canceled" with only one L. Most often found when
we have bad weather and TV shows flights that have been "Cancelled" by
one airline and "Canceled" by another. Frankly, I'm confused and
should consult my Merriam-Webster to get an answer, but the suspense
is more fun than knowing which is correct.

According to a test of the two spellings in Microsoft Word 2000, both
spellings are correct.

Neither word is flagged as misspelled.

However, my American Heritage dictionary shows that "canceled" is the
correct spelling.

My Random House College Dictionary shows both spellings as correct.

So I guess it is a matter of choice.
 
According to a test of the two spellings in Microsoft Word 2000, both
spellings are correct.

Neither word is flagged as misspelled.

However, my American Heritage dictionary shows that "canceled" is the
correct spelling.

My Random House College Dictionary shows both spellings as correct.

So I guess it is a matter of choice.

Gee whiz! Is nothing sacred anymore? Two correct spellings?

Thanks.
 
CSM1 said:
According to a test of the two spellings in Microsoft Word 2000, both
spellings are correct.

Neither word is flagged as misspelled.

However, my American Heritage dictionary shows that "canceled" is the
correct spelling.

My Random House College Dictionary shows both spellings as correct.

So I guess it is a matter of choice.

Or which side of the Atlantic you are based on.

How do you spell Jewellery / Jewelry / Jewelery?

The first is standard UK English spelling, but I've seen both the others
used around the place...

Mike
--
Michael J Davis

<><
"Just the place for a Snark!" the Bellman cried,
As he landed his crew with care;
Supporting each man on the top of the tide
By a finger entwined in his hair.

"Just the place for a Snark! I have said it twice:
That alone should encourage the crew.
Just the place for a Snark! I have said it thrice:
What I tell you three times is true."
<><
 
I don't know if there are drivers for the 9950F for Windows 7. I
am using Windows XP.
The 9950F came with Abbyy's Finereader OCR, but like I mentioned
above, it's pretty much worthless, as is Omnipage Pro.
I have used both of these OCRs over the years, and I admit that I
used them very infrequently, but when I did, the results were so bad
that they were useless. When the page had a picture or chart on it,
the OCR software gave a result that was unrecognizable. The picture
or charts were incomprehensible, and the text had so many errors, that
it wasn't worth it to correct them. It was quicker to just type the
text myself.
I'd like to sit with someone and see how any brand of OCR works
for them, then show them how it works for me, then compare notes.
I got so disgusted with them that I uninstalled them from my
computer. I don't even know where the install CDs are anymore, and I
don't care. They were a total waste of money.

Talker

Talker,

I find a LOT of difference depending on how I use the program (mostly,
I'm using Omnipage Pro v 14, but I've used various versions over the
last 10 years). If you just use the "default" setting, results depend
a lot on what you throw at it. A scanned typewritten page, for
example, will OCR with very few errors.... usually none. Scanned
newsprint can also do well, if it's not old and wrinkled. If there are
any images included, however, you need to depart from the default
setting and actually define the zones that are text and the zones that
are images. If you take the time to do that, results can be very good.

In other words, OCR is not an automatic process. It takes a lot of
human intervention to get a good job.

I'd be willing to do a few tests with you. Just send me a scanned page
and I'll run it through the OCR process, and let you see how it comes
out. If interested, send the image file to charliehoffp at yahoo dot
com. Make the file an uncompressed tif, scanned at 300 ppi. Grayscale
probably best.... I'll adjust that if needed and send a copy of the
adjusted image back to you.
 
Or which side of the Atlantic you are based on.

How do you spell Jewellery / Jewelry / Jewelery?

The first is standard UK English spelling, but I've seen both the others
used around the place...

Mike

I am in Texas, USA.

American Heritage Talking Dictionary shows:

jewelry
Ornaments, such as bracelets, necklaces, or rings, made of precious
metals set with gems or imitation gems.

---------------------------------------------------------
Excerpted from American Heritage Talking Dictionary
Copyright © 1997 The Learning Company, Inc. All Rights Reserved.

jeweler and jeweller
One that makes, repairs, or deals in jewelry
 