Scanner dedicated to OCR work?

  • Thread starter: phuile

phuile

I need a small scanner dedicated to OCR work. Can someone recommend
something? I know I will need an app too, so if anyone knows a good
one, please help.

I need to do a *lot* of OCR so a fast one will help, and any
additional helpful hints will also be welcome!

Thanks.
 
phuile said:
I need a small scanner dedicated to OCR work. Can someone recommend
something? I know I will need an app too, so if anyone knows a good
one, please help.

I don't think it matters much which scanner you use as long as you have
the right software. You don't want the text rotated significantly, but I
think OCR software can handle the small rotations that result from less
than perfect positioning of the document. Scanning software may also
have some straightening capability built in (the software for my Canon
D-2050C does) and can probably produce output such as .pnm that a
typical OCR program can read. Also, many viewers and photo editing
programs can convert from JPEG or TIFF to an appropriate format if your
OCR program can't read those directly.
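
If you are not sure whether a scan is already in a PNM format, a quick check of the first two bytes of the file will tell you. The sketch below is only an illustration: the function names pnm_kind and needs_conversion are made up for this example, and it assumes the standard Netpbm magic numbers P1 through P6.

```python
# Magic numbers used by the Netpbm family of formats (collectively "PNM").
PNM_MAGIC = {
    b"P1": "PBM (bitmap, ASCII)",
    b"P2": "PGM (grayscale, ASCII)",
    b"P3": "PPM (color, ASCII)",
    b"P4": "PBM (bitmap, binary)",
    b"P5": "PGM (grayscale, binary)",
    b"P6": "PPM (color, binary)",
}


def pnm_kind(path):
    """Return a description of the PNM variant, or None if the file is not PNM."""
    with open(path, "rb") as f:
        magic = f.read(2)  # the format is identified by the first two bytes
    return PNM_MAGIC.get(magic)


def needs_conversion(path):
    """True if the file is not PNM and should be converted before OCR."""
    return pnm_kind(path) is None
```

A file for which needs_conversion returns True (a JPEG or TIFF, say) would first go through a viewer or image editor to be saved as PNM.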

I usually work under Linux, so I just downloaded gocr, an open-source
optical character recognition program. I took a PDF file I had recently
obtained by scanning, converted it to .pnm format using GIMP, and then
applied gocr to the result. Although my original was slightly rotated,
the program didn't seem to have trouble producing a text file. Its
accuracy was far from perfect, but I did get a file I could correct
manually. I suspect that if I had scanned at a higher resolution and
started with a better original, or produced the .pnm file directly from
the scanner, I would have gotten better results. Also, other open-source
OCR programs, or commercial OCR programs designed for Windows, might
work much better.
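
For a lot of pages, the PDF-to-PNM-to-text steps can be scripted rather than done by hand. The following is a rough sketch, not the exact procedure described above: it swaps GIMP for the pdftoppm command-line tool (from poppler-utils) to do the conversion, and it assumes both pdftoppm and gocr are installed and on the PATH. The function names and the 300 dpi default are just illustrative choices.

```python
import subprocess
from pathlib import Path


def pdftoppm_command(pdf_path, out_prefix, dpi=300):
    """Build the command that renders each PDF page to a grayscale .pgm file."""
    # -r sets the resolution in DPI; -gray writes grayscale PGM output.
    return ["pdftoppm", "-r", str(dpi), "-gray", str(pdf_path), str(out_prefix)]


def gocr_command(image_path, text_path):
    """Build the command that runs gocr on one image."""
    # -i names the input image, -o names the output text file.
    return ["gocr", "-i", str(image_path), "-o", str(text_path)]


def ocr_pdf(pdf_path, work_dir, dpi=300):
    """Convert a scanned PDF to page images, then OCR each page with gocr."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    prefix = work_dir / "page"
    subprocess.run(pdftoppm_command(pdf_path, prefix, dpi), check=True)

    text_files = []
    # pdftoppm names its output page-<number>.pgm, zero-padded as needed.
    for image in sorted(work_dir.glob("page-*.pgm")):
        text_path = image.with_suffix(".txt")
        subprocess.run(gocr_command(image, text_path), check=True)
        text_files.append(text_path)
    return text_files
```

Calling ocr_pdf("scan.pdf", "ocr_out") would leave one .txt file per page in ocr_out, ready for manual correction. Raising the dpi argument is the easy way to test whether a higher resolution improves accuracy.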

By the way, I really like my Canon D-2050C. It has a very small
footprint, and it is more than fast enough for my purposes. I've
scanned many fair-sized documents to PDF format with it, including a few
books I cut up for the purpose. Its major drawback for me is that
VueScan, my standard scanning program, doesn't support it under
Linux, so I have to reboot into Windows to use it.
 