ocr handwriting software

  • Thread starter: Cliff Kruger

Cliff Kruger

I am in the market for a scanner and ocr translation software that can translate both handwritten and typewritten German into English. For example, I need to be able to place a handwritten letter written in German on a flatbed scanner, and then have it translated into English. I can read and write both languages so this is a tool to help me speed up this process as I have close to one hundred pages to translate. Finding good software that can recognize modern German in a 'handwritten' format may be a challenge as well.

I've looked at one or two ocr software products online, and they tend to support tens or hundreds of languages. In my case, I would prefer software that specializes in the German to English recognition and translation relationship only. I have not found such a product, but I'm hoping that a good one exists.

Any tips on actual scanners and ocr software products to check out are appreciated.
 
This is a multi-part message in MIME format.

Don't do that. Post in plain text only. Force Microsoft Outhouse to do
the right thing, or better yet, get a Real Newsreader.
Cliff Kruger said:
I am in the market for a scanner and ocr translation software that can
translate both handwritten and typewritten German into English. For
example, I need to be able to place a handwritten letter written in
German on a flatbed scanner, and then have it translated into English.

Doesn't exist in the way you seem to want it. Handwriting recognition
is insanely difficult because every person's handwriting is different
and handwritten text is much more variable than typeset text. If you
had a large sample of documents that were all written by the same person
and had good contrast with low noise, you might be able to train an
appropriate OCR program to get a 70-80% accuracy rate. Maybe. If you
have documents written by a bunch of different people, or documents that
are of low quality, it'll be more work than retyping the damn things.
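To put a rough number on that 70-80% figure, here is a back-of-the-envelope sketch. It assumes the rate is per character and that errors are independent, neither of which the post states, and the 1,500 characters per page is a made-up figure for illustration only. The page count comes from the "close to one hundred pages" in the original question.

```python
def expected_errors(chars_per_page, pages, accuracy):
    """Expected count of misread characters, assuming independent per-character errors."""
    return round(chars_per_page * pages * (1.0 - accuracy))


CHARS_PER_PAGE = 1500  # hypothetical density for a handwritten page
PAGES = 100            # "close to one hundred pages" from the original post

for accuracy in (0.70, 0.80):
    total = expected_errors(CHARS_PER_PAGE, PAGES, accuracy)
    print(f"{accuracy:.0%} accuracy: about {total} characters to fix by hand")
```

Under those assumptions that is tens of thousands of individual corrections, which is the sense in which proofreading the output can cost more than retyping.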
Cliff Kruger said:
I can read and write both languages so this is a tool to help me speed
up this process as I have close to one hundred pages to translate.

Machine translation is in its infancy. "Osama bin Laden" -> Babelfish
(English to German) -> Babelfish (German to English) -> "The loaded bin
Osama". It gets worse when you throw in idioms, colloquialisms, and
slang. If I were you, I'd forget about machine translation because
you'll have to go through the output by hand anyway and fix all the
errors--better and faster to do it right the first time.
 
Cliff Kruger said:
I've looked at one or two ocr software products online, and they tend to
support tens or hundreds of languages.

OCR doesn't do handwritten text, and OCR is not translation software.
OCR software will not do what you want to do.

The only language support in OCR is a list of words for each selected
language. Matching candidate results against that list helps it decide
which characters are likely: whether it saw "rn" or "m", or "cl" or
"d". These word lists are called dictionaries, but they are just word
lists; there is no understanding of the meaning of those words in OCR.
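A minimal sketch of that word-list idea, in Python. This is not how any particular OCR product works internally; the confusion table, the tiny word list, and the function names are all invented for illustration. It only shows the general technique: when the raw result is not a listed word, try swapping look-alike glyphs and keep a variant that is listed.

```python
# Hypothetical table of glyph sequences that look alike on the page.
CONFUSIONS = {"rn": ["m"], "m": ["rn"], "cl": ["d"], "d": ["cl"]}

# Tiny stand-in for a language "dictionary": just a set of known words.
WORD_LIST = {"modern", "dear", "clear", "hand"}


def variants(word):
    """Return the word plus every spelling made by one look-alike swap."""
    found = {word}
    for wrong, rights in CONFUSIONS.items():
        start = word.find(wrong)
        while start != -1:
            for right in rights:
                found.add(word[:start] + right + word[start + len(wrong):])
            start = word.find(wrong, start + 1)
    return found


def correct(word):
    """Keep a listed word as-is; otherwise prefer a listed look-alike variant."""
    if word in WORD_LIST:
        return word
    matches = sorted(v for v in variants(word) if v in WORD_LIST)
    return matches[0] if matches else word


print(correct("modem"))  # not in this word list, so the "rn" variant wins
print(correct("hancl"))  # "cl" read in place of "d"
```

Note that the lookup is pure string matching. With a word list that happened to contain "modem", the first call would leave it alone, right or wrong, because nothing here knows what either word means.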
 
Dances With Crows said:
Doesn't exist in the way you seem to want it. Handwriting recognition
is insanely difficult because every person's handwriting is different
and handwritten text is much more variable than typeset text. If you
had a large sample of documents that were all written by the same person
and had good contrast with low noise,

.... and if they didn't have handwriting like my sister, which after 50+
years even I can only read half the time, relying heavily on guesswork and
context.

But seriously, everyone is right --
Handwriting recognition basically doesn't exist in practical form for what
Cliff wants.
Translation software (once you have the source nicely OCR'ed into a text
file in the original language) exists but is problematic.

If the context is fairly narrow, such as plastics manufacturing invoices,
this might be more feasible.

Suggest OP post to comp.ai.doc-analysis.ocr
 