TIFF for OmniPage?


Ed Kearns

I'm using OmniPage 5.0 on a G3 Mac, running OS 9.2.1. I'm trying to
understand the OmniPage requirements for TIFF files, as sometimes I convert
e.g. JPEGs to TIFFs to be read by OmniPage, and sometimes it works,
sometimes it doesn't. I know it has to be a B/W file (not color), but I've
seen some TIFF files are labeled TIFF(R), some are TIFF(B), some are TIFF.
Does anyone here know OmniPage's specific requirements?
Ed
 
I'm using OmniPage 5.0 on a G3 Mac, running OS 9.2.1. I'm trying to
understand the OmniPage requirements for TIFF files, as sometimes I
convert e.g. JPEGs to TIFFs to be read by OmniPage, and sometimes it
works, sometimes it doesn't.

What are the differences between the files when it works and when it
doesn't? What are you using to do the JPEG->TIFF conversion?
I know it has to be a B/W file (not color), but I've seen some TIFF
files are labeled TIFF(R), some are TIFF(B), some are TIFF. Does
anyone here know OmniPage's specific requirements?

That's an old version of OmniPage, but all versions of OmniPage should
be able to handle black-and-white TIFFs that are either uncompressed or
use Group4 compression (PackBits may not work) and are not more than 600
DPI. I don't know what MacOS means by "TIFF(R), TIFF(B), TIFF". You
might be able to find out if you have access to a 'Doze or Unix machine;
the tiffinfo utility is standard on Unix and has been ported to 'Doze.
Run tiffinfo on a TIFF(R) and a TIFF(B) file, compare the output. Don't
worry about transferring the files; TIFF shouldn't have a resource fork.
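
If tiffinfo is hard to get hold of, the basic tags can also be read straight from the file header. What follows is only a rough Python sketch, not a replacement for tiffinfo: the function names `read_tiff_tags` and `meets_thread_advice` are made up for illustration, it only looks at single-value SHORT, LONG and RATIONAL entries in the first image directory, and the limits it checks are just the ones suggested in this post (1-bit, uncompressed or Group4, at most 600 DPI, assuming the resolution unit is inches), not an official OmniPage specification.

```python
import struct

# Names for a handful of baseline TIFF tag IDs (TIFF 6.0).
TAG_NAMES = {
    254: "NewSubfileType", 256: "ImageWidth", 257: "ImageLength",
    258: "BitsPerSample", 259: "Compression", 262: "PhotometricInterpretation",
    273: "StripOffsets", 274: "Orientation", 277: "SamplesPerPixel",
    278: "RowsPerStrip", 279: "StripByteCounts", 282: "XResolution",
    283: "YResolution", 284: "PlanarConfiguration", 296: "ResolutionUnit",
}

def read_tiff_tags(data):
    """Return {tag name: value} for the first IFD of a TIFF held in bytes."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF: bad byte-order mark")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF: bad magic number")
    (count,) = struct.unpack_from(order + "H", data, ifd_offset)
    tags = {}
    for i in range(count):
        entry = ifd_offset + 2 + 12 * i          # each IFD entry is 12 bytes
        tag, ftype, n = struct.unpack_from(order + "HHI", data, entry)
        name = TAG_NAMES.get(tag, "Tag%d" % tag)
        if ftype == 3 and n == 1:                # SHORT, stored inline
            (value,) = struct.unpack_from(order + "H", data, entry + 8)
        elif ftype == 4 and n == 1:              # LONG, stored inline
            (value,) = struct.unpack_from(order + "I", data, entry + 8)
        elif ftype == 5 and n == 1:              # RATIONAL, stored at an offset
            (off,) = struct.unpack_from(order + "I", data, entry + 8)
            num, den = struct.unpack_from(order + "II", data, off)
            value = num / den if den else None
        else:
            continue                             # other types are skipped in this sketch
        tags[name] = value
    return tags

def meets_thread_advice(tags):
    """Check the limits suggested in this post (assumes resolution is per inch)."""
    return (tags.get("BitsPerSample", 1) == 1
            and tags.get("Compression", 1) in (1, 4)   # 1 = none, 4 = CCITT Group 4
            and (tags.get("XResolution") or 0) <= 600)
```

To try it on a real file, read the file in binary mode and pass the bytes in, e.g. `read_tiff_tags(open(path, "rb").read())`, then compare the dictionaries for a file that works and one that doesn't.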
 
All of the versions I've used could handle uncompressed 8-bit (but not
16-bit) grayscale TIFF. As I recall, some of the early ones could not handle
compressed TIFF, but I believe the later ones do handle at least LZW.

I'm not sure if they can handle 1 bit per pixel (B&W) files or not, but I do
know that the error rate becomes very high with a true B&W file, even if
stored as 8 bits per pixel.

Don
 
Dances With Crows wrote on 5/26/04 9:24 AM:
What are the differences between the files when it works and when it
doesn't? What are you using to do the JPEG->TIFF conversion?

It usually says it can't open it. I've used GraphicConverter.
That's an old version of OmniPage, but all versions of OmniPage should
be able to handle black-and-white TIFFs that are either uncompressed or
use Group4 compression (PackBits may not work) and are not more than 600
DPI. I don't know what MacOS means by "TIFF(R), TIFF(B), TIFF". You
might be able to find out if you have access to a 'Doze or Unix machine;
the tiffinfo utility is standard on Unix and has been ported to 'Doze.
Run tiffinfo on a TIFF(R) and a TIFF(B) file, compare the output. Don't
worry about transferring the files; TIFF shouldn't have a resource fork.
Thanks to your suggestion, I downloaded TiffTagViewer, a Doze application
that I can run in an emulator (attempts to download tiffinfo didn't work),
and got the tags on a file which works and one which doesn't. A lot of info,
not obvious where difference lies.
 
It usually says it can't open it. I've used GraphicConverter.

"Can't open"? MacOS programs rarely if ever report good diagnostic info
when something fails, but that error message is pretty lame even by
MacOS standards. GraphicConverter should work OK for this if you have
the settings right, but it's been ages since I last used that program
and I can't remember what the right settings might be.
Thanks to your suggestion, I downloaded TiffTagViewer, a Doze
application that I can run in an emulator (attempts to download
tiffinfo didn't work), and got the tags on a file which works and one
which doesn't. A lot of info, not obvious where difference lies.

Post the output of that program on a TIFF that works and a TIFF that
doesn't work. There should be fewer than 20 lines of text in both
outputs; that's hardly a lot of info.

--
Matt G|There is no Darkness in eternity/But only Light too dim for us to see
"We should have a policy against using personal resources for company
business." "The Company didn't pay for these pants, so I'm taking them
off at the door!" --J. Moore and A. DeBoer, the Monastery
Hire me! http://crow202.dyndns.org/~mhgraham/resume/
 
Dances With Crows wrote on 5/26/04 1:09 PM:
"Can't open"? MacOS programs rarely if ever report good diagnostic info
when something fails, but that error message is pretty lame even by
MacOS standards. GraphicConverter should work OK for this if you have
the settings right, but it's been ages since I last used that program
and I can't remember what the right settings might be.


Post the output of that program on a TIFF that works and a TIFF that
doesn't work. There should be fewer than 20 lines of text in both
outputs; that's hardly a lot of info.
Here's the output, one good, one bad:
evaluated for tiff tags by Windows program TiffTagViewer

Testtext.TIFF (text scanned by scanner): ok for OmniPage Pro 5.0

SubFileType (1 Long): Zero
ImageWidth (1 Short): 1176
ImageLength (1 Short): 2681
XResolution (1 Rational): 300
YResolution (1 Rational): 300
BitsPerSample (1 Short): 1
SamplesPerPixel (1 Short): 1
Compression (1 Short): Uncompressed
Photometric (1 Short): MinIsWhite
RowsPerStrip (1 Short): 2681
StripByteCounts (1 Long): 394107
ResolutionUnit (1 Short): Inch
Orientation (1 Short): TopLeft
PlanarConfig (1 Short): Contig
StripOffsets (1 Long): 378


100_0085bw.tiff: Not ok for OmniPage Pro 5.0
SubFileType (1 Long): Zero
ImageWidth (1 Long): 8467
ImageLength (1 Long): 5642
BitsPerSample (1 Short): 1
Compression (1 Short): Uncompressed
Photometric (1 Short): MinIsWhite
StripOffsets (1 Long): 186
SamplesPerPixel (1 Short): 1
RowsPerStrip (1 Long): 5642
StripByteCounts (1 Long): 5974878
XResolution (1 Rational): 300
YResolution (1 Rational): 300
ResolutionUnit (1 Short): Inch
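
With two listings like these, a mechanical comparison can make the differences easier to spot than reading them side by side. The following is a small Python sketch with the values typed in by hand from the TiffTagViewer output above; the names `GOOD`, `BAD` and `diff_tags` are just for illustration.

```python
# Tag values copied from the two TiffTagViewer listings above.
GOOD = {  # Testtext.TIFF: ok for OmniPage Pro 5.0
    "SubFileType": "Zero", "ImageWidth": 1176, "ImageLength": 2681,
    "XResolution": 300, "YResolution": 300, "BitsPerSample": 1,
    "SamplesPerPixel": 1, "Compression": "Uncompressed",
    "Photometric": "MinIsWhite", "RowsPerStrip": 2681,
    "StripByteCounts": 394107, "ResolutionUnit": "Inch",
    "Orientation": "TopLeft", "PlanarConfig": "Contig", "StripOffsets": 378,
}
BAD = {  # 100_0085bw.tiff: not ok for OmniPage Pro 5.0
    "SubFileType": "Zero", "ImageWidth": 8467, "ImageLength": 5642,
    "BitsPerSample": 1, "Compression": "Uncompressed",
    "Photometric": "MinIsWhite", "StripOffsets": 186, "SamplesPerPixel": 1,
    "RowsPerStrip": 5642, "StripByteCounts": 5974878,
    "XResolution": 300, "YResolution": 300, "ResolutionUnit": "Inch",
}

def diff_tags(a, b):
    """Return (tag, value_in_a, value_in_b) for tags that differ or are missing."""
    rows = []
    for tag in sorted(set(a) | set(b)):
        if a.get(tag) != b.get(tag):
            rows.append((tag, a.get(tag), b.get(tag)))
    return rows

for tag, good, bad in diff_tags(GOOD, BAD):
    print(tag, "| good:", good, "| bad:", bad)
```

Orientation and PlanarConfig are missing from the second file, but the TIFF defaults for those tags are the same values the first file states explicitly. That leaves the pixel dimensions, and the strip values that follow from them, as the differences worth looking at.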
 
Ed Kearns said:
<snip>


If my math is correct that image is about 28" x 19" (8467 x 5642 @ 300
dpi). I think that's too big for OmniPage. If the image is indeed smaller
than that, then something has screwed up the header data.
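
For anyone who wants to check the arithmetic, here is a short Python sketch using the numbers from the tag listings posted earlier in the thread. The helper names `inches` and `uncompressed_bytes` are made up for this example, and the byte count assumes each row of a 1-bit uncompressed image is padded to a whole byte.

```python
def inches(pixels, dpi):
    """Physical size of one dimension, in inches."""
    return pixels / dpi

def uncompressed_bytes(width, height, bits_per_pixel=1):
    """Size of uncompressed image data with each row padded to a whole byte."""
    row_bytes = (width * bits_per_pixel + 7) // 8
    return row_bytes * height

# (file name, ImageWidth, ImageLength, StripByteCounts) from the listings
FILES = [
    ("Testtext.TIFF", 1176, 2681, 394107),
    ("100_0085bw.tiff", 8467, 5642, 5974878),
]

for name, w, h, reported in FILES:
    size = "%.2f x %.2f in" % (inches(w, 300), inches(h, 300))
    matches = uncompressed_bytes(w, h) == reported
    print(name, size, "byte count matches header:", matches)
```

The reported StripByteCounts agree with the stated width and length for both files, which suggests the large dimensions are real rather than the result of a damaged header.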

Don
 
OmniPage should be able to handle Group4 compression. Group4 is the
best compression method available for black-and-white data, so use it if
you care about image size.
If my math is correct that image is about 28" x 19" (8467 x 5642 @
300 dpi). I think that's too big for OmniPage.

That's what I'd think too. If this is an entire page from a newspaper
or something with complex text layout, you'd be much better off
splitting the image into smaller parts *anyway*, because OmniPage's
auto-locate feature never splits the page up properly if the layout is
complex.
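
As a rough illustration of splitting, the Python sketch below works out crop boxes for a plain grid of tiles. The limit used here (one US-letter page, 8.5 x 11 in at 300 dpi) is purely an assumption for the example, since nobody in the thread has stated OmniPage's actual maximum, and `tile_boxes` is a made-up helper name.

```python
def tile_boxes(width, height, max_w, max_h):
    """Return (left, top, right, bottom) boxes covering the image in a grid."""
    boxes = []
    for top in range(0, height, max_h):
        for left in range(0, width, max_w):
            right = min(left + max_w, width)
            bottom = min(top + max_h, height)
            boxes.append((left, top, right, bottom))
    return boxes

# Hypothetical limit: 8.5 x 11 in at 300 dpi (an assumption, not an OmniPage figure).
MAX_W, MAX_H = 2550, 3300

boxes = tile_boxes(8467, 5642, MAX_W, MAX_H)
print(len(boxes), "tiles")
for box in boxes:
    print(box)
```

A blind grid like this will cut through lines of text wherever a tile edge happens to fall, so for a real page it would be better to pick the cut points by hand along column or article boundaries and use the boxes only as a starting point.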
 
Ed Kearns said:
<snip>

Testtext.TIFF (text scanned by scanner): ok for OmniPage Pro 5.0

If my math is correct that image is about 28" x 19" (8467 x 5642 @ 300
dpi). I think that's too big for OmniPage. If the image is indeed smaller
than that, then something has screwed up the header data.

Don
That is correct! I didn't think about that as a problem! I reduced it and it
opens fine! Now the OCR translation isn't too good, but now I can try other
techniques to improve the image before submitting it to OmniPage!
Thanks again!
Ed
 