Reduce TIF File Size

  • Thread starter: Just_Buy

Just_Buy

Wondering if anyone knows of a program that will reduce the file
size (i.e. KB) of a TIFF file. We have an application that scans in a
black and white image for the purpose of capturing data from the image
via OCR software. Once the data is extracted from the images the
images are then stored on a hard drive for future retrieval if needed.

The program would need to be able to reduce the file size in batch
mode; there are about 7,000-10,000 images per day.

Thanks
 
Just_Buy staggered into the Black Sun and said:
> Wondering if anyone knows of a program that will reduce the file
> size of a TIFF file.

Which compression method are you using? Do you know?
> We have an application that scans in a black and white image

Is it using Group4 compression? It should be. Group4 is lossless for
black+white and compresses really well. 8.5x11" at 300 DPI averages
about 70-90K. I don't think you can get smaller than Group4 TIFF for
black+white data without doing a *lot* of specialized work. And then
you'd have to write specialized junk to read the custom format you
created. Not A Good Idea.
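
To put rough numbers on that, here is a quick back-of-the-envelope sketch. The function name is made up for illustration, and it assumes 1 bit per pixel with no row padding, so treat the result as approximate:

```bash
#!/bin/bash
# Hypothetical helper: raw size in bytes of an uncompressed 1-bit scan.
# Arguments: width and height in tenths of an inch, then DPI.
raw_bilevel_bytes() {
    local w_px=$(( $1 * $3 / 10 ))
    local h_px=$(( $2 * $3 / 10 ))
    echo $(( w_px * h_px / 8 ))
}

raw=$(raw_bilevel_bytes 85 110 300)   # 8.5x11" at 300 DPI -> 1051875
echo "uncompressed: $raw bytes"
echo "ratio at 80K: roughly $(( raw / (80 * 1024) )):1"
echo "10,000 pages/day at 90K: about $(( 10000 * 90 / 1024 )) MB/day"
```

So a page that is about 1 MB raw lands in the 70-90K range with Group4, and the daily volume mentioned in the original post works out to somewhat under 1 GB per day at the upper end.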
> The program would need to be able to reduce the file size in a batch
> mode

This assumes a Unix-like system with ImageMagick installed. Adapt if
you're not using a Real OS. Note that ImageMagick is available for
almost every OS, and so is bash.

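#!/bin/bash
# Convert each TIFF in the current directory to a Group4-compressed copy.
for tiff in *.tif ; do
case "$tiff" in *-g4.tif) continue ;; esac   # skip files already converted
new="${tiff%.tif}-g4.tif"
# -density sets the resolution (ImageMagick has no -resolution option)
convert "$tiff" -units PixelsPerInch -density 300 -monochrome -compress \
Group4 "$new" && rm -f "$tiff"   # delete the original only on success
done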

....adjust to taste. If you're already using Group4, this won't help,
and you'll have to do something else, like buy bigger disks.
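
If you don't know whether the scanner output is already Group4, something like the following sketch may help. It assumes ImageMagick's identify is on the PATH and relies on its %C format escape (compression type); the function names are made up, and the exact string identify prints may differ between versions:

```bash
#!/bin/bash
# Hypothetical helper: succeed if the reported compression is already Group4.
is_group4() {
    case "$1" in
        Group4|group4) return 0 ;;
        *)             return 1 ;;
    esac
}

# List TIFFs in a directory that are NOT yet Group4-compressed.
# Assumption: identify's %C escape reports the compression type.
list_not_group4() {
    local f comp
    for f in "${1:-.}"/*.tif ; do
        [ -e "$f" ] || continue
        # [0] restricts the check to the first page of a multi-page TIFF
        comp=$(identify -format '%C' "${f}[0]" 2>/dev/null) || continue
        is_group4 "$comp" || echo "$f ($comp)"
    done
}
```

Running `list_not_group4 /path/to/scans` on a sample of files should show quickly whether recompressing is worth the effort.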
 
> Wondering if anyone knows of a program that will reduce the file
> size (i.e. KB) of a TIFF file. We have an application that scans in a
> black and white image for the purpose of capturing data from the image
> via OCR software. Once the data is extracted from the images the
> images are then stored on a hard drive for future retrieval if needed.

For monochrome, Group 4 compression is very effective. The 'convert'
program, part of the ImageMagick suite, can do it:

convert -monochrome -compress Group4 inputfile outputfile

I've only done this on Linux but I believe ImageMagick is available
for Windows too.

Note, however, that you'll lose any anti-aliasing. AA requires grey
scale and you can't do Group 4 compression on that.
> The program would need to be able to reduce the file size in batch
> mode; there are about 7,000-10,000 images per day.

ImageMagick runs fine in batch mode.
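
For that kind of daily volume, a sketch along these lines could run several conversions at once. It assumes ImageMagick's convert plus a find/xargs pair that supports -print0, -0 and -P; the function name, the default of 4 jobs, and the -g4 suffix are illustrative choices, not anything from the original setup:

```bash
#!/bin/bash
# Convert every .tif under a directory to a Group4 copy, several at a time.
# Arguments: directory (default: current), number of parallel jobs (default: 4).
batch_g4() {
    local dir=${1:-.} jobs=${2:-4}
    # Skip files that already carry the -g4 suffix from an earlier run.
    find "$dir" -name '*.tif' ! -name '*-g4.tif' -print0 |
        xargs -0 -P "$jobs" -I{} sh -c \
            'convert "$1" -monochrome -compress Group4 "${1%.tif}-g4.tif"' _ {}
}
```

Calling `batch_g4 /path/to/scans 4` leaves the originals in place, so the output can be spot-checked before anything is deleted.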
 
Why not scan to JPEG instead of TIFF? Sure, it's "lossy", but if you go
easy on the compression, the output is indistinguishable from TIFF, and
you can achieve an 80% size reduction.
 
Barry Watzman staggered into the Black Sun and said:

Please don't top-post. Message rearranged for easier reading:
> Why not scan to JPEG instead of TIFF? Sure, it's lossy, but if you
> go easy on the compression, the output is indistinguishable from TIFF,
> and you can achieve an 80% size reduction.

clairissa:~/work/APS$ identify 00011.0.tif
00011.0.tif TIFF 1351x2151 1351x2151+0+0 PseudoClass 2c 1-bit 45.2129kb
clairissa:~/work/APS$ convert 00011.0.tif 00011.0.jpg
clairissa:~/work/APS$ ls -lh 00011*
-rw-r--r-- 1 me users 736K Apr 13 09:02 00011.0.jpg
-rw-r--r-- 1 me users 46K Mar 11 2006 00011.0.tif

....80% size reduction? On a black-and-white image? With default
quality (85/100)? Riiight. At quality 10/100, there are tons of JPEG
artifacts, which would cause OCR engines to screw up, and the image is
193K. JPEG has many uses, but Group4 TIFF totally pwns JPEG for
black-and-white images.

Unless the OP meant "grayscale", which he might've. People who are not
familiar with image formats and terminology can make incorrect
statements. A grayscale image saved as 85/100 JPEG will be smaller than
the same image saved as LZW TIFF. *If* the OP's OCR engine reads JPEGs
(some don't), and *if* the OP was scanning in grayscale, JPEGs would
save space.
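
Anyone who wants to repeat the comparison on their own scans could use a small sketch like this one. The function name is made up; it only compares byte counts of two existing files and says nothing about image quality:

```bash
#!/bin/bash
# Hypothetical helper: size of the second file as a percentage of the first.
size_pct() {
    local a b
    a=$(( $(wc -c < "$1") ))   # arithmetic expansion strips any padding
    b=$(( $(wc -c < "$2") ))
    [ "$a" -gt 0 ] || return 1 # avoid dividing by zero on empty/missing input
    echo $(( b * 100 / a ))
}
```

With the two files listed above (46K TIFF, 736K JPEG), `size_pct 00011.0.tif 00011.0.jpg` would print roughly 1600, i.e. the JPEG is about sixteen times the size of the Group4 TIFF.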
 
Screw you. I almost always top post and will continue to do so. If you
don't like it, don't read my posts.
 
Barry Watzman said:
> Screw you. I almost always top post and will continue to do so. If you
> don't like it, don't read my posts.


Like most people using Usenet, I don't like top posting either. If
you won't make the small effort needed to conform, your messages are
unlikely to be worth reading, so welcome to my kill file.

And, I suspect, many others'.
 
Hey, it's my message. You can conform to me. When I read your
messages, I'll conform to you.
 
> Screw you. I almost always top post and will continue to do so. If you
> don't like it, don't read my posts.

OK. Plonk. Don't bother to reply. I won't see it.<G>
 