dig camera as book scanner --- again

  • Thread starter: ivowel
dear group: i asked a couple of questions about 3 months ago about
whether it was possible to use a digital camera as a book scanner.

http://groups.google.com/group/comp...read/thread/72d3a391de90bd8c/502859775eb1c2b3

the answer seemed to be "yes, at least in principle." the advice was
to try a cheap copy stand and a >5MP camera.

of course, cheap as I am, before I drop the $1,500 for a good setup, I
decided to first try this with my 5MP Sony DSC-T7 home camera. I
experimented with shooting a page, just holding the camera in my hand.
I did not manage to succeed---I either had images that were too dim, or
too unfocused (too close for the lens, probably), or shiny reflections
off the paper. None of the images were good enough for OCR processing.

So, I suspect that I failed because I need

* good lighting, probably from something like a copy stand that has
diffuse light coming from each side.
specific recommendations? should the light bulbs be of a
particular type?
* a particular lens that can do reasonably close up shots.
specific recommendations?
* a particular exposure setting (do dig cameras even have this?)
recommendations?

in addition, when I look at the "big boys" (eg, the $8,000-$20,000 book
scanners from Minolta), they have compensation for the book spine and
curvature. It would seem to me that this is a software function, not a
hardware function. is there not an OCR package that can do this, too?

* are there any small shops that sell book scanners rigged together
from standard equipment?

* if someone has example images of what can come out of a hand-rigged
scanner setup (possibly with and without OCR applied), would it be
possible to post a link to it? I would love to see it.

Help appreciated.

sincerely,

/ivo welch
 
You can use a Digital Camera, but you must cross the T's and dot the I's to
do it.

That means no short cuts.
A copy stand is a must. Good lights are also a must.

Or you can use a tripod, if it lets you get the camera close to the table
and point it straight down at the table.

Your 5 MP camera should be fine if it has a tripod mount and a Manual Mode
where you can set the shutter speed and aperture for the correct exposure
and set the focus for a sharp image. Auto Focus may be fine. You may or may
not need a Macro mode for close focus.

Copy stands are not that expensive. You can get a good one from B&H for less
than $90.
http://www.bhphotovideo.com

When you get to the B&H home page search for "copy stand" without the
quotes, then sort by price low to high.

I prefer Testrite stands. I don't think the Digital Pursuits $30 stand is a
good buy; it is too short.
For more money, you can get a copy stand with lights.

For lights, you can use two cheap clip-on lamps from the hardware store or
Wal-Mart.
100 watt household bulbs will work just fine if your camera has a Tungsten
white balance setting. (White balance only matters if you need color for
pictures in the book.)
 
thank you. I will buy a testrite stand.

what are the best specs for the digital camera? I presume I want high
res (color irrelevant for OCR). Is there a particular lens or focal
length that would be best? because I do not have it yet, I have full
choice. PowerShot S80 has 8MP, 28-100mm, and seems reasonably priced
(~$400). PowerShot Pro1 seems similar, though bigger. (not sure this
is necessary for a digital camera.)

will the book spine give me trouble, or can the OCR software
compensate?

sincerely,

/iaw
 

You might lay a heavy glass on the book to open it as flat as possible
without breaking the spine.

You do have to watch for reflections from the glass. Just move the lights
until no reflection reaches the camera.

If the image is good and clean, the OCR software can compensate some.
OCR needs the equivalent of 300 dpi for best results.

Assuming an 8.5" x 11" page and an 8 MP camera (PowerShot S80), the image is
3264x2448 pixels.
http://www.steves-digicams.com/2005_reviews/s80.html

2448 / 8.5 = 288 dpi along the short dimension
3264 / 11 = about 297 dpi along the long dimension

So that camera will give a fairly good OCR image. It would be better on
smaller books.
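
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes the page exactly fills the frame, which a real setup with margins will not quite achieve.

```python
def effective_dpi(pixels_short, pixels_long, page_short_in, page_long_in):
    """Pixels per inch along each page edge, assuming the page fills the frame."""
    return pixels_short / page_short_in, pixels_long / page_long_in


def pixels_needed(page_short_in, page_long_in, target_dpi=300):
    """Sensor dimensions (in pixels) needed to reach target_dpi on this page size."""
    return round(page_short_in * target_dpi), round(page_long_in * target_dpi)


# PowerShot S80 image size (3264x2448) on an 8.5" x 11" page
short_dpi, long_dpi = effective_dpi(2448, 3264, 8.5, 11.0)
print(f"{short_dpi:.0f} x {long_dpi:.0f} dpi")

# What a full 300 dpi capture of the same page would require
need_short, need_long = pixels_needed(8.5, 11.0)
print(f"{need_short} x {need_long} pixels, "
      f"about {need_short * need_long / 1e6:.1f} MP")
```

A smaller page raises both figures, which is why the same camera does better on smaller books.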
 
thank you. yikes---the S80 cannot send raw images, and therefore
cannot be used. so, back to the Pro1 or the Sony DSC-R1.

would I use a long (1 second?) exposure setting and no flash when
mounted?

have you ever tried this? do you have some sample images and OCR?

regards,

/iaw
 
unfortunately, I don't think this is the actual resolution. see, I
believe that the camera has 2448 pixels, of which 816 are R, 816 are B,
and 816 are G. I would guess this will lose some resolution---not
3-to-1, but maybe 2-1. so, I would guess this will likely be more like
192dpi. still, if good quality, it should be ok for OCR.

the powershot S80 cannot send raw images, only jpeg, so it is out. (st..id
canon.) the powershot Pro1 only has USB 1.1, which makes sending
large images so slow that it is out, too.

right now, I am checking into the sony dsc-r1 and into the canon rebel.
software is always the issue for me here...I need something that can
transfer images directly to a directory, instead of onto the internal
CF. I do not yet know whether either can do it. amusingly, what I
really need is a hi-res webcam, but those don't exist.

I hope to control the paper-glare issue with long exposures and just a
little indirect light. let's hope it will work.

regards,

/iaw
 

According to the review:
http://www.dpreview.com/news/0508/05082205canons80.asp

The image size is 3264 x 2448 and it is Jpeg only.
That is the actual pixel size of the output image, at 24 bits (about 16
million colors) per pixel.

OCR software can use Jpeg images, or at least Omnipage 15 can.

The spec says USB 2.0 so image transfer would be very fast.
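
For a rough sense of the difference, here is a back-of-the-envelope sketch. The 12 and 480 Mbit/s figures are the nominal signalling rates of USB 1.1 full speed and USB 2.0 high speed; real throughput is noticeably lower, and the 3 MB file size is only an assumed value for a fine-quality 8 MP Jpeg.

```python
USB_1_1_MBIT_PER_S = 12    # USB 1.1 full-speed nominal rate
USB_2_0_MBIT_PER_S = 480   # USB 2.0 high-speed nominal rate


def transfer_seconds(file_megabytes, link_mbit_per_s):
    """Best-case transfer time: file size in megabits divided by the link rate."""
    return file_megabytes * 8 / link_mbit_per_s


ASSUMED_JPEG_MB = 3  # hypothetical size of one fine-quality 8 MP Jpeg

for name, rate in (("USB 1.1", USB_1_1_MBIT_PER_S),
                   ("USB 2.0", USB_2_0_MBIT_PER_S)):
    print(f"{name}: {transfer_seconds(ASSUMED_JPEG_MB, rate):.2f} s per page")
```

Even in the best case, USB 1.1 adds a couple of seconds per page, which adds up over a whole book.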

If image transfer from the camera is slow, a card reader will fix that.
You just plug the memory card from the camera into a memory card reader,
which you can buy for as little as $10.

"I need something that can
transfer images directly to a directory, instead of onto the internal
CF."

Do you mean that you want software to control the camera taking the image
and writing the image directly to your computer?

ZoomBrowser EX can operate the camera shutter remotely from the computer.
 
I think the jpeg algorithm works by blending pixels. it's very good for
continuous small color changes, but not for imaging high-contrast
letters. yes, it could work, but it would seem like a waste of pixels.

yes, I want to control the whole setup from a computer. in this case,
one-button operation should be possible. I hit space, the camera takes
the image, deposits it into a directory, Omnipage pulls off the image,
and puts the finished document into another directory.
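
The directory half of that workflow could be sketched roughly as below. Everything here is an assumption for illustration: the folder layout, the function names, and above all run_ocr, which is only a stand-in that writes a stub text file where a real OCR program would be called. Triggering the camera itself is not covered.

```python
import shutil
import time
from pathlib import Path


def run_ocr(image_path: Path, out_dir: Path) -> Path:
    """Hypothetical placeholder for the OCR step; writes a stub text file."""
    out_path = out_dir / (image_path.stem + ".txt")
    out_path.write_text(f"OCR output for {image_path.name} would go here\n")
    return out_path


def process_new_images(inbox: Path, done: Path, out_dir: Path) -> list:
    """Run the OCR step on each Jpeg in inbox, then move the image to done."""
    results = []
    for image in sorted(inbox.glob("*.jpg")):
        results.append(run_ocr(image, out_dir))
        shutil.move(str(image), str(done / image.name))
    return results


def watch(inbox: Path, done: Path, out_dir: Path, poll_seconds: float = 1.0):
    """Poll the inbox forever, processing whatever the camera software drops in."""
    for folder in (inbox, done, out_dir):
        folder.mkdir(parents=True, exist_ok=True)
    while True:
        for out_path in process_new_images(inbox, done, out_dir):
            print("finished", out_path)
        time.sleep(poll_seconds)
```

Calling watch(Path("scans/inbox"), Path("scans/done"), Path("scans/text")) would start the loop with those example folders. A real version would also need to make sure an image has been completely written before picking it up.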

I am still going around with B&H about the testrite stand. they don't
have the lights for it in stock, which I think I should try with it,
given that I want to give it a good try. the lights are also direct,
and I wonder whether it would be better to diffuse them. Similarly, I
wonder whether a polarizing filter would be a good idea. anyone know?

regards,

/ivo
 
"I think the jpeg algorithm works by blending pixels. it's very good for
continuous small color changes, but not for imaging high-contrast
letters. yes, it could work, but it would seem like a waste of pixels."

You take the image the camera gives you; you don't have a choice in that
matter, other than buying a camera that offers TIFF as an output format.

If you set the camera to the lowest compression (a Fine image, larger file
size) the jpeg artifacts in the original image will be almost non-existent.

Jpeg in the original image is OK. The artifacts in Jpeg images build up when
you resave the image over and over. You can avoid that by saving the
original Jpeg image as a TIFF and using only the TIFF copy for all editing
and manipulation.

"yes, I want to control the whole setup from a computer. in this case,
one-button operation should be possible. I hit space, the camera takes
the image, deposits it into a directory, Omnipage pulls off the image,
and puts the finished document into another directory.

I am still going around with B&H about the testrite stand. they don't
have the lights for it in stock, which I think I should try with it,
given that I want to give it a good try. the lights are also direct,
and I wonder whether it would be better to diffuse them. Similarly, I
wonder whether a polarizing filter would be a good idea. anyone know?"

Why not wait and see what your results are before you start overthinking
the problem?

Direct lights are just fine. The only thing you really have to pay close
attention to is the glass, if you use one to weight down the book pages. You
may get a reflection from the glass; change the angle of the lights to
remove it.

 