Scanning course books help


eric veller

Dear Scanners,

I am new to scanning and need some advice from experienced users,
because I am considering investing a lot of money and time in a
scanning system, and I have no way of knowing ahead of time how
well the system would be working out in, say, 6 months.

---------------------

Here is what I want:

I have a lot of textbooks (and other books) I would like to scan into
searchable text format. That way I can quickly find the info I am
looking for. I do not need diagrams, and I probably do not need font
information either.

I want to be able to search course texts easily and quickly.

I would only use textbooks that I had purchased, and would strip the
covers and cut the bindings first to make them scannable. I plan to
scan about 10000 pages a semester (so at least 20000 pages a year).

For the hardware, I was thinking about something like the HP Scanjet
8290.

1. Does this sound like a reasonable idea?

2. How good are the OCR programs on such text? What is a good one to
buy? How much disk space am I looking at?

3. Any hardware suggestions? Clearly a duplex feeder, speed, and
reliability are needed, but no color capability is needed and no
high-res is needed since it's just for OCR.

4. Any other suggestions or advice would be welcome.
 
The more technical the book, the more likely it is to use fairly
complicated mathematics. All the books I've found which go into the
nitty-gritty of scanning are of this nature. OCR programs can't
currently decipher mathematics. If you were to transcribe it in TeX,
you could then search it, but you are not likely to do that. If you
leave it in graphic form, it is no more searchable than the printed
page. I suggest you keep some of your textbooks.
 
I have scanned documents into Adobe Acrobat and used the Paper Capture
feature in Searchable Image mode. This keeps the scanned image but places
OCR text behind it for searching. It can only capture 50 pages at a time
though, and the files are quite large. For more pages you would have to buy
Acrobat Capture, which I think is $399.

Mike
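
To put the 50-page limit and the "quite large" files in perspective, here is a rough back-of-the-envelope sketch in Python. The 10000 pages per semester and the 50-page batch limit come from the posts above; the letter-size page, the 300 dpi resolution, and the 1-bit black-and-white mode are assumptions made only for illustration. The size is the raw, uncompressed image size, so real files will differ depending on the compression used.

```python
import math

PAGES_PER_SEMESTER = 10_000  # figure from the original post
CAPTURE_BATCH_LIMIT = 50     # Paper Capture limit mentioned above


def batches_needed(pages: int, batch_limit: int) -> int:
    """Number of capture runs needed when each run handles at most batch_limit pages."""
    return math.ceil(pages / batch_limit)


def raw_bitonal_bytes(width_in: float, height_in: float, dpi: int) -> int:
    """Uncompressed size in bytes of a 1-bit (black-and-white) scan of one page side."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return math.ceil(pixels / 8)  # 8 one-bit pixels per byte


# Assumed scan settings (not from the thread): letter-size page at 300 dpi.
batches = batches_needed(PAGES_PER_SEMESTER, CAPTURE_BATCH_LIMIT)
page_bytes = raw_bitonal_bytes(8.5, 11.0, 300)
total_gb = page_bytes * PAGES_PER_SEMESTER / 1e9

print(f"Capture runs per semester: {batches}")
print(f"Raw size per page: {page_bytes / 1e6:.2f} MB")
print(f"Raw size per semester: {total_gb:.1f} GB")
```

Changing the assumed dpi or page size in the last call shows how quickly the raw numbers grow, which is one reason to check what compression a given scanning program applies before committing to it.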
 
The best source for scanning information that I know of is
http://www.scantips.com/. Wayne Fulton, who runs the site, also has
books for sale.
 