Announcing 'Distributed Scanners' discussion group (Scanning public domain texts)

Jon · Dec 19, 2005

[Feel free to redistribute this announcement to other forums where
on-topic, such as scanning, graphics, books and publishing, library
and archives, etc. Thanks.]

... Let's digitally scan all the world's public domain books! ...

Everyone,

Several people privately expressed interest in the "Distributed
Scanners" (DistScan) idea I recently outlined to the Book People
forum. So, I've taken the next step and created a Yahoo discussion
group to further explore this idea -- to see if it has any legs. (You
are invited to join -- refer to the info at the end of this message.)

To summarize the idea: Is there interest and need for a
volunteer-driven, large-scale distributed scanning project of public
domain books and other documents modeled after (where applicable)
Distributed Proofreaders?

The full group description, and the current expanded summary of the
idea (which will undoubtedly change and improve over time as we better
understand the various issues) is given at:

http://groups.yahoo.com/group/distscan/

To be clear, this group does not actually launch the project, but
rather serves only to bring together sharp, like-minded people to
explore the idea -- to see if there is a "working formula" that makes
sense, and if we can assemble a core group of people with the needed
skill sets and interest to be able to successfully launch the project.

The goal, of course, is to accelerate the high-quality scanning of
public domain texts. DistScan is not intended to compete with other
projects to scan the public domain, such as those managed by the
Internet Archive (e.g. OCA), but rather to augment and possibly even
work cooperatively with those projects (including Distributed
Proofreaders).

Please read carefully the group description at the above URL. If you
wish to comment on this message, I encourage you to join the group and
post your comment there. Or, email it to me in private and I may
post it to the group (with your identity removed unless requested
otherwise).

Anyone interested in scanning the public domain (whether a private
individual or representing an institution) is invited to participate.
We definitely need people with expertise in a very wide range of
areas. Since DistScan will likely have many components, you are
probably an expert in one of them! Do join and contribute to the
discussion.

Thanks!

Jon Noring

(p.s., there are three ways to subscribe to the DistScan group:

1) Use your YahooID and click on the "Join This Group!" button at the
above URL.

2) Send a blank email to: (e-mail address removed)

(No need to get a YahooID to subscribe this way.)

3) Ask me to subscribe you with the email address you want to use. No
need to get a YahooID to subscribe this way.)

Robert Feinman · Dec 19, 2005

How does this differ from Project Gutenberg, which has been digitizing
books for years? If you have so much manpower at your disposal, add your
contributions to this existing collection.

Jon · Dec 19, 2005

Robert said:
How does this differ from Project Gutenberg, which has been
digitizing books for years? If you have so much manpower at your
disposal, add your contributions to this existing collection.

I appreciate your reply.

I am quite familiar with both Project Gutenberg (PG) and its ally
Distributed Proofreaders (DP). I've known Michael Hart for years as
part of the ebook community, and even attended the first Project
Gutenberg "face-to-face" meeting in December 2003 at the Internet
Archive headquarters in San Francisco. I've contributed to DP as
well, and have been in close consultation with the DP founders. I also
have met Brewster Kahle (head of the Internet Archive) several times
and am quite familiar with his various projects, including OCA.

So, with that "bio" out of the way, let me reply very briefly, then,
to your comment.

Project Gutenberg is focused on producing "structured digital
texts" (SDT) of public domain books. They have little interest in page
scans in and of themselves (which are images, not SDT) except as an
intermediary in the SDT production process.

DP also does book scanning, but at low resolution and only for SDT
production (they supply most of PG's etexts these days).

The Internet Archive manages a couple of projects to produce
high-quality scans of books (primarily as part of OCA), but those are
focused on large institutional collections and do NOT use volunteers
in any meaningful capacity.

Thus, the interest in creating a volunteer-driven book scanning
project, Distributed Scanners. That is the topic of the post: to see
if the idea has legs and, if so, how it should be organized. I believe
there is a need, and it would certainly work with other existing
scanning projects, such as OCA. It would not work in a vacuum,
oblivious to all else that is going on.

Note that I *have* posted the DistScan group announcement to the PG
list already, since there are a few PG/DP volunteers interested in
the DistScan idea.

Again, thanks for your reply.

Jon Noring
