Scanning a lot of 35mm slides

  • Thread starter: Chris Schomaker
  • Start date
The range of solutions, both in price and complexity, varies greatly too.
So far, most of the discussion here has been on single-user applications to
archive photos. The example commonly used is a home user who wants some
way to archive their images and still have it in a useable, stable format
10 or 20 years or more from now.
ThumbsPlus can be single-user or networked, and many image
companies use the networked version. Because TP uses an industry-standard
database format, it is also unlikely that the files would become
inaccessible further down the line.
 
Mr. Grinch said:
The range of solutions, both in price and complexity, varies greatly too.
So far, most of the discussion here has been on single-user applications to
archive photos. The example commonly used is a home user who wants some
way to archive their images and still have it in a useable, stable format
10 or 20 years or more from now.
......

I'm no database expert for sure, but the database problem doesn't seem
insurmountable, since most databases can be exported to something more
primitive--as long as you know in advance that you WANT to do this,
before the software becomes obsolete or unusable for other reasons. I'm
using IMatch (because it's cheap and it does all the things I want to do),
and the following quote comes from the IMatch FAQ:

=Exporting your data
=All the data stored in the IMatch database can be exported in various ways,
=either as XML or text. You can export your metadata (comments, annotations,
=categories, and the schema database itself) at any time using the built-in
=export functions.
=Using the built-in scripting language, you can export your image database
=contents in even more ways, to create custom output formats or to feed even
=the most exotic applications.

=Since IMatch scripting is COM-compliant you can access all
=ODBC-compatible databases using the built-in database interface.
=Hence transferring data from IMatch to an SQL database or another
=database system is no problem.

I haven't actually done this, of course--I'm still working on getting
my 40,000 slides scanned. The big question of how much non-image
information to put with them is still ahead. (But IMatch does have
built-in IPTC and EXIF editors, if one of those is your method of choice.
I'm inclined to stick with categories for the moment.)
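Just to illustrate the sort of XML/text export the FAQ describes, here is a rough Python sketch; the field names and records are made up for illustration, not IMatch's actual schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata records -- the field names are illustrative,
# not IMatch's actual schema.
images = [
    {"file": "slide_0001.tif", "category": "Family/1972", "comment": "Lake trip"},
    {"file": "slide_0002.tif", "category": "Travel/Alps", "comment": "Matterhorn"},
]

def export_xml(records, path):
    """Write image metadata out as a plain XML file."""
    root = ET.Element("catalog")
    for rec in records:
        img = ET.SubElement(root, "image", file=rec["file"])
        ET.SubElement(img, "category").text = rec["category"]
        ET.SubElement(img, "comment").text = rec["comment"]
    ET.ElementTree(root).write(path)

export_xml(images, "catalog.xml")

# The file is plain text, readable by anything that parses XML.
reloaded = ET.parse("catalog.xml").getroot()
print([img.get("file") for img in reloaded.findall("image")])
# -> ['slide_0001.tif', 'slide_0002.tif']
```

The point is only that once the data is out in a self-describing text format like this, no particular database product is needed to read it back.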

Gary Hunt
 
Gary L Hunt said:
"Mr. Grinch" <[email protected]> wrote in message
.....

I'm no database expert for sure, but the database problem doesn't seem
insurmountable, since most databases can be exported to something more

Basically (in database vernacular), it depends on the "normal form".
As long as the database is in first normal form, or even second, the tables
can *usually* be exported easily.

Beyond one or two tables in a true relational database, exporting to a
text file in columnar, space/tab, or comma-delimited form (using columns,
spaces, or commas to separate the fields)
can get a bit confusing without a scorecard, yet all of the tables can
be exported. Recombining them and restoring the original relationships
is something else.
With care, and possibly a lot of time, a standardized set of tables can be
reconstructed into the original relational database.

If I remember correctly, first normal form means the tables and fields
are such that the table can be deconstructed and reconstructed to the
original form. (An oversimplified statement, and we had to learn the darn
things out to 8th or 9th normal form.)
primitive--as long as you know in advance that you WANT to do this,
before the software becomes obsolete or unusable for other reasons. I'm
using IMatch (because it's cheap and it does all the things I want to do),
and the following quote comes from the IMatch FAQ:

I think, as is mentioned elsewhere, that recovery from a database, as long as
it is only one or two tables, would be relatively easy, and I know of no
modern databases that are not backward compatible for at least several
versions. I.e., the new versions can read the old ones even if the old
cannot read the new. (Unfortunately.)
=Exporting your data
=All the data stored in the IMatch database can be exported in various ways,
=either as XML or text. You can export your metadata (comments, annotations,
=categories, and the schema database itself) at any time using the built-in
=export functions.
=Using the built-in scripting language, you can export your image database
=contents in even more ways, to create custom output formats or to feed even
=the most exotic applications.

=Since IMatch scripting is COM-compliant you can access all
=ODBC-compatible databases using the built-in database interface.
=Hence transferring data from IMatch to an SQL database or another
=database system is no problem.

I haven't actually done this, of course--I'm still working on getting
my 40,000 slides scanned. The big question of how much non-image

40,000? I hope you are young. You are looking at a substantial project
that could easily run a year or more.
information to put with them is still ahead. (But IMatch does have
built-in IPTC and EXIF editors, if one of those is your method of choice.
I'm inclined to stick with categories for the moment.)
In Windows XP there is a data form attached to each file allowing the entry
of the author/photographer, subject, date, remarks, and... I forget. I'd
have to go look up what all can be put in the additional
information... Oh, one more is keywords.

Agent is ailing right now so I'm back to OE for the time being. That means
I've lost part of this thread and remarks.

Busy night. I've received 7 viruses in the last three hours. (all have been
flushed into that great bit bucket in the sky) It looks like a bunch of
infected computers out there. Actually, I'm receiving 2 or 3 every 7
minutes.

Roger Halstead (K8RI, EN73 & ARRL Life Member)
N833R, World's Oldest Debonair (S# CD-2)
www.rogerhalstead.com
 
CSM1 said:
The name of the program is ThumbsPlus 6.0 Pro.
EXIF and IPTC information can be written into the image files.

how can I write EXIF information into a JPG file?

Wolfgang
 
Thank you for the link. It's a good starting point for people who are
deciding to digitize their photos.

You might spend a section or two on scanner selection and resolution
selection.

First, thanks. If I can keep it in memory long enough I'll address
the storage situation and I'll make a few remarks here, but I want to
address the database issue; it looks like that will be a separate
post. I'm trying to keep all these discussions for points to include
on the http://www.rogerhalstead.com/scanning.htm page.

One thing I want to add: this is the type of work I used to do as a
project manager for a large corporation. I was the project manager for
implementing an FDA-validated Laboratory Information Management System
(LIMS). This had to meet all FDA requirements as well as records
retention for corporate records and the corporate records retention
plan.
As far as archiving for longest possible storage goes, I've been doing
some thinking on this. The main point which you share, and which can't
be stressed enough, is that all storage formats have their limitations,
so the user in the end has to be aware of the care of their particular
format. The lifetime of a given format can actually be two different
things. It might be limited by deterioration. But it can also be
limited by obsolescence of the hardware or format, even if it has not
deteriorated.

To some degree, deterioration of any of the digital formats can be
managed by use of error correction in the archiving process. Some
formats have built-in error correction, but you can add more with various
data formats. You can add a lot of redundancy this way.

The problem is few end users have applications, or hardware that uses
error correction. Checksum in this case is not sufficient, nor is
mirroring. It requires complete backups that are compared to the
original files. However for most practical purposes, using a rolling
backup with two copies is usually sufficient. It would be nice to
have a simple file comparison application that could be set to do a
byte level comparison between backups and the originals, and/or
between each other. This is time consuming, but can more or less, be
automated.
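A byte-level comparison tool along those lines is nearly trivial with Python's standard library; a sketch (the directory paths are up to you):

```python
import filecmp
import os

def verify_backup(original_dir, backup_dir):
    """Compare every file in the backup byte-for-byte with the original.

    Returns a list of relative paths that are missing or differ.
    filecmp.cmp(..., shallow=False) reads both files in full rather
    than trusting size/timestamp, which is the point of the exercise.
    """
    bad = []
    for root, _dirs, files in os.walk(original_dir):
        for name in files:
            src = os.path.join(root, name)
            rel = os.path.relpath(src, original_dir)
            dst = os.path.join(backup_dir, rel)
            if not os.path.exists(dst) or not filecmp.cmp(src, dst, shallow=False):
                bad.append(rel)
    return bad
```

Run it against the two copies of a rolling backup and it reports anything that has silently diverged; scheduling it is then just a matter of the system's job scheduler.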

Maybe I'll play around with that in VB, VC++, or Delphi. I
happen to like Delphi as it's easier to use. <:-))
Obsolescence of a format can't be helped, though, other than by keeping
around redundant hardware to read that format.

I'd say it's easier to migrate to new media than to try to maintain a
format that is going obsolete.
My own thoughts are that out of the various digital formats available,
the common ones are optical disk (dvd, cd), magneto optical, magnetic
tape, and magnetic disk. Of those formats that are easy to obtain, I
would lean towards magnetic disk as being the most stable that I've used

Although magnetic disks (HDs) are fast, high density, high capacity, and
very reliable, they do not provide well for data integrity. Nor is the
"shelf life" of magnetic media considered reliable enough to be
considered archival. It can be used as such if it is updated, or
rather "refreshed", on a regular basis. Unfortunately it is during the
frequent updates and refreshes that data integrity suffers,
bearing in mind that most data integrity failures are human-caused.
in practice, but maybe some people have thoughts as to other digital
formats that are more stable. Of magnetic disks, SCSI seems to be a
format that's survived and will continue to be around for some time.

SCSI is not a format but rather a protocol for transferring data and
controlling the drives. Currently the new kid on the block that is
receiving good ratings is the serial (SATA) drive. This is not the same as
the serial port. In addition, the serial RAIDs all use the same
protocol, whereas the IDE and EIDE RAIDs do not. So you can take a
drive from one serial RAID and put it into another. This capability
is not nearly so common with IDE and EIDE RAIDs.
Also, if it comes down to data recovery, in my experience the best
recovery resources (like Ontrack) and best results are had for recovering
data off drives vs tape or optical. So if I had to pick something today,
I'd use RAR or PAR to archive the images with recovery data to an
enterprise class SCSI drive. If I was really serious about being able to

With proper backups the use of these services should never become
necessary. They are available to retrieve data when all else fails
and they are expensive. Data recovery is very important and backups
should eliminate the need for tedious, time consuming and expensive
services.

There are relatively simple applications available to recover data
from FAT16, FAT32, and NTFS volumes. I believe several are free, or at
least inexpensive. They don't care whether the drive is SCSI, IDE, or
EIDE. Now all I need is for someone to remind me of the names of
these applications.
get the data off 20 years from now, I'd get two PCs with SCSI to make
sure I could read the data off, and duplicate the drives. If I

Welll... remember it's the drive format and not the interface type. It
matters not whether they be SCSI, IDE, EIDE, or serial. What matters is the
filesystem: FAT or NTFS. You can easily transfer data from a SCSI drive
to an EIDE one. When you upgrade a computer, you just transfer the files
to the new drive (after backing them up).
experienced SCSI drive failure, I feel my best bet would be to get the
recovery services of someone like Ontrack. I'd be tempted to call them
and ask in more detail what their expected ability to recover various
formats over time might be.

Having a good backup is far cheaper and much faster.
Take it all with a grain of salt. I do. This topic just got me thinking
and I appreciate the info shared and the link.

You've been keeping me thinking <:-))
That and I should know better than to type this large a post, let
alone doing it on a computer with a scanner running that tends to
crash the whole computer.

Roger Halstead (K8RI & ARRL life member)
(N833R, S# CD-2 Worlds oldest Debonair)
www.rogerhalstead.com
 
A few thoughts on databases and their recovery in general as I think
it might help to understand how they work when used in photography.

I'm going to get a bit on the basic side here, but I think it might
help some reading this.

Relational databases consist of a series of files called tables. They
are called relational because fields in one table point to (are linked
to) records in other tables. I.e., table one is related to table two
through field 3, where for example field 3 is the photographer's name,
or it could be keywords. Doing this on keywords can greatly speed
searches.

It may be a bit of an oversimplification, but let's say table one is the
table containing the image IDs and information such as you can bring
up by right-clicking on an image and selecting Properties. In table
one you create a record whose field names match the field names
in the Properties window. From there it is relatively easy to write a
routine to import the contents of the Properties into the record. OTOH,
most databases will already do this without the user having to do more
than click a few buttons.

Now you create another table with the photographers' names. In the
case of a family this might only be 2 or 3 individuals. Each record
would contain the desired information about the photographer. Were
you cataloging photos for a number of photographers, as in a museum or
show, this would make a lot more sense.

But for now we have one table that contains the photo IDs with the
pertinent information and the photographer's name. We have a second
table that contains the information on the photographers.

The next step is to identify and activate the link between the name
field for the photos and the table containing the photographers. NOW
the database is relational, in that the two tables are related.

If we add a new photo to the one table and it's by a new photographer,
the database will automatically open a new record for that
photographer so you can fill in the information.

There are several ways to back up the database. It can be backed up as
a whole, or the individual tables can be exported. You can export the
tables in a number of fashions, but they will be text files unless you
back up the table in its entirety. Some databases can get pretty
confusing at this point.

The most common is comma delimited; then there are space
delimited, tab delimited, and columnar (fixed-width) formats.
These are the most common and should be pretty much self-explanatory.
For instance, a comma-delimited file has a comma separating each
field from the next. In a columnar file each field is located in a
column dedicated to that field only.

Having gone this far, we can now export all the records in the photo
table as comma-delimited text. We can then export the records
from the photographers table in the same manner.

We now have two text files that are completely independent of each
other and easily readable. It should also be readily apparent, if you
wish to create a database from these two files, how they are related.
You create a new database, create the tables, and import each file
into the proper table. The only thing left is to recreate the link
between the tables.
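For anyone who wants to try this, the whole two-table walkthrough can be reproduced with Python's built-in sqlite3 module; the table and field names below are invented to match the example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Table two: the photographers.
cur.execute("CREATE TABLE photographers (name TEXT PRIMARY KEY, info TEXT)")
cur.executemany("INSERT INTO photographers VALUES (?, ?)",
                [("Gary", "slides since 1960"), ("Roger", "K8RI")])

# Table one: the photos, related to table two through the name field.
cur.execute("""CREATE TABLE photos (
    id INTEGER PRIMARY KEY,
    file TEXT,
    photographer TEXT REFERENCES photographers(name))""")
cur.executemany("INSERT INTO photos (file, photographer) VALUES (?, ?)",
                [("slide_0001.tif", "Gary"), ("slide_0002.tif", "Roger")])

# The link between the tables is exercised with a join.
cur.execute("""SELECT photos.file, photographers.info
               FROM photos JOIN photographers
               ON photos.photographer = photographers.name""")
print(cur.fetchall())

# Each table can also be dumped independently as comma-delimited text:
for row in cur.execute("SELECT * FROM photos"):
    print(",".join(str(v) for v in row))
```

The two dumps are exactly the pair of flat files described above; re-importing them and re-declaring the REFERENCES link reconstructs the relational database.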

Programs such as "Thumbs Plus" do this work for you and a number of
other things as well.

Sooo... It's easy to find the information and it's easy for databases
to import it. You could conceivably create a table for each of the
image properties fields including remarks. Any time the information
for multiple images is the same in a particular field that field can
become a link to another table containing the information rather than
having to write the information into that field for every image.

OTOH the more tables the photographer uses the more difficult it
becomes to recreate the original database from a group of text files.
Incidentally these files are usually called flat files in that they
are a flat text file that contains a number of lines of text and
connect to nothing else. They are "stand alone".

As an aside, in the "old days" they used to take a series of tables in
VMS that were flat files and create links to other tables in the
search statements. These served the same purpose as a relatively
simple relational database, but were called "forced relational
databases". I've seen them with more than 100 tables, some of which
had well over a hundred fields, while the database contained millions
of records. The links, which didn't really exist in the relational
sense, were in SQL statements which might be a bit complicated to
explain here, and I'm not sure I remember them all.

Now you know why I'm trying to address this *stuff* on the web
page <:-))

Roger Halstead (K8RI & ARRL life member)
(N833R, S# CD-2 Worlds oldest Debonair)
www.rogerhalstead.com
 
Roger Halstead said:
One thing I want to add: this is the type of work I used to do as a
project manager for a large corporation. I was the project manager for
implementing an FDA-validated Laboratory Information Management System
(LIMS). This had to meet all FDA requirements as well as records
retention for corporate records and the corporate records retention
plan.

Yeah, it's a real eye opener when you have to develop a document/image
management system like that. I'm sure a lot of lessons were learned!
The problem is few end users have applications, or hardware that uses
error correction. Checksum in this case is not sufficient, nor is
mirroring. It requires complete backups that are compared to the
original files. However for most practical purposes, using a rolling
backup with two copies is usually sufficient. It would be nice to
have a simple file comparison application that could be set to do a
byte level comparison between backups and the originals, and/or
between each other. This is time consuming, but can more or less, be
automated.

Maybe I'll play around with that in VB, VC++, or Delphi. I
happen to like Delphi as it's easier to use. <:-))

What comes to mind first is PAR2, which both generates the file accuracy
check and creates and processes error-correction data, in whatever
ratio you want. Other methods which just check for accuracy include SFV
and MD5.
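For the accuracy-check half of that (detection only, no correction; PAR2 adds the correction part), an MD5 manifest is a few lines of Python with hashlib:

```python
import hashlib
import os

def md5_of(path):
    """Hash a file in chunks so large scans don't need to fit in RAM."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(paths, manifest):
    """Record the current MD5 of each file, one per line."""
    with open(manifest, "w") as m:
        for p in paths:
            m.write(f"{md5_of(p)}  {p}\n")

def verify_manifest(manifest):
    """Return the list of files whose current hash no longer matches."""
    failed = []
    with open(manifest) as m:
        for line in m:
            digest, path = line.rstrip("\n").split("  ", 1)
            if not os.path.exists(path) or md5_of(path) != digest:
                failed.append(path)
    return failed
```

Write the manifest when the archive is made, re-verify it after every copy or refresh, and silent corruption shows up as a non-empty list.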

This reminds me of another project I'm looking at, which is archiving my
CDs. A few off the shelf products can produce lossless copies of your
CDs, and allow you to save them back on to CDs or any other media,
complete with accuracy checks and error-correcting data. All done with
software off the web. Now scanned images are a little different than CD
digital audio, but still the process of adding and using some error
detection and correction can be the same for any digital file. The only
questions are which error detection/correction to use, and how much
redundancy do you want.
Although magnetic disks (HDs) are fast, high density, high capacity, and
very reliable, they do not provide well for data integrity. Nor is the
"shelf life" of magnetic media considered reliable enough to be
considered archival. It can be used as such if it is updated, or
rather "refreshed", on a regular basis. Unfortunately it is during the
frequent updates and refreshes that data integrity suffers,
bearing in mind that most data integrity failures are human-caused.

My own thought is not to rely on the media for integrity. For integrity
I would rely on more redundant data. But still, picking the media with
the best shelf life is important, I just don't know which media rates the
highest.
SCSI is not a format but rather a protocol for transferring data and
controlling the drives. Currently the new kid on the block that is
receiving good ratings is the serial (SATA) drive. This is not the same as
the serial port. In addition, the serial RAIDs all use the same
protocol, whereas the IDE and EIDE RAIDs do not. So you can take a
drive from one serial RAID and put it into another. This capability
is not nearly so common with IDE and EIDE RAIDs.

I just pick SCSI as an interface because it has (so far) had the best
compatibility over time. I.e., new SCSI cards can still read the old SCSI
drives. With IDE/ATA you've got several changes to BIOS and cabling
already, and it just makes it more confusing to get at the media, compared
to installing a known quantity (a SCSI card) to get at a known drive.
With proper backups the use of these services should never become
necessary. They are available to retrieve data when all else fails
and they are expensive. Data recovery is very important and backups
should eliminate the need for tedious, time consuming and expensive
services.

No problem there. Some people are equipped to take care of their backup
needs. To me, that means not just doing backups, but the ability to
carry out restores when needed. Some people aren't equipped to do this
or don't want to, for any reason. In those cases, the data recovery
services are where I point them to. Free support from me gets to be a
little tiring ;-)
There are relatively simple applications available to recover data
from FAT16, FAT32, and NTFS volumes. I believe several are free, or at
least inexpensive. They don't care whether the drive is SCSI, IDE, or
EIDE. Now all I need is for someone to remind me of the names of
these applications.

Yeah, I've noticed some people are talking about them in the newsgroup
comp.sys.ibm.pc.storage. I think it depends on what the data is worth to
you. If it's worth a lot and time is important, I'd pay a service to get
the job done right the first time. Of course the software is limited in
what it can do. It can't replace board electronics, motors, or fix spin-
up issues. The good data recovery services can.
Welll... remember it's the drive format and not the interface type. It
matters not whether they be SCSI, IDE, EIDE, or serial. What matters is the
filesystem: FAT or NTFS. You can easily transfer data from a SCSI drive
to an EIDE one. When you upgrade a computer, you just transfer the files
to the new drive (after backing them up).

Some people might argue that enterprise class devices have lower MTBFs
etc. Not that I would bet my money on it if the data were important to
me. Typically the cheaper drives have much lower ratings though. To
some people it's important, but not everyone. You can even find
enterprise class ATA drives now with different ratings from their retail
cousins.
Having a good backup is far cheaper and much faster.

Having both options open is better still, depending on how important the data
is. I've run into the situation where the backup did not restore as it
should have, the people responsible for the backup were absentee, and my
only option was data recovery. It cost a bit but nothing compared to
what the data was worth. I was a satisfied customer of Ontrack that
week! Over the years I've had lots of cases where someone could not get
data off a drive and I've been asked to help. I've tried out the various
tools and tricks and been able to get data off the drives in many cases,
but in others where the data and time were critical, my best suggestion
was to use a good data recovery service.

No matter how many times I've told friends and various people how to
backup and safeguard their data, they almost never do it. The only
person I've had any luck with is my sister, who lost a lot of Maya
renderings and learned the hard way about backups. You can't get people
to help themselves sometimes, but you can give them the name of some data
recovery experts. It's that or spending long nights doing "favours" for
people ;-)
You've been keeping me thinking <:-))
That and I should know better than to type this large a post, let
alone doing it on a computer with a scanner running that tends to
crash the whole computer.

Again, thanks for the link, it's a great primer for someone like me who's
looking to archive their old photos.
 
I have a Mac G4. When I scan my pictures and tweak them in Photoshop 7,
then save them as TIFF files, I am asked to choose either IBM PC or Mac
byte order. I usually choose Mac.
Now, when I make JPEGs of these photos to save to disc and send
to others, some of them have complained that the photos are much too
big on their monitor and print out too big. One person was able to fix
the size on the monitor but not on the printout. All of the people I
refer to have PCs.
Does anyone in this newsgroup have any experience with this
problem? Can anyone solve it for me? If the byte order is the problem,
is there any way that I could change the byte order on hundreds of these
photo files at one time, or would I have to go through and re-save every
TIFF file under a different byte order and make new JPEG files?

thanks for any help,
Bill
 
Before saving as Jpeg in Photoshop, Use Image | Image Size and
reduce the height/width.

Byte order is of no concern.
 
"Bill Bojanowski" posted:
"... If the byte order is the problem, ..."

"Byte order" is NOT the problem.

When you go through the step "... I make JPEGs of these photos to save to disc and send
to others, ..." you need to resize the pictures to a smaller size prior to saving them.

I would suggest resizing to something like 4" x 6" at 72 dpi, then saving your JPEG file.
As long as you have your full size (unaltered) TIFF files, you have not lost anything.
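The arithmetic behind the 4" x 6" at 72 dpi suggestion is just inches times dpi; a throwaway helper makes the target pixel size explicit:

```python
def target_pixels(width_in, height_in, dpi):
    """Pixel dimensions for a given print size at a given resolution."""
    return round(width_in * dpi), round(height_in * dpi)

# 4" x 6" at 72 dpi, per the suggestion above:
print(target_pixels(4, 6, 72))  # -> (288, 432)
```

So a 288 x 432 pixel JPEG is all the recipients' monitors need; anything larger just spills off the screen and the page.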

Oh ... and BTW ... regarding your statement:
"... I am asked to choose either IBM PC or Mac Byte order. I usually choose Mac. ..."

You can safely choose either. Actually, ANY program that has the capability to handle TIF
/ TIFF images ... by definition (as "in the TIFF spec") ...MUST be able to handle BOTH
byte orientations.
 
Mac McDougald said:
Ummm, ThumbsPlus 6.0 Pro?

not possible! I own ThumbsPlus 6.0 Pro; only comments in EXIF fields can
be entered. Entering Make, Model, f-stop etc. is not possible.
Lots of other programs too;

please name them, please

Exifer is a popular freebie.

yes, I know also. But as in Thumbs Plus there is no way to enter the fields
described above

I scan slides taken with my F5. All data are available from my MF-28 and
retrieved by the Photo Secretary software. I'm just looking for a way to enter
these values into the correct EXIF fields.

Wolfgang
 
Wolfgang Exler said:
not possible! I own ThumbsPlus 6.0 Pro; only comments in EXIF fields can
be entered. Entering Make, Model, f-stop etc. is not possible.


please name them, please



yes, I know also. But as in Thumbs Plus there is no way to enter the fields
described above

I scan slides taken with my F5. All data are available from my MF-28 and
retrieved by the Photo Secretary software. I'm just looking for a way to enter
these values into the correct EXIF fields.

Wolfgang

With ThumbsPlus 6.0 Pro you can create User Data fields with the EXIF names
and store the information in the ThumbsPlus database. If the files have exif
information, re-making the thumbnails will populate the User Data fields if
the correct field names are used and the field is a text field with at least
20 characters.

With ThumbsPlus 6 and User data fields, you can write anything in TEXT.

Some EXIF info and Thumbsplus 6
http://www.cerious.com/manual6/relnotes6.shtml#metadata

Some cameras do not use all of these fields.
Case is important.
EXIF Field names:
---
Aperture
BitsPerSample
Brightness
Colorspace
Compression
Copyright
DateTimeDig
DateTimeMod
DateTimeOrig
Distance
ExifHeight
ExifWidth
ExposureBias
ExposureProg
ExposureTime
FileSource
Flash
FocalLength
FocalResUnit
FreqResponse
Fstop
ImageDescr
ImageHeight
ImageWidth
ISOSpeed
LightSource
Make
MakerNote
MaxAperture
MeteringMode
Model
Orientation
PlanarConfig
RefBlackWhite
ResolutionUnit
SceneType
ShutterSpeed
UserComment
Version
XResolution
YResolution
--
(The following is from a message on news.cerious.advanced dated 06/28/2001)
Subject line Re: Copy EXIF to UDF
ThumbsPlus can now automatically extract information from EXIF JPEG files
and store it in database fields. To enable this, simply create user fields
with the same names as the EXIF fields. Of course, you will need to re-make
any thumbnails to populate the database. Below is a list of the most useful
field names. Note that not all camera manufacturers use all of the fields,
and many manufacturers have additional data stored in the "MakerNote" that
ThumbsPlus does not interpret. We will evaluate these on a case-by-case
basis and handle them where possible.

Aperture
BitsPerSample
Brightness
Colorspace
Compression
Copyright
DateTimeDig
DateTimeMod
DateTimeOrig
Distance
ExifHeight
ExifWidth
ExposureBias
ExposureProg
ExposureTime
FileSource
Flash
FocalLength
FocalResUnit
FreqResponse
Fstop
ImageDescr
ImageHeight
ImageWidth
ISOSpeed
LightSource
Make
MakerNote
MaxAperture
MeteringMode
Model
Orientation
PlanarConfig
RefBlackWhite
ResolutionUnit
SceneType
ShutterSpeed
UserComment
Version
XResolution
YResolution

Hope this helps!

Laura Shook
Cerious Software, Inc.
 
Roger Halstead said:
40,000? I hope you are young. You are looking at a substantial project
that could easily run a year or more.
Actually the secret seems to be that I waited until I retired to start,
although there is always the possibility of not living to finish:) But in
spite of all my gripes about Nikon's auto slide feeder, it has now
cranked out 9000 scans in 3 weeks without dying (and the jamming is
down to maybe once per day since I "operated" on it with a file.) The big
problem, of course, is that as time goes on, I'm eventually zeroing in
on the color correction issues, but there is no way I'm going to go back
and do the early ones over again. (The other problem is that after the
first week, you hear the slide feeder mechanism and the stepper motor in
your dreams.)

Does anyone know the limits of the "auto exposure" feature in NikonScan?
My experience is that with really underexposed slides that produce a
histogram with NO pixels above 150 or so, I can go back and crank up the
analog gain and pull some detail out on a re-scan. I know NikonScan
is varying the analog gain, because I can hear it slow down sometimes on
dark slides. But it clearly is not using anywhere near the full range of
the analog gain control, which means a lot of Photoshop levels work to
make these images even visible.
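That "no pixels above 150" test is easy to automate if you can get at the raw pixel values; a sketch with invented sample data (real use would read the values from the scanned file):

```python
def needs_rescan(pixels, ceiling=150):
    """Flag a scan as badly underexposed when no pixel value exceeds
    `ceiling` on the 0-255 scale, i.e. the histogram is empty above
    that point and a re-scan with more analog gain is worthwhile."""
    return max(pixels) <= ceiling

dark_scan = [12, 40, 88, 120, 149]      # nothing above 150
normal_scan = [15, 90, 180, 230, 255]   # full tonal range

print(needs_rescan(dark_scan))    # -> True
print(needs_rescan(normal_scan))  # -> False
```

Batch that over a directory of fresh scans and the candidates for an analog-gain re-scan sort themselves out, rather than eyeballing every histogram.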

Gary Hunt
 