Long term archiving??

A.F. Hobbacher

What is the best way for a long term archiving of scans?? Media,
formats, strategy??

Please give advice

Regards AFH
 
What is the best way for a long term archiving of scans?? Media,
formats, strategy??

Please give advice

Regards AFH

Media:

Firewire hard drive; DVD

Formats; Tiff

Strategy: 2 copies of every image - one on Firewire drive, one on DVD.
Also, proper back up software to back up intermediate images you're
working on, seamlessly, in the background, without intervention.
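
The "2 copies of every image" strategy could be scripted roughly as below. This is only a sketch: the function names (`sha256_of`, `archive_twice`) and the idea of a staging folder for the DVD burn are assumptions, not anything a particular backup product provides. It copies one scan to two places and checks that each copy matches the original.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_twice(source: Path, first_copy_dir: Path, second_copy_dir: Path) -> str:
    """Copy source into both directories and confirm each copy matches.

    The two directories are hypothetical stand-ins for the external drive
    and a folder that is later burned to DVD.
    """
    expected = sha256_of(source)
    for target_dir in (first_copy_dir, second_copy_dir):
        target_dir.mkdir(parents=True, exist_ok=True)
        copy_path = target_dir / source.name
        shutil.copy2(source, copy_path)  # copy2 also keeps timestamps
        if sha256_of(copy_path) != expected:
            raise IOError(f"copy at {copy_path} does not match the original")
    return expected
```

Keeping the returned digest somewhere alongside the image makes it possible to re-check either copy later.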
 
A.F. Hobbacher wrote in message said:
What is the best way for a long term archiving of scans?? Media,
formats, strategy??

Please give advice

Please define "long term".

There are several issues to consider. For example, how long does a DVD last
before the data deteriorates? What technology will be around in x years time
to retrieve the data when DVD is obsolete?

Past history suggests that paper is the most stable medium...

Mike
 
I don't think anyone has a really good answer for that. It's been a subject
for debate for many years. For large mainframes, the answer used to be reel
tape (first 7-, then 9-track), but even that has changed to various tape
cartridge formats now. Of course, even the large mainframes are becoming
extinct in many respects.

The problem is that the storage formats become obsolete. In 1980 we had
8-inch floppies (360 KB), followed by 5 1/4 inch in about 1982 (360 to 800
KB), then 3 1/2 inch (720 to 1440 KB). There is also a 2.88 MB 3 1/2"
format, but you rarely see it used. There have been and are other storage
formats around, such as ZIP drives, but they generally are not mainstream
enough to consider for persistent archiving.

As the new formats became available they were so much more convenient for
day-to-day use that the older formats fell into disuse, and then
unavailability. I doubt if you could find an 8-inch floppy drive or medium
for sale (new) any more, and if you could you would have to write drivers
for them. Even 5 1/4" floppies are essentially extinct, although I have
seen a few listed from time-to-time. The 3 1/2" is rapidly fading - many
new desktops and most laptops don't have them.

Currently most people are using CDs, and I suspect that they will persist
for at least a few years. Unfortunately, some of them do deteriorate
quickly, particularly the cheap ones. I've heard reports of CD disks
deteriorating in as little as a year, although I don't know the brands or
how they were stored (or even if it's true, I guess).

Some people are using DVDs, since they hold more, but they suffer from the
same deterioration as the CDs. Also, since there are competing formats for
DVDs, e.g. "+", "-", bilayer, etc., it isn't clear if there are DVD formats
that will outlast the CD.

I agree with Hecate that for safety, multiple formats are a good approach.
I also store mine on a separate hard drive, although I use a separate
internal drive rather than an external Firewire. I also keep copies on CDs
or DVDs, but am prepared to port them to a newer format as necessary.

BTW, I have some computer files that started out on audio cassette tape in
1977, then migrated to 5 1/4" floppies, 3 1/2" floppies, Travan tape, and
now on both hard drives and CDs. Somehow I skipped the 8" floppies :-)

I agree with Mike to some extent, also, that paper is a good medium in that
it will likely outlast any other format. After all, we each have the
compatible "drives" built into us. Unfortunately, it is more difficult to
copy without quality loss, and the colors are susceptible to degradation
over the long term if not carefully stored.

In short, keep multiple copies on multiple media formats, and be prepared to
port them every few years.

Don


 
The problem is that the storage formats become obsolete.

Indeed! I saw a TV program recently about a major library in the U.K.
which a few short years ago enthusiastically embraced the laser disk
only to find themselves in trouble because laser disk players are no
longer available.

The gist of the program was that we run a risk of the current era
leaving no history because our contemporary media is so transient. In
particular, even if the media itself survives there will be no
readers.
I agree with Mike to some extent, also, that paper is a good medium in that
it will likely outlast any other format. After all, we each have the
compatible "drives" built into us. Unfortunately, it is more difficult to
copy without quality loss, and the colors are susceptible to degradation
over the long term if not carefully stored.

A few years back there was a system using a so-called "2D bar code".
In essence, a bunch of tiny black and white squares. The data is
compressed, checksummed and printed out. Decoding is done by scanning
the sheet.
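
The "compressed, checksummed and printed out" pipeline might look something like the sketch below. It is not the format that system actually used, which isn't described here: the 8-byte header layout, the chunk size and the function names are all assumptions. The printing and scanning steps are left out, so each chunk simply stands for the bytes one printed symbol would carry.

```python
import zlib


def encode_for_paper(data: bytes, chunk_size: int = 1000) -> list[bytes]:
    """Compress data, then cut it into CRC-protected chunks.

    Each chunk is: 4-byte index + 4-byte CRC-32 of the body + body.
    (Hypothetical layout, chosen only for illustration.)
    """
    packed = zlib.compress(data, 9)
    chunks = []
    for index, offset in enumerate(range(0, len(packed), chunk_size)):
        body = packed[offset:offset + chunk_size]
        header = index.to_bytes(4, "big") + zlib.crc32(body).to_bytes(4, "big")
        chunks.append(header + body)
    return chunks


def decode_from_paper(chunks: list[bytes]) -> bytes:
    """Check every chunk's CRC, reorder by index, and decompress."""
    bodies = {}
    for chunk in chunks:
        index = int.from_bytes(chunk[:4], "big")
        stored_crc = int.from_bytes(chunk[4:8], "big")
        body = chunk[8:]
        if zlib.crc32(body) != stored_crc:
            raise ValueError(f"chunk {index} failed its checksum")
        bodies[index] = body
    packed = b"".join(bodies[i] for i in sorted(bodies))
    return zlib.decompress(packed)
```

Because every chunk carries its own index and checksum, sheets can be scanned in any order and a damaged one is detected rather than silently decoded.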

Even though this combines the best of both worlds - the durability of
paper and the absence of degradation of a digital format - it never
really caught on. Still, on a conceptual level, I liked the lateral
thinking of the approach.

(The other) Don.

P.S.
BTW, I have some computer files that started out on audio cassette tape in
1977, then migrated to 5 1/4" floppies, 3 1/2" floppies, Travan tape, and
now on both hard drives and CDs. Somehow I skipped the 8" floppies :-)

To further the identity confusion, this Don pretty much followed the
same path only I never used audio cassette tape but I did use 8"
floppies (at work).
 
A few years back there was a system using a so-called "2D bar code".
In essence, a bunch of tiny black and white squares. The data is
compressed, checksummed and printed out. Decoding is done by scanning
the sheet.

Even though this combines the best of both worlds - the durability of
paper and the absence of degradation of a digital format - it never
really caught on. Still, on a conceptual level, I liked the lateral
thinking of the approach.

This thread borders on the ridiculous. Durability of paper? Screw
durability if the paper is so voluminous that archiving it becomes
either unsearchable or unmanageable.

Archiving? This is not the Greeks attempting to build libraries that
last forever. Archiving is an ever-ongoing process of moving data
from this to that to that ad infinitum.

The real problem with archiving is the fact that, in the end, few
really want to put the monetary resources in the budgets to have a
plan that survives time. It is a cost that is unpopular to politically
support.

IT professional X will die in Y. Will he be rewarded for spending,
being the proponent of an archiving scheme that he could care less
about when he goes belly up?
The Not So Fine Art Of Google - Go To Top Of Thread
http://makeashorterlink.com/?E29A321E6
 
This thread borders on the ridiculous. Durability of paper?

The elegance of lateral thought is obviously lost on you so let's just
focus on pragmatic aspects:

We *know* that paper lasts for hundreds (plural) of years, we *guess*
that assorted digital media is good for decades, at best. Even
gold/gold CDs are only "guestimated" at around 80 years.

As (the other) Don mentioned, we come equipped with built-in paper
"readers". We actually have two of them. And because we are visual
animals there will always be a way to convert (and therefore decode)
this paper into the digital domain. Not true with digital readers as
media changes constantly.

I can, literally, throw a piece of paper out of an airplane and it
will survive the fall. Try that with a hard disk.

I can fold a piece of paper and still be able to read it. Try that
with a CD.

I can even tear paper into many pieces and still be able to read it.
Try that with a floppy.

I can heat paper to 100 degrees Celsius and nothing will happen. Try
that with any digital media and it won't survive even half that.

I can expose paper to a variety of equipment (common in homes)
emitting magnetic pollution or spill assorted beverages (also common
in homes) over it and it will survive. Try that with conventional
digital media of your choice.

Etc, etc, etc...

Methinks, at least by comparison, that's pretty durable...

So why aren't we using paper (more) to archive digital data? Because
of relatively low data density of paper. But that's simply a question
of priorities and convenience and has nothing to do with durability.

Don.
 
Don said:
The elegance of lateral thought is obviously lost on you so let's just
focus on pragmatic aspects:

We *know* that paper lasts for hundreds (plural) of years, we *guess*
that assorted digital media is good for decades, at best. Even
gold/gold CDs are only "guestimated" at around 80 years.

As (the other) Don mentioned, we come equipped with built-in paper
"readers". We actually have two of them. And because we are visual
animals there will always be a way to convert (and therefore decode)
this paper into the digital domain. Not true with digital readers as
media changes constantly.

I can, literally, throw a piece of paper out of an airplane and it
will survive the fall. Try that with a hard disk.

I can fold a piece of paper and still be able to read it. Try that
with a CD.

I can even tear paper into many pieces and still be able to read it.
Try that with a floppy.

I can heat paper to 100 degrees Celsius and nothing will happen. Try
that with any digital media and it won't survive even half that.

I can expose paper to a variety of equipment (common in homes)
emitting magnetic pollution or spill assorted beverages (also common
in homes) over it and it will survive. Try that with conventional
digital media of your choice.

Etc, etc, etc...

Methinks, at least by comparison, that's pretty durable...

So why aren't we using paper (more) to archive digital data? Because
of relatively low data density of paper. But that's simply a question
of priorities and convenience and has nothing to do with durability.

Don.

Good points, All.

The problem with paper is storage.
Where do we (humans) keep all that paper? What about fire and water?

The more stuff written on paper the more problem we have finding the one
sheet of paper with the information that we are looking for.
 
Good points, All.

The problem with paper is storage.
Where do we (humans) keep all that paper? What about fire and water?

Environmental hazards destroy digital media too, so it's a draw at
best, although, as I pointed out, in many instances paper can take
more abuse.

The biggest problem is low data density of paper (even with
compression). So everyone, me included, continues using CDs and
whatever is around the corner...
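
To put a rough number on "low data density", here is a small calculation. The 50 KB-per-sheet figure is purely an assumed, illustrative capacity (not a measured one for any real bar code system), and the 50 MB scan is a hypothetical file size.

```python
import math


def pages_needed(file_bytes: int, bytes_per_page: int) -> int:
    """Number of printed sheets needed to hold file_bytes of encoded data."""
    return math.ceil(file_bytes / bytes_per_page)


# ASSUMPTION: 50 KB per sheet is an illustrative figure, not a measured one.
ASSUMED_BYTES_PER_PAGE = 50_000

scan_bytes = 50 * 1024 * 1024   # a hypothetical 50 MB scan
cd_bytes = 700 * 1024 * 1024    # a nominal 700 MB CD-R

print(pages_needed(scan_bytes, ASSUMED_BYTES_PER_PAGE))  # 1049
print(pages_needed(cd_bytes, ASSUMED_BYTES_PER_PAGE))    # 14681
```

Under that assumption a single scan is already a ream or two of paper, and one CD's worth runs to well over ten thousand sheets.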

Interestingly, even though computers were supposed to lead to a
"paperless office" it turns out they increased the usage of paper
exponentially. Human nature... If we make a typo on a typewriter we
use whiteout because we're not going to retype the whole page again.
If a computer printout is 0.00001 nanometers off, we throw it away and
print another copy because it's so easy. Well, you know what I mean...
;o) And so we make more forests disappear...
The more stuff written on paper the more problem we have finding the one
sheet of paper with the information that we are looking for.

I have the same problem finding individual files on CDs... ;o)
But seriously, good indexing and filing applies equally to both.

Don.
 
I have the same problem finding individual files on CDs... ;o)
But seriously, good indexing and filing applies equally to both.
That's where searchable databases, etc. come in. For me, the best
solution is save it, database it, resave it when there's new
technology or every three years, whichever comes sooner. I have all
my image CDs flagged, dated and in a database which tells me when
it's time to resave and so forth.
Of course, when I die no doubt someone will come along and say,
"what's all this crap?" and hit the off switch. However, until then...
;-)
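
For anyone wanting to try the "save it, database it, resave it" routine, a minimal sketch with SQLite follows. The table layout, the function names and the disc labels are assumptions for illustration, not the poster's actual database; only the three-year interval comes from the post.

```python
import sqlite3
from datetime import date

RESAVE_YEARS = 3  # the "every three years" interval from the post


def open_catalog(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a catalog with one row per archived disc."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS discs ("
        "label TEXT PRIMARY KEY, burned_on TEXT NOT NULL)"
    )
    return conn


def add_disc(conn: sqlite3.Connection, label: str, burned_on: date) -> None:
    """Record the date a disc was written (ISO format keeps it sortable)."""
    conn.execute(
        "INSERT OR REPLACE INTO discs VALUES (?, ?)",
        (label, burned_on.isoformat()),
    )
    conn.commit()


def discs_due(conn: sqlite3.Connection, today: date) -> list[str]:
    """Labels of discs burned RESAVE_YEARS or more before today."""
    due = []
    rows = conn.execute("SELECT label, burned_on FROM discs ORDER BY label")
    for label, burned_on in rows:
        burned = date.fromisoformat(burned_on)
        # Whole years elapsed, minus one if the anniversary hasn't arrived yet.
        age_years = today.year - burned.year - (
            (today.month, today.day) < (burned.month, burned.day)
        )
        if age_years >= RESAVE_YEARS:
            due.append(label)
    return due
```

Running `discs_due(conn, date.today())` now and then gives the list of discs to copy onto fresh media.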
 
Of course, when I die no doubt someone will come along and say,
"what's all this crap?" and hit the off switch. However, until then...
;-)

Bingo! I've been agonizing over scanning whatever's left of my slides
and photographs for over a year now trying to optimize and streamline
before I start. Of course, when I'm gone it will all be
unceremoniously trashed by someone...

Oh well, in the meantime it keeps me off the streets... ;o)

Don.
 
The elegance of lateral thought is obviously lost on you..

What exactly makes "lateral thought" elegant? Please define this
"lateral thought".
We *know* that paper lasts for hundreds (plural) of years, we *guess*
that assorted digital media is good for decades, at best. Even
gold/gold CDs are only "guestimated" at around 80 years.

As (the other) Don mentioned, we come equipped with built-in paper
"readers". We actually have two of them. And because we are visual
animals there will always be a way to convert (and therefore decode)
this paper into the digital domain. Not true with digital readers as
media changes constantly.

I have a simple PDF 417 barcoding compression routine that will read
and store the data in several different formats. The compression is
"elegant", the bar code is "lateral", so maybe this is what the hell
you mean.
I can, literally, throw a piece of paper out of an airplane and it
will survive the fall.

How do you know? How high? What weather? What if it lands in the
Yangtze River? Do you always deal in ridiculous, untested and unproven
absolutes followed by meaningless cliche after cliche?
Try that with a hard disk.

Done it. Inside military grade laptops. They were fine.
I can fold a piece of paper and still be able to read it. Try that
with a CD.

I can read a CD in a CD reader. Try that with paper.
I can even tear paper into many pieces and still be able to read it.
Try that with a floppy.

I can drop a floppy in water and it won't smear. Try that with laser
output.
I can heat paper to 100 degrees Celsius and nothing will happen.

I HAVE laptops with our software in it in higher temps than that. Run
fine.
Try
that with any digital media and it won't survive even half that.

Bzzt. Wrong.
I can expose paper to a variety of equipment (common in homes)
emitting magnetic pollution or spill assorted beverages (also common
in homes) over it and it will survive.

Really? The paper may survive, maybe, but will the print? Not a
chance.

Bzzzt. Strike 2.
Try that with conventional
digital media of your choice.

Have. Done fine with it.
Etc, etc, etc...

Etc Etc Etc
Methinks, at least by comparison, that's pretty durable...

Methinks you need to methinks about this all over again.
So why aren't we using paper (more) to archive digital data? Because
of relatively low data density of paper.
Nope.

But that's simply a question
of priorities and convenience and has nothing to do with durability

Bzzzt. Strike 3, you're out.
 
Cruising Chrissy wrote in message ...
Bzzzt. Strike 3, you're out.

Come on, you know you're wrong. Paper and stone tablets have survived for
thousands of years through famine, plague and nuclear strikes. Even if
anyone does invent a digital medium that will last for 100 years there won't
be any machines around to read it (except in museums) or anyone that
remembers how to use them.

I'm not for one minute suggesting that paper is the way to go, simply
stating that there is not yet (and probably never will be) an input device
that can compete with the MKI human eyeball.

Mike
 
What exactly makes "lateral thought" elegant? Please define this
"lateral thought".

Viewing things not bound by conventional thought, thinking "sideways"
or, as the commonly overused phrase goes, "Thinking out of the box".

When this achieves a nominally unexpected but efficient solution it's
defined as "elegant". Similarly, for example in programming, a short
recursive routine is "elegant" when compared to a long jumbled mess of
an "unrolled" recursive routine.
I have a simple PDF 417 barcoding compression routine that will read
and store the data in several different formats. The compression is
"elegant", the bar code is "lateral", so maybe this is what the hell
you mean.

Not file formats, *media* formats. The subject you raised was:
Durability of paper?

If you just read the paragraph you're responding to, it says so quite
clearly: "as media changes constantly".
I can read a CD in a CD reader. Try that with paper.

No offence, but I don't think you're really following and I don't have
the time to explain.

Let's just agree to disagree agreeably.

Don.
 
Paper and stone tablets have survived for
thousands of years through famine, plague and nuclear strikes.

You think that even today paper (skip stone) is the only archiving,
semi-permanent route to take? You would be incorrect.
Even if
anyone does invent a digital medium that will last for 100 years there won't
be any machines around to read it (except in museums) or anyone that
remembers how to use them.

Bzzt. Strike two.
I'm not for one minute suggesting that paper is the way to go, simply
stating that there is not yet (and probably never will be) an input device
that can compete with the MKI human eyeball.

Bzzt. Strike Three. Two outs.
 
Viewing things not bound by conventional thought, thinking "sideways"
or lateral or, as the commonly overused phrase goes, "Thinking out of the box".

LOL

I live in an OTB world.
When this achieves a nominally unexpected but efficient solution it's
defined as "elegant". Similarly, for example in programming, a short
recursive routine is "elegant" when compared to a long jumbled mess of
an "unrolled" recursive routine.

Elegance is either in the function and/or the form.

Bzzt. Strike One said:
Not file formats, *media* formats. The subject you raised was:

PDF 417 can be printed on paper.
No offence, but I don't think you're really following and I don't have
the time to explain.

Oh, I am following a bunch of arcane, cliche ridden "data" which is
irrelevant to the world as it is today.
Let's just agree to disagree agreeably.

Why?

I would prefer to challenge you, and you me. Not disagreeable,
informative.
 
Viewing things not bound by conventional thought, thinking "sideways"
or, as the commonly overused phrase goes, "Thinking out of the box".

When this achieves a nominally unexpected but efficient solution it's
defined as "elegant". Similarly, for example in programming, a short
recursive routine is "elegant" when compared to a long jumbled mess of
an "unrolled" recursive routine.

But recursive routines have their limits too and are often over used.
First they teach the student how to write a recursive routine and then
they teach them when not to use them.

They fail miserably at what they do best when asked to do it too many
times.

They come with far too much baggage that has to be stored for each
recursion. I'd liken it to going to the latest and greatest relational
database only to discover that legal requirements mean you still have
to keep all the signed original documents and the filing systems that
go with them.
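
The "baggage" point can be seen in a small sketch (Python here, with made-up function names). The recursive version needs one stack frame per element, so it gives out on a long enough list, while the loop does the same job in constant stack space.

```python
import sys


def total_recursive(values: list[int], start: int = 0) -> int:
    """Sum values[start:] using one stack frame per element."""
    if start == len(values):
        return 0
    return values[start] + total_recursive(values, start + 1)


def total_iterative(values: list[int]) -> int:
    """Same sum with a loop: constant stack use however long the list is."""
    running = 0
    for value in values:
        running += value
    return running


small = list(range(100))
assert total_recursive(small) == total_iterative(small)

# A list twice as long as the interpreter's recursion limit.
big = list(range(sys.getrecursionlimit() * 2))
try:
    total_recursive(big)
except RecursionError:
    print("recursive version ran out of stack")
print(total_iterative(big))
```

The short recursive form reads nicely, but every pending call has to be kept around until the deepest one returns, which is exactly the stored baggage.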

With paper, languages change albeit over a longer period.
Later when some one tries to read your "Dead Sea Scrolls" they have to
apply their own interpretation.
Not file formats, *media* formats. The subject you raised was:

Will you be able to read any of the current media in the equipment
that will be used several generations down the road? For example,
will your great-great-grandchildren be able to view the photos of
their ancestors without having to find some obscure data retrieval
company to take those ancient CDs or DVDs, view the photos, and
store them on whatever the media of the day is?

Actually there is probably a good chance of that with CDs as they are
so universally used, just as vinyl records. I have a turntable for
those. OTOH I seriously doubt that I'd be able to retrieve anything
off that stack of 8" floppies in the basement, even if the data is any
good. They were last used 24 years ago.

As it happens I do have a pair of 8" drives and the computer to run
them, BUT it's also been 24 years since that computer was last turned
on. Do you suppose the PROMS are still functioning properly?
If you just read the paragraph you're responding to, it says so quite
clearly: "as media changes constantly".
Each has its own limitations. However at the current state-of-the-art
it's likely the photos will have lifetimes longer than the individual
digital media unless refreshed within the proper limits for the media
AND converted to modern media as it comes on the scene.

In this case I'd view the digital data no different than back-ups.
Back-ups are not considered archival. They are temporary and
refreshed on regular intervals.

Unless things have changed very recently, corporate America uses
optical storage for their archiving and those archives are *refreshed*
on a scheduled basis. Hence if need be the media can be changed as
technology changes.
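
A scheduled refresh only helps if the copy being refreshed is still intact, so a check along the following lines could run before each migration. It is a sketch under assumptions: the manifest is a plain dictionary of file name to SHA-256 digest, and the function names are invented for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a whole file (reads it into memory; fine for a sketch)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(archive_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to archive_dir) to its digest."""
    return {
        str(p.relative_to(archive_dir)): file_digest(p)
        for p in sorted(archive_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(archive_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Names from the manifest that are missing or no longer match."""
    problems = []
    for name, expected in manifest.items():
        candidate = archive_dir / name
        if not candidate.is_file() or file_digest(candidate) != expected:
            problems.append(name)
    return problems
```

Build the manifest when the archive is first written, keep it with the catalog, and at refresh time copy only from a source for which `changed_files` comes back empty.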

Many of my data and image files have followed suit. However, when I'm
gone, will anyone have any interest in preserving that data? If not,
and several generations later someone develops an interest in the
"old days" and their ancestors, the question becomes: will they be able
to read whatever media the data is on at that time?

With most digital media it is a question of whether future generations
will be able to read it, not whether the data will still be good. Most
likely the data will be good even if they can not read it.

Magnetic media is not considered archival because of the relative
short life of the data, not the media. The data can be refreshed and
used archivally, but it requires far more attention than optical as
far as the industry is concerned.

Backups used to be on the big tape drives, but that data had a
relatively short life. It tended to bleed through in the magnetic
sense. I don't think you will find many tapes much over 10 years old
that have none of the files corrupted. More likely most tapes will
have at least some that are corrupt.

Hard drives do not suffer from that problem but the last I heard the
lifetime for data was considered no more than 10 years.

Some one must have the URL for a site containing this information.
http://www.cd-info.com/CDIC/Technology/CD-R/Media/Longevity.html
is a good start, but the data is ...a bit dated. In particular, check
out the link to the Kodak media. However I'd like to reference a
remark made earlier in this thread alluding to the CDs being unable to
withstand much heat. The accelerated lifetime tests were run at
100 degrees C because there was so little degradation at 60 degrees
C. This was over 6 years ago! Materials have improved since then.

Please note the projected data lifetime was OVER 100 YEARS!
TDK lists theirs as 70 years. Will there be any equipment around in
70 years to read either disk?

Roger Halstead (K8RI & ARRL life member)
(N833R, S# CD-2 Worlds oldest Debonair)
www.rogerhalstead.com
 
But recursive routines have their limits too and are often over used.

The point was merely to illustrate to the poster what "elegant" and
"lateral thought" meant in the given context. It was not meant as a
comprehensive treatise on recursive programming.
With paper, languages change albeit over a longer period.

The subject matter was media durability, not content.
Will you be able to read any of the current media in the equipment
that will be used in several generations down the road.

That's one of the reasons why I found the idea of using paper to store
digitally encoded data intellectually interesting - notwithstanding
paper's low data density.
In this case I'd view the digital data no different than back-ups.
Back-ups are not considered archival. They are temporary and
refreshed on regular intervals.

Well, the Subject is archiving.
However I'd like to reference a
remark made earlier in this thread alluding to the CDs being unable to
withstand much heat. The accelerated lifetime tests were run at
100 degrees C because there was so little degradation at 60 degrees
C. This was over 6 years ago! Materials have improved since then.

Even though the temperature reference was not aimed specifically at
CDs, exposing CDs to such extremes will radically shorten their
lifetime. Since the subject is archiving, "radically shortened
lifetime" is functionally equivalent to "not surviving".

Paper's longevity also suffers when repeatedly exposed to higher
temperatures which is why I specifically chose 100 C.
Please note the projected data lifetime was OVER 100 YEARS!

The keyword being "projected" and because of that the figures change
all the time. Also, such estimates are very limited in scope. What
about all those audio CDs which "rusted" (oxidized) after only a few
years? How many other such unforeseen problems are there down the
road? Simple temperature aging test doesn't address any of that.

As I stated at the very beginning we *know* how long paper lasts, no
projection or guessing needed.

Of course, as I also stated, low data density doesn't make it a
feasible digital storage medium, although low data density is also a
strength because it makes it that much more resilient.

However, I found the thought of using a 2D bar code to store digital
data on paper interesting. I never expected that my sharing of this
curiosity would cause such a long thread... ;o)

Don.
 
The point was merely to illustrate to the poster what "elegant" and
"lateral thought" meant in the given context. It was not meant as a
comprehensive treatise on recursive programming.


The subject matter was media durability, not content.


That's one of the reasons why I found the idea of using paper to store
digitally encoded data intellectually interesting - notwithstanding
paper's low data density.


Well, the Subject is archiving.


Even though the temperature reference was not aimed specifically at
CDs, exposing CDs to such extremes will radically shorten their
lifetime. Since the subject is archiving, "radically shortened
lifetime" is functionally equivalent to "not surviving".

Paper's longevity also suffers when repeatedly exposed to higher
temperatures which is why I specifically chose 100 C.


The keyword being "projected" and because of that the figures change
all the time. Also, such estimates are very limited in scope. What
about all those audio CDs which "rusted" (oxidized) after only a few
years? How many other such unforeseen problems are there down the
road? Simple temperature aging test doesn't address any of that.

As I stated at the very beginning we *know* how long paper lasts, no
projection or guessing needed.

Of course, as I also stated, low data density doesn't make it a
feasible digital storage medium, although low data density is also a
strength because it makes it that much more resilient.

However, I found the thought of using a 2D bar code to store digital
data on paper interesting. I never expected that my sharing of this
curiosity would cause such a long thread... ;o)

Don.

I've thought along similar lines. Beethoven's Für Elise has lasted
centuries, through various media. It's a pattern that transcends what
it's recorded on.

Apart from the massive difference in scale, a 50 meg tiff (for example)
is just a pattern. We need a new, ultra compact, inherently stable
medium to record it on.
 
Apart from the massive difference in scale, a 50 meg tiff (for example)
is just a pattern. We need a new, ultra compact, inherently stable
medium to record it on.

Agree about stable. But ultra compact? As storage devices get smaller,
they're just easier to lose! Easy enough to misplace digicam cards
already.

Sort of like keyboards that are too small for fingers.

Mac
 