Using free space for more fault-tolerant file systems.


Skybuck Flying

Hello,

Here is an idea which might have some value for some people:

The file system of, for example, Microsoft Windows could use free space to
store redundant data/copies.

The free space is not used anyway so it could be used to recover from bad
sectors more easily.

The free space remains available as free space but is secretly also used as
redundant data.

In the event of a bad sector perhaps the file system can recover from it
more easily.

Perhaps this technology idea is not needed for hard disks and the current file
system, since that combination already seems pretty stable; perhaps it already
does this? But I don't think so.

However, for new technologies like SSDs, which might be more error prone, it
could be an interesting idea to design a new file system, or to modify the
existing file system, so that it can take more advantage of free space for
more redundancy...

I have read about SSDs spreading data across their chips to prevent quick wear
of the same sections, but do they also use free space for more
redundancy? (I don't think so, but I could be wrong.)

So, in case this is a new idea, it could have some value, and I thought I'd
mention it.

Of course, users can also do this manually by making multiple copies of
folders and important data.

Perhaps the file system could also be extended with an "importance tag".

Users could then "tag" certain folders as "highly important".

The more important the folder is the more redundancy it would get ;)

The system itself could have an importance of 2, so that it can survive a
single bad sector.

Small folders with "super high importance" could even receive a
redundancy of 4, 10, maybe even 100.

(Each level of redundancy means one more copy, so 100 would mean 100 copies in
total: 1 real copy and 99 copies in free space.)
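
A minimal sketch of how such an "importance tag" could translate into extra
copies (all names and numbers here are hypothetical, just to make the idea
concrete):

# Hypothetical sketch: redundancy level N means N copies in total, i.e. the
# real copy plus N - 1 spare copies written into blocks still advertised as free.
def extra_copies(importance, file_blocks, free_blocks):
    """How many spare copies of a file fit, given its importance tag."""
    wanted = max(importance - 1, 0)              # importance 2 -> 1 spare copy
    fit = free_blocks // file_blocks if file_blocks else 0
    return min(wanted, fit)                      # never promise more than the free space allows

# System files tagged with importance 2 survive a single bad sector;
# a tiny "super high importance" folder could ask for far more copies.
print(extra_copies(importance=2, file_blocks=1000, free_blocks=50000))    # -> 1
print(extra_copies(importance=100, file_blocks=10, free_blocks=50000))    # -> 99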

Bye,
Skybuck.
 
Now that the main idea/concept has been described, I shall also go into some
fairly obvious sub-details:

1. First, a word about performance: this probably shouldn't be too bad... the
system could make the redundant copies when the system is idling.

2. I also apologize to the comp.lang.c newsgroup for being off topic, but
there are a lot of C programmers in there, perhaps some with file system
skills, or perhaps newbies who want to give it a try! This is an important
topic, so I think it warrants the off-topicness ;) Don't worry, I probably
won't do it again any time soon, since good ideas like these come by
rarely ;)

And that's what I have to say about it for now.

Goodbye,
May god bless you ! ;) =D

Bye,
Bye,
Skybuck ! ;) =D
 
Skybuck Flying said:
Hello,

Here is an idea which might have some value for some people:

The file system of for example Microsoft Windows could use free space
to store redundant data/copies.

The free space is not used anyway so it could be used to recover from
bad sectors more easily.

The free space remains available as free space but is secretly also
used as redundant data.

In the event of a bad sector perhaps the file system can recover from
it more easily.

Bad sector correction these days is done within the hard drive; the OS
is not involved.

If the OS used "empty" parts of the disc for redundant data storage, it
would fill up. If there was one redundant copy of every sector, the
drive would hold only half as much.

People to whom data integrity is that important are already using
RAID arrays. What you describe would save you in the event of a bad
sector, but wouldn't save you if the entire drive died. RAID can.

Although you also want off-line backups in a separate location so that a
fire can't wipe out the drive, the RAID array, and the backups all at
once.

-- Patrick
 
Patrick Scheible said:
What you describe would save you in the event of a bad sector

It would only save you if there's enough free space to clone the
entire drive, or if the bad sector happened to be chosen for duplication
in the remaining free space.
 
John said:
It would only save you if there's enough free space to clone the
entire drive, or if the bad sector happened to be chosen for duplication
in the remaining free space.

I bet you could make use of the space.

Think "QuickPar". (This article doesn't do it justice, but you have to
start somewhere.)

http://en.wikipedia.org/wiki/Quickpar

QuickPar was proposed as a "belt and suspenders" method of storing
data on CDs. For example, you'd write 500MB worth of files, and
store an additional 200MB of parity blocks. If you took a nail and scratched
out 200MB of data on the CD, the remaining data and parity blocks could be
used to reconstitute the original files.

The PAR method is typically used on USENET, in binary groups, for pirating
movies. A movie might be chopped up into a thousand USENET postings to a
binary group. USENET servers may have poor retention, or lose some of the
postings. If the poster also injects a significant percentage of parity blocks,
recipients on the other end, downloading the movie, could download the available
eight hundred data blocks plus two hundred or more parity blocks, and get the
entire movie to show up on their desktop (as the parity blocks can be used to
replace the missing data). So the concept was popularized in pirating
circles, and most of the testing would be done there (as to what works and
what doesn't work).
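
As a toy illustration of the recovery idea (simplified to a single XOR parity
block, the trivial special case; PAR itself uses Reed-Solomon codes so that
several missing blocks can be rebuilt, not just one):

# Toy example: four data blocks plus one XOR parity block; any single
# missing data block can be rebuilt from the others plus the parity block.
from functools import reduce

def xor_blocks(blocks):
    return bytes(reduce(lambda a, b: a ^ b, byte_group)
                 for byte_group in zip(*blocks))

data = [b"USEN", b"ET p", b"arts", b"!!!!"]   # four equal-sized data blocks
parity = xor_blocks(data)                     # one recovery block

lost = 2                                      # pretend block 2 never arrived
survivors = [blk for i, blk in enumerate(data) if i != lost]
rebuilt = xor_blocks(survivors + [parity])    # XOR of the rest recovers it
assert rebuilt == data[lost]
print("recovered:", rebuilt)                  # -> b'arts'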

One of the problems with PAR was that the implementation was less than perfect.
There is a "maths" problem with the tools in popular circulation, such that
you can't always recover the data. There were some proposals on how to fix
that (some kind of sparse matrix), but I stopped following
the conversation on the subject. I did some testing, i.e. remove a block
of data, grab a parity block, and watch the tool recover the data, so it
did work in very limited testing. But there are reports from people
who have enough parity blocks but can't get the data back.

Paul
 
Think "QuickPar". (This article doesn't do it justice, but you have to
start somewhere.)

http://en.wikipedia.org/wiki/Quickpar
One of the problems with PAR was that the implementation was less than
perfect. There is a "maths" problem with the tools in popular
circulation, such that you can't always recover the data. There were
some proposals on how to fix that (some kind of sparse matrix of some
sort), but I stopped following the conversation on the subject. I did
some testing, i.e. remove a block of data, grab a parity block, and see
the tool recover the data, so it did work in very limited testing. But
there are reports from people who have enough parity blocks but can't
get the data back.

The "maths problem" was in the first version of PAR (and in the academic
paper on which it was based), which has been deprecated for years. The
problem was fixed in PAR2, which QuickPAR uses.

Specifically: the algorithm relies upon generating an (n+m) x n matrix
(where n is the number of data blocks and m is the number of recovery
blocks) with the properties that the first n rows form an identity matrix,
and that any combination of n rows forms an invertible matrix. The
original algorithm didn't always satisfy the second constraint.

The fix is to start with a Vandermonde matrix (which inherently satisfies
the second constraint) and manipulate it using equivalence-maintaining
operations (swapping columns, multiplying a column by a scalar, adding a
multiple of another column) until the first constraint is satisfied.
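
A minimal sketch of that construction (over a small prime field rather than
PAR2's actual Galois field, with the column operations expressed as
right-multiplication by the inverse of the top block, which is equivalent):

# Hypothetical illustration, not PAR2's real arithmetic: build an (n+m) x n
# Vandermonde matrix over GF(257), put it in systematic form (identity on top),
# and verify that every choice of n rows is still invertible -- the property
# the original PAR construction failed to guarantee.
from itertools import combinations

P = 257  # small prime field standing in for PAR2's GF(2^16)

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % P
             for col in zip(*B)] for row in A]

def inv_mod(M):
    """Gauss-Jordan inversion over GF(P); returns None if M is singular."""
    n = len(M)
    A = [row[:] + [1 if i == j else 0 for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] % P), None)
        if pivot is None:
            return None
        A[col], A[pivot] = A[pivot], A[col]
        scale = pow(A[col][col], -1, P)
        A[col] = [x * scale % P for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] % P:
                f = A[r][col]
                A[r] = [(x - f * y) % P for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]

n, m = 4, 3                     # 4 data blocks, 3 recovery blocks
V = [[pow(x, j, P) for j in range(n)] for x in range(1, n + m + 1)]
G = mat_mul(V, inv_mod(V[:n]))  # systematic generator: identity in the first n rows

assert all(G[i][j] == (1 if i == j else 0) for i in range(n) for j in range(n))
# Any n of the n+m rows are invertible, so any n surviving blocks reconstruct the data.
assert all(inv_mod([G[i] for i in rows]) is not None
           for rows in combinations(range(n + m), n))
print("every choice of", n, "rows out of", n + m, "is invertible")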

Apart from the maths problem (which, in practice, meant that you might
occasionally need one or two more blocks than should have been needed), there
were more significant limitations, the main ones being a limit of 255
blocks and the block size being equal to or larger than the largest file
in the set (i.e. each file was a single block).
The PAR method is typically used on USENET, in binary groups, for
pirating movies.

While the PAR/PAR2 implementation is mostly a Usenet thing, the underlying
technology (Reed-Solomon error correction) is used far more widely:
CDs (audio and data), DSL, QR codes, and RAID-6 all use it.
 
"Patrick Scheible" wrote in message
Skybuck Flying said:
Hello,

Here is an idea which might have some value for some people:

The file system of for example Microsoft Windows could use free space
to store redundant data/copies.

The free space is not used anyway so it could be used to recover from
bad sectors more easily.

The free space remains available as free space but is secretly also
used as redundant data.

In the event of a bad sector perhaps the file system can recover from
it more easily.

"
Bad sector correction these days is done within the hard drive, the OS
is not involved.
"

True, however from what I remember reading about that, the hard disk only has
some extra space which is used to recover from bad sectors, so the
entire drive/platters could not be used for that?

So there could still be some value in doing it in software ;)

"
If the OS used "empty" parts of the disc for redundant data storage, it
would fill up.
"

Not really: the redundant parts are tagged as "free space" but also tagged
as "redundant data" in case of emergency.

"
If there was one redundant copy of every sector, the
drive would hold only half as much.
"

Not all data is equally important; more important files could be made more
redundant.

The redundant copies of the least important files could be "emptied" first...

I do now see a little problem with fragmentation, but perhaps that can be
solved too ;)
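
A toy sketch of how the two tags could coexist (purely hypothetical, nothing
like how NTFS actually works): every block is free, allocated, or "redundant"
(counted as free space but holding a spare copy), and real data evicts the
spare copies of the least important files first.

# Hypothetical allocator sketch: REDUNDANT blocks are reported as free space,
# but are reclaimed (least important copies first) when real data needs room.
from enum import Enum, auto

class State(Enum):
    FREE = auto()
    REDUNDANT = auto()    # free to the user, but currently holding a spare copy
    ALLOCATED = auto()

class Block:
    def __init__(self, state=State.FREE, importance=0):
        self.state = state
        self.importance = importance   # importance tag of the file whose copy lives here

def allocate(blocks, needed):
    """Hand out 'needed' blocks, sacrificing the least important spare copies first."""
    free = [b for b in blocks if b.state is State.FREE]
    spares = sorted((b for b in blocks if b.state is State.REDUNDANT),
                    key=lambda b: b.importance)
    chosen = (free + spares)[:needed]
    if len(chosen) < needed:
        raise OSError("disk full")    # only now is the disk really out of space
    for b in chosen:
        b.state = State.ALLOCATED
    return chosen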

"
People who to whom data integrity is that important are already using
RAID arrays. What you describe would save you in the event of a bad
sector, but wouldn't save you if the entire drive died. RAID can.
"

True, there are many other ways of adding redundancy; this perhaps-new idea
could add to that, and it probably doesn't cost you a thing except for some
extra software ;)

Though perhaps it would wear the hard disk out a little bit sooner (probably
insignificant for HDs), and perhaps an SSD might also wear a lot faster (could
be an issue).

"
Although you also want off-line backups in a separate location so that a
fire can't wipe out the drive, the RAID array, and the backups all at
once.
"

Yeah, multiple places, even a data safe ! ;) :)

Bye,
Skybuck.
 
Skybuck Flying wrote (and cross-posted all over the map!):
"Patrick Scheible" wrote in message

That's an old idea from optical storage technology. See, FYI,
http://www.securdisc.net/eng/how-to-secure.html:

"Security problem #3:
How do I retrieve data from damaged discs?
I want to be able to retrieve my files if a disc is accidentally damaged.

SecurDisc Solution:
Data Reliability
After you've copied all your files onto a disc, SecurDisc uses the empty
space to add redundant and checksum data. This significantly increases
the chances of your files being retrieved, even if the disc itself is
damaged."


Seems appropriate for optical discs, but rather half-baked in regard to
most hard drives (performance and capacity and management issues). It
could maybe have potential for portable hard drives (but they'll all be
SSDs soon anyway), and what is on a portable drive that isn't duplicated
elsewhere? Indeed, many portable drives are the duplicate! It just
doesn't seem to make sense for hard drives, if only for the
aforementioned reasons.
 
Interesting points; they can be used to make the system even more
interesting.

When the system has problems reading certain sectors, the system itself
could mark those as "unreadable".

However, care should be taken to make sure they really remain "unreadable";
strange electrical problems could perhaps cause "temporary unreadability".

So sometimes the system should briefly try again, and perhaps re-mark the
sector with a certain reliability percentage.

This way, by counting the number of bad sectors, it becomes easier to
indicate to the user that the hard disk is slowly failing.

Currently such features rely on "SMART" which might not be enabled on all
systems.

For example, in my BIOS, SMART is off by default?!?

I'm not sure why it's off... maybe it can cause hardware problems, or maybe
it's not supported by Windows.

Whatever the case may be... a software-based system would work around any
hardware issues or incompatibilities and might make it easier to report
pending failures.

This gives the user some more time to remedy the problems. One unrecoverable
sector is already enough to cause major headaches... so in the event of a
pending failure, every possible technology which could prevent a single
sector from failing would definitely be worth it.
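
A rough sketch of the kind of bookkeeping meant here (all scores and
thresholds are invented for illustration): each sector gets a reliability
score that drops on failed reads, recovers slightly on later successful
retries, and the user is warned once too many sectors look bad.

# Hypothetical per-sector reliability tracking (scores and thresholds made up).
reliability = {}                      # sector number -> score in [0.0, 1.0]

def record_read(sector, ok):
    score = reliability.get(sector, 1.0)
    if ok:
        score = min(1.0, score + 0.1)       # a later successful retry restores some trust
    else:
        score = max(0.0, score - 0.5)       # a failed read costs a lot of trust
    reliability[sector] = score
    return score

def disk_looks_like_failing(bad_threshold=0.5, max_bad_sectors=10):
    bad = sum(1 for score in reliability.values() if score < bad_threshold)
    return bad > max_bad_sectors            # time to warn the user and back up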

And I totally agree with you, as soon as the disk starts failing it should
be replaced.

In no way is this system idea meant as a long-term solution; it's mostly meant
to allow recovery of data from bad hard disks before they totally fail.

So this system idea gives more time to move data off the disk to something
else.

Bye,
Skybuck.
 
However, another interesting point could be that, with this system, a failing
drive could still be used as a sort of "second class" hard disk.

The hard disk is then not used for anything important; it's just used to speed
up reading...

As long as the hard disk is working, there is no real reason to ditch it...
it could still be useful for reading.

As long as the real data is backed up/moved to a new hard disk by the user.

So this system idea could also keep "unreliable disks" somewhat
useful! ;)

Until it totally fails or causes hardware problems, like perhaps hard disk
intercommunication/bus/protocol problems or something.

^ Risky though... it might be difficult to diagnose until it's physically
disconnected; it could also be a rare electrical situation.

Bye,
Skybuck.
 
Not really,

I already thought of the optical storage technique a long time ago.

Perhaps the product was based on that idea.

The hard disk idea is new, because a hard disk at least works differently
compared to "read-only discs".

Therefore a hard-disk-based system can do much more than just a disc.

It's more of a dynamic idea.

Bye,
Skybuck.
 