Jonathan de Boyne Pollard
JdeBP> LEAN is relatively uncomplicated and thus simpler than
JdeBP> many mainstream filesystem formats (especially it is a lot
JdeBP> simpler than FAT) to implement a full-featured FSD for.
JdeBP> But because it is badly designed it isn't a format that I'd
JdeBP> recommend for actual *use*.
BDL> With out starting a flame war or the such, why is it that
BDL> you think LEAN is "badly designed"?
I've meant to do a detailed write-up on LEAN for a while, now, and
haven't had the time to get around to it. I still haven't. (-: So the
following is cursory.
Several of the design problems that make it a poor choice for actual use
stem from the design not having learned from history. Henry Spencer's famous
saying applies to filesystem format design just as much as it does to
Unix. Those who don't learn from history are condemned to reinvent it
poorly. There are reasons that modern filesystem formats don't use
linked lists of clusters to record allocations, and use records of
extents instead. There are reasons that modern filesystem designs don't
put superblocks or similar frequently-used primary data structures at
fixed positions at the beginning of the volume, but place them in the
middle of the volume and allow them to be variably positioned. There
are reasons that modern filesystem formats use magic numbers for more
than just superblocks. There are reasons that modern i-node structures
have *four* timestamps, not three. There are reasons (ironically, a
path of deduction that has just been re-trod in this very news group)
that modern filesystem formats use trees, usually B-trees, for
directories, not unsorted linked lists. There are reasons that modern
filesystem formats don't use 32-bit timestamp fields. There are reasons
that modern filesystem formats do not require free space bitmaps to be
contiguous, but in fact encourage them to be discontiguous. There are
reasons that modern filesystem formats permit free space bitmaps to grow
and to shrink without reformatting the volume. The design of LEAN hasn't
learned from history in any of these areas.
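
To illustrate the first of those points, here is a minimal Python sketch contrasting the two ways of recording allocations. It is not LEAN's (or any real format's) layout: the extent tuple `(first_logical, first_physical, length)`, the `next_cluster` mapping, and both function names are assumptions made up for the example. With a cluster chain, reaching cluster N of a file costs N hops; with a sorted list of extents, a binary search suffices.

```python
from bisect import bisect_right

def extent_lookup(extents, logical):
    """Map a logical cluster number to a physical one using extent records.

    `extents` is a list of hypothetical (first_logical, first_physical, length)
    tuples, sorted by first_logical. A real driver would keep the `starts`
    index cached rather than rebuild it on every call.
    """
    starts = [first_logical for first_logical, _, _ in extents]
    i = bisect_right(starts, logical) - 1
    if i < 0:
        raise IndexError("logical cluster before start of file")
    first_logical, first_physical, length = extents[i]
    if logical >= first_logical + length:
        raise IndexError("logical cluster beyond end of file")
    return first_physical + (logical - first_logical)

def chain_lookup(next_cluster, first, logical):
    """Map a logical cluster number by following a linked list of clusters.

    `next_cluster` is a hypothetical mapping from a cluster to its successor.
    Every step of the loop stands for one lookup in the chain.
    """
    cluster = first
    for _ in range(logical):
        cluster = next_cluster[cluster]
    return cluster

# A 28-cluster file held in three extents.
extents = [(0, 1000, 8), (8, 5000, 4), (12, 200, 16)]
print(extent_lookup(extents, 9))
```

The extent list describes 28 clusters with three records, and the cost of a lookup grows with the number of fragments rather than with the size of the file.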
Other poor design aspects of LEAN:
Much of what is in the superblock duplicates what is in the BIOS
Parameter Block[1]. That duplication is wholly unnecessary. The design
should let the BPB do its job. The design documentation is also mute on
the subject of BIOS parameter blocks. Modern filesystem formats should
at least specify what the filesystem type field of BPBs should be for
the format[2], and what BPB fields have defined meanings for the format.
LEAN has no provision at all for extended attributes. This is one
design aspect that doesn't merely make LEAN a poor choice for actual
use. It actually _eliminates it_ as a choice for actual use. For
practical use, extended attribute support is a requirement in a
filesystem format, and has been since at least 1990.
LEAN has no provision for several POSIX permission bits, such as a
"sticky" bit.
LEAN has no provision for several Windows file/directory attributes,
such as the "O" attribute.
LEAN has no provision for POSIX special files.
LEAN has no provision for ACLs.
LEAN disallows filename characters less than 0x20, whilst _not_
disallowing the slash character. This is a truly bizarre design choice
(and is a consequence of the poor choice of how to find free space in a
directory for a new entry by maltreating a linked list as if it were an
array). Directories should not be unsorted linked lists in the first
place, and the on-disc data structure should not disallow any characters
in names at all. (The names in directory entries are, after all, length
counted on disc. There's no reason to even require that NUL be treated
specially.) What characters are disallowed in names should be a
function of the operating system, not of the on-disc data structures.
The on-disc data structures should be neutral with respect to operating
systems.
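
As a rough illustration of why length-counted names need no forbidden characters, here is a small Python sketch. The entry layout (a 16-bit little-endian length followed by the name bytes) and the names `pack_name` and `unpack_name` are assumptions invented for the example, not LEAN's actual directory entry format.

```python
import struct

def pack_name(name: bytes) -> bytes:
    """Store a name as a 16-bit little-endian length followed by its bytes."""
    return struct.pack("<H", len(name)) + name

def unpack_name(buf: bytes, offset: int = 0):
    """Read one length-counted name; return (name, offset of the next entry)."""
    (length,) = struct.unpack_from("<H", buf, offset)
    start = offset + 2
    name = buf[start:start + length]
    if len(name) != length:
        raise ValueError("truncated entry")
    return name, start + length

# A name containing a slash and a NUL round-trips without special treatment.
buf = pack_name(b"a/b\x00c") + pack_name(b"x")
first, next_offset = unpack_name(buf)
second, _ = unpack_name(buf, next_offset)
print(first, second)
```

Nothing in the parser needs to inspect the name bytes at all, so any restriction on them is a policy that the operating system can apply above this layer.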
LEAN specifies two distinct "." and ".." entries in a directory. These
are superfluous, even for recovery purposes. For recovery purposes, a
simple parent link suffices. There is no need to explicitly have these
two entries in the on-disc data structures at all. Contrast the unified
"start" entry in HPFS directories.
Some minor points:
The efficiency analysis of traversing the cluster chain in the LEAN
documentation is wrong. It ignores the fact that whilst it is possible
to skip entire "indirect clusters", it is still necessary to _read_ them.
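
The point about reads can be sketched in Python under a simplified model. The model is an assumption, not the layout given in the LEAN documentation: each "indirect cluster" is taken to hold a list of data cluster addresses plus a link to the next indirect cluster, and the names `IndirectCluster` and `resolve` are hypothetical.

```python
class IndirectCluster:
    """Hypothetical indirect cluster: some addresses and a link to the next one."""
    def __init__(self, addresses, next_indirect=None):
        self.addresses = addresses
        self.next_indirect = next_indirect

def resolve(first, logical):
    """Return (physical address, number of indirect clusters read)."""
    reads = 0
    node = first
    while node is not None:
        # The cluster must be read from disc before its entry count and
        # its link to the next indirect cluster are known, even if none
        # of its addresses is the one wanted.
        reads += 1
        if logical < len(node.addresses):
            return node.addresses[logical], reads
        logical -= len(node.addresses)
        node = node.next_indirect
    raise IndexError("logical cluster beyond end of file")

third = IndirectCluster([30, 31, 32, 33])
second = IndirectCluster([20, 21, 22, 23], third)
first = IndirectCluster([10, 11, 12, 13], second)
print(resolve(first, 9))
```

In this model the first two indirect clusters are "skipped" in the sense that their address lists are not scanned, but each still costs a disc read, so the number of reads grows linearly with the position in the file.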
The analysis of undeletion in the LEAN documentation is, to be
charitable, grossly superficial. It glosses over the possibility that
the space used to store the file has been re-used for other purposes.
LEAN itself provides no simple means to detect that an "indirect
cluster" has been overwritten with, say, ordinary file data in the
meantime. It doesn't even provide a means for checking that the i-node
hasn't been overwritten. An undeletion tool thus has no way to reliably
determine whether it will be able to get the actual file back. (Such
considerations are one of the several reasons that modern filesystems
use magic numbers for more than just superblocks, as mentioned above.)
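
A minimal Python sketch of the kind of check that per-structure magic numbers make possible follows. The magic values `INODE_MAGIC` and `INDIRECT_MAGIC`, their position at the start of a block, and the function names are all assumptions for illustration; LEAN as described above defines no such fields.

```python
INODE_MAGIC = b"INOD"     # hypothetical magic number for an i-node
INDIRECT_MAGIC = b"INDX"  # hypothetical magic number for an indirect cluster

def has_magic(block: bytes, magic: bytes) -> bool:
    """True if the block starts with the expected magic number."""
    return block[:len(magic)] == magic

def undelete_is_plausible(inode_block: bytes, indirect_blocks) -> bool:
    """Reject an undeletion attempt if any metadata block has lost its magic.

    A passing check is necessary but not sufficient: it says nothing about
    whether the data clusters themselves have been re-used.
    """
    if not has_magic(inode_block, INODE_MAGIC):
        return False
    return all(has_magic(b, INDIRECT_MAGIC) for b in indirect_blocks)

inode = INODE_MAGIC + bytes(28)
indirect = INDIRECT_MAGIC + bytes(28)
overwritten = b"ordinary file data that re-used the cluster"
print(undelete_is_plausible(inode, [indirect, overwritten]))
```

Without some such marker, an undeletion tool has nothing to test, and has to treat whatever bytes it finds as if they were still valid metadata.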
One more point:
The LEAN documentation says "It is said that programmers like the FAT
file system for its simplicity.". That's not said by people who have
actually listened to programmers. What is in fact more widely said, by
those that listen to programmers, is that programmers *despise* the FAT
filesystem format. It is complex, inconsistent, replete with bodges and
optional/variant features (requiring multiple code paths), and hard to
implement full-function filesystem drivers for. (Ironically, from the
point of view of a programmer, HPFS is a lot *easier* to implement a
full-function FSD for than FAT is.) Programmers have been despising FAT
for years. Crank up Google Groups, and you'll find programmers railing
against FAT since the 1980s.
[1]
<URL:http://homepages.tesco.net./~J.deBoynePollard/FGA/bios-parameter-block.html>
[2]
<URL:http://homepages.tesco.net./~J.deBoynePollard/FGA/determining-filesystem-type.html>