disk compression?

  • Thread starter: andrea catto'
guys, I want to understand how the transparent disk compression works,
I already know it works as a compression layer between the OS and the
cluster/sector access.

but I am confused,

even if a cluster, chunk, or block can be shrunk to less than its
original size, there will be unused space at the end of such a unit.

compression would only be effective if that unused space were somehow
reused, but how?
I don't get it.

please help me out
 
thank you, I got it now. I partially figured it out by myself after a lot of
speculation, and partially confirmed it with your article.

I haven't followed low-level disk management since the MS-DOS days, so I'd
forgotten that clusters are managed as independent units that are
individually handled by the OS.

the OS, at least with FAT, has a representation of each cluster in the form
of an entry (a pointer) within the FAT, so the number of clusters on a disk
equals the number of entries in the FAT (each entry being a few bytes,
depending on the FAT variant).

basically, since 99% of apps deal with files, not with the low level, they
neither know nor care how files are distributed across the disk.
the OS already jumps from place A to place B whenever it needs to reach the
next data chunk of a file.
to the application this is totally transparent; it only deals with a
(virtual) sequential file pointer/position.
because of this abstraction layer, compression is a piece of cake to
implement.
instead of writing the data to a cluster as demanded, the OS takes an
extra step and 'tries' compressing it first; if the result is reasonably
smaller, more data fits per cluster-unit, reducing the odds of needing a
new cluster allocation from the OS.
that's much clearer now.
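that extra step can be sketched like this (a toy illustration using zlib as a stand-in codec; the cluster size and the keep-it-only-if-smaller rule are my assumptions, not any real driver's logic):

```python
import math
import zlib

CLUSTER_SIZE = 4096   # assumed cluster size in bytes

def clusters_needed(data: bytes, compress: bool = True) -> int:
    """Return how many clusters a write would consume."""
    if compress:
        packed = zlib.compress(data)
        if len(packed) < len(data):   # keep it only if it actually shrank
            data = packed
    return max(1, math.ceil(len(data) / CLUSTER_SIZE))

text = b"abc" * 10_000                          # highly compressible payload
print(clusters_needed(text, compress=False))    # → 8 clusters uncompressed
print(clusters_needed(text, compress=True))     # → 1 cluster compressed
```

so nothing about the cluster layout changes; the driver just allocates fewer clusters in the first place.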

I am hoping this explains the idea to others too since I like sharing.

1st, highest-level layer) the seamless and transparent file management, which
allows dealing with virtually sequential file pointers/access.
2nd layer) the OS file management, which decides when it's time to 'skip' to
another cluster-unit so the progressive file pointer stays seamless.
3rd layer) (ONLY IF COMPRESSION IS SET) try to compress as much as possible
before writing, to fit more in a cluster-unit.
4th layer) (as suggested) caching.
5th layer) almost lowest-level disk coordination regardless of the geometry
(which I recall was int 25h/int 26h in DOS).
6th layer) lowest-level disk geometry coordination considering
head/disk/clusters etc... (which I recall was int 13h in DOS).
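the layer stack can be sketched as a chain of functions, each one only talking to the layer below it (all names here are invented for illustration, and the caching layer is skipped to keep it short):

```python
import zlib

DISK = {}                                     # layer 6: the raw medium

def device_write(cluster, block):             # layers 5/6: sector-level I/O
    DISK[cluster] = block

def compressed_write(cluster, block):         # layer 3: optional compression
    packed = zlib.compress(block)
    # keep the compressed form only if it actually got smaller
    device_write(cluster, packed if len(packed) < len(block) else block)

def fs_write(clusters, data, cluster_size=4096):   # layer 2: cluster bookkeeping
    # chop the byte stream into cluster-sized chunks and place each one
    for i, cluster in enumerate(clusters):
        chunk = data[i * cluster_size:(i + 1) * cluster_size]
        compressed_write(cluster, chunk)

# layer 1: the app just writes a sequential byte stream
fs_write([2, 5, 9], b"x" * 12_000)
print(sorted(DISK))   # → [2, 5, 9], the clusters the file landed in
```

each layer is swappable: remove `compressed_write` from the chain and nothing above or below it notices, which is exactly why the compression is transparent.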

does this make sense to you all ?
 