Striping data across platters in a single hard disk

  • Thread starter: tony
The little lost angel wrote:
Guys just like to do incomprehensible things. :P
And women don't? I know someone who recently bought a /really/
ugly $400 purse from some brain-dead designer to replace a
beautiful locally made leather purse that probably cost one tenth
as much. All she could offer in the way of explanation was "I
just had to have a Gucci."
 
The little lost angel wrote:
Guys just like to do incomprehensible things. :P

And women don't? I know someone who recently bought a /really/
ugly $400 purse from some brain-dead designer to replace a
beautiful locally made leather purse that probably cost one tenth
as much. All she could offer in the way of explanation was "I
just had to have a Gucci."

I'll risk a limb and say... that's cos she has bad taste!!! There's
some REALLY ugly LV/Gucci/name-your-fave-luxury-brand stuff that I
wouldn't carry even if you paid me to. But there are those that...
:PpPp
 
tony said:
Nah. There's a lot of those deployments in both SOHO
and residential arenas. Sometimes a "server peer" (not a
"real server running a server OS) acts as a "central peer"
(poor man's server) where server-like things happen (file
serving, backup of clients).

Sure, in the SOHO arena this (and gigabit) might make sense.
If there's enough admin talent available.
Not really. Just another node on the LAN accepting huge
streams of data from other nodes.

Just what do you think a [backup] server is? It's nothing
more than a PC. Usually dedicated, and often powerful in
particular ways.
Seek times on a desktop aren't really significant. That

Do you have data for this assertion?
has more effect on transactional servers that are dishing
out records from databases.

And any server/PC that has more than one active disk task.
Throughput is what counts on desktops (loading huge graphic
files,

Graphics files might be stored contiguously if the user defrags.
backing up, booting in 10 seconds ...).

Backing up is seldom done image-wise, and is usually done
file/directory-wise. Lots of seeks. Booting the kernel is
fast, but running the start-up tasks (including network timeouts)
is slow and has lots of seeks for all the little files involved.

-- Robert
 
Robert Redelmeier said:
tony said:
Nah. There's a lot of those deployments in both SOHO
and residential arenas. Sometimes a "server peer" (not a
"real server running a server OS) acts as a "central peer"
(poor man's server) where server-like things happen (file
serving, backup of clients).

Sure, in the SOHO arena this (and gigabit) might make sense.
If there's enough admin talent available.
Not really. Just another node on the LAN accepting huge
streams of data from other nodes.

Just what do you think a [backup] server is? It's nothing
more than a PC. Usually dedicated, and often powerful in
particular ways.

Your point was that the server was some kind of transactional
beast in contention. A lot of these P2P LAN backup "servers"
are just another workstation and it doesn't necessarily have
to be one node that receives all the backups from all of the
other peers. To me, "server" means a lot more than just
this light duty type stuff (especially when it's just another
desktop in use by a user).
Do you have data for this assertion?

Nope. It seems pretty obvious to me (unless you are going to
take it to the extreme and say it takes seconds to seek). A
typical desktop machine doesn't see high transactional
loading like a database server would (obviously).
And any server/PC that has more than one active disk task.

Trivial, performance-wise. User won't see any difference.
Graphics files might be stored contiguously if the user defrags.

A maintained PC is assumed. If not maintained, then a slowdown
is a good warning sign for the user to get a clue or get some
help.
Backing up is seldom done image-wise,

Maybe where you live. Disk imaging is good protection for fat
clients (I wouldn't live without it).
and is usually done
file/directory-wise.

Data, yes. But data is only half of the story.
Lots of seeks.

Not significant, performance-wise in comparison to throughput
(no data, but you're welcome to perform your own tests and come
back with the data).
Booting the kernel is
fast, but running the start-up tasks (including network timeouts)
is slow and has lots of seeks for all the little files involved.

Throughput is most important on a desktop (and non-transactional
servers also). NCQ probably helps the highly loaded transactional
servers more than improved seek times (some seeks eliminated!). Many,
many servers are way overkill already as far as transactions go, but
can easily get bogged down by the large, hauling-type workloads. TPC-C
benchmarks and the like make for better headlines, though. If you're
not Yahoo.com or similar, you're probably fine in that department.

Tony
 
Robert Redelmeier said:
Parallelizing heads could be done, but at considerable
cost, and only yield bandwidth gains that serious users
get via RAID anyways. It would not help (and likely hurt)
latency, which is at least as big a performance bottleneck.
Actually, about 10 years or so back, Seagate (if I remember correctly) used to make dual-active-head
drives, which let them claim the highest head transfer rate at the time. However, there were at least
two problems. One was that the read/write channel electronics had to be doubled instead of using a
single-chip switch, so the cost was high. The second was a significant loss of platter capacity: any
defective track on one platter automatically meant that the good track on another platter had to be
discarded as well and re-assigned. In the end, Seagate marketing was shy about advertising the
dual-head feature, and with the next generation of higher-density technology the idea died of
natural causes.

Cheers,
- aap
 
tony said:
beast in contention. A lot of these P2P LAN backup "servers"
are just another workstation and it doesn't necessarily

So long as that workstation is being used by someone else,
it _is_ in contention.
Nope. It seems pretty obvious to me (unless you are going to
take it to the extreme and say it takes seconds to seek). A

Actually, seeks are obviously significant: A disk takes
8.3 ms to make one revolution at 7200 rpm, and 6 ms at 10K.
Track-to-track seek times are around 2.0 ms, so seek is 20%
at least, even for huge unbroken files. On a random basis,
seek is ~12 ms, during which time half a MB could have been
transferred on your hypothetical 40 MB/s disk (that's sustained;
the actual head rate is higher). How many reads are as big as
half a MB? 4-8 KB is a more likely average, especially with
memory-mapped libs.
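
(Plugging the figures above into a quick Python back-of-the-envelope; the
12 ms seek, half-revolution rotational latency and 40 MB/s sustained rate
are the numbers from this post, and the read sizes are just illustrative:)

# Effective throughput when every read pays a seek plus half a
# revolution of rotational latency (assumptions, not measurements).
seek_ms       = 12.0        # average random seek
rotation_ms   = 8.3 / 2     # half a revolution at 7200 rpm
sustained_mbs = 40.0        # hypothetical sustained transfer rate

def effective_mbs(read_kb):
    transfer_ms = read_kb / 1024.0 / sustained_mbs * 1000.0
    total_ms = seek_ms + rotation_ms + transfer_ms
    return (read_kb / 1024.0) / (total_ms / 1000.0)

for kb in (4, 8, 512):
    print(f"{kb:>4} KB reads -> {effective_mbs(kb):5.2f} MB/s")
# 4 KB reads come out around 0.25 MB/s; even half-MB reads only see
# roughly 17 MB/s of the 40 MB/s the platters could deliver.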

I believe disk mfrs have done the right thing, fast flexible
combs that minimize seek time rather than heavy parallel
combs that might maximize xfr rate.
typical desktop machine doesn't see high transactional
loading like a database server would (obviously).

That depends on what it is doing. If it is reading lots of
little bits and libraries from files, then it's no different
from TPM.
Maybe where you live. Disk imaging is good protection for
fat clients (I wouldn't live without it).

This sounds more like enterprise environment than SOHO.
Data, yes. But data is only half of the story.

Much less, volume-wise. Much more, value-wise. When I can
successfully back up configs, I won't back up OSes & apps.
Too easy to reinstall and purify.
NCQ probably helps the highly loaded transactional servers
more than improved seek times (some seeks eliminated!).

I don't know what NCQ will give on a single-drive channel
that a good OS isn't already doing: buffering requests in
an elevator.
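
(The "elevator" is just the OS sorting queued block requests so the head
makes one sweep across the disk instead of ping-ponging between them; a toy
Python sketch of the idea, with block numbers standing in for head position
rather than any real scheduler's data structures:)

# Toy elevator (SCAN) ordering of queued block requests. A real
# scheduler works on device block addresses and also merges
# adjacent requests; this only shows the ordering idea.
def elevator_order(pending, head):
    ahead  = sorted(b for b in pending if b >= head)                  # on the way up
    behind = sorted((b for b in pending if b < head), reverse=True)   # on the way back
    return ahead + behind

print(elevator_order([900, 120, 560, 130, 870, 400], head=500))
# -> [560, 870, 900, 400, 130, 120]: one sweep up, then back down.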

-- Robert
 
Robert Redelmeier said:
So long as that workstation is being used by someone else,
it _is_ in contention.

Not in the given scenario (nightly backup/imaging).
Actually, seeks are obviously significant: A disk takes
8.3 ms to make one revolution at 7200 rpm, and 6 ms at 10K.
Track-to-track seek times are around 2.0 ms, so seek is 20%
at least, even for huge unbroken files. On a random basis,
seek is ~12 ms, during which time half a MB could have been
transferred on your hypothetical 40 MB/s disk (that's sustained;
the actual head rate is higher). How many reads are as big as
half a MB? 4-8 KB is a more likely average, especially with
memory-mapped libs.

Or as in the given scenario, 200 MB disk images. If the disk
were overloaded with requests (database server) you'd have a
point. For a user on a desktop PC, or for nightly backups, seek times
are nothing to worry about.
I believe disk mfrs have done the right thing, fast flexible
combs that minimize seek time rather than heavy parallel
combs that might maximize xfr rate.

The issue is throughput though. Can you halve the time to
transfer a file across the LAN by working on seek times?
(no).

And now folks, I'm done talking about seek times (LOL,
I go to the store to buy some eggs and someone wants me
to have cheese instead!).
That depends on what it is doing. If it is reading lots of
little bits and libraries from files, then it's no different
from TPM.

Off topic.
This sounds more like enterprise environment than SOHO.

Or home users. And I should have used SMB rather than SOHO
(both).
Much less, volume-wise. Much more, value-wise. When I can
successfully back up configs, I won't back up OSes & apps.
Too easy to reinstall and purify.

If a single PC home user loses his software configuration (corrupts
and must reinstall), you're looking at 8 hours or more of setup.
I know many people who don't have one iota of data that is
"critical", but those people would cringe at having to set up their
PC again (especially if they have to pay someone to do it).

No one was arguing the value of data backup. The statement I made
was that data is only half of the story. Your comment "easy to reinstall"
is pretty much wrong. While it can be done, who wants
to spend hours doing it? (And like I said, many users don't have the
knowledge to get connected to their ISP, set up email, etc.)
I don't know what NCQ will give on a single-drive channel
that a good OS isn't already doing: buffering requests in
an elevator.

I'll refer you to the hard drive manufacturer sites to get the lowdown
on their promises for NCQ (sounds reasonable though that it should
provide some improvement).

All in all, I don't like going round and round on issues that are off
topic. "On topic" here is anything that would allow a hard drive to utilize
the capability of GbE soon. If you have a concept, please do give
it, as maybe someone will clue in a disk vendor (while I think they
"have a clue", sometimes I'm not so sure; it took many years to
"get a clue" on the multicore stuff too, remember). I'm just wondering
if hard drive performance can be more than it is, and quite easily
too. Perhaps the days of rotating magnetic media drives are numbered
already and it's not worth developing the technology further... I dunno!
(It never hurts to ask though!)

Tony
 
tony said:
Or as in the given scenario, 200 MB disk images.

Huh? I've got single datafiles much bigger than this!
More like 200 GB, but most of that will be empty unless
people do multi-media.
If the disk were overloaded with requests (database server)
you'd have a point. For a user on a desktop PC, or for nightly backups,
seek times are nothing to worry about.

For overnight backups, nothing much matters. I have downloaded
100 MB files over dialup by letting it run overnight.
Nothing much matters until crunch time.
The issue is throughput though. Can you halve the time to
transfer a file across the LAN by working on seek times?
(no).

Actually, sometimes. Things like NOATIME are very important
for news server performance.
If a single PC home user loses his software configuration
(corrupts and must reinstall), you're looking at 8 hours or
more of setup.

When I ran MS-Windows at home, I never had to reinstall.
Of course, the systems took heavy maintenance. I just finished
a wipe&upgrade to Linux Slackware. It took one hour (mostly
unattended) including setup and I lost zero data. And it
was on two disks. MS-WinXP is on one CD, so I'd expect it
to take less. Of course, setting configs can take a long time
if you don't have notes or remember them.
I know many people who don't have one iota of data that is
"critical", but those people would cringe at having to set up
their PC again (especially if they have to pay someone [snip]
it can be done, who wants to spend hours doing it? (And
like I said, many users don't have the knowledge to get

I thought we were talking about people with some skills.

-- Robert
 
Robert Redelmeier said:
Huh? I've got single datafiles much bigger than this!
More like 200 GB, but most of that will be empty unless
people do multi-media.

Typo obviously.
For overnight backups, nothing much matters. I have downloaded
100 MB files over dialup by letting it run overnight.

You'd be able to send twice as many client images to a server
overnight if the drives were twice as fast.

But this dialog is silly. Are you arguing against faster drives
or what? Perhaps you'd like slower (less throughput) ones?
Actually, sometimes. Things like NOATIME are very important
for news server performance.


When I ran MS-Windows at home, I never had to reinstall.
Of course, the systems took heavy maintenance. I just finished
a wipe&upgrade to Linux Slackware. It took one hour (mostly
unattended) including setup and I lost zero data. And it
was on two disks. MS-WinXP is on one CD, so I'd expect it
to take less. Of course, setting configs can take a long time
if you don't have notes or remember them.

Even if you have notes, it will take a long time. Of course the
initial OS load is fast (about 15 mins for WinXP on my machine).
But the typical user won't even have slipstreamed XP or Office
disks and will have to reapply all those patches. Ouch! That could take
hours by itself (might get lucky and be at the patch "rollup" time).
I know many people who don't have one iota of data that is
"critical", but those people would cringe at having to set up
their PC again (especially if they have to pay someone [snip]
it can be done, who wants to spend hours doing it? (And
like I said, many users don't have the knowledge to get

I thought we were talking about people with some skills.

Why did you think that? Either way, the point was that backing up
only data is a job only half done, IMO.

Tony
 
Typo obviously.

One task which *would* benefit from a faster transfer rate is rebuilding a
disk mirror - we do a hot swap once a week and the 45 mins to rebuild 80 GB
is a royal PITA.
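
(For what it's worth, 80 GB in 45 minutes works out to roughly 30 MB/s,
not far below the ~40 MB/s sustained figure used earlier in the thread;
quick check in Python:)

# Implied rate for the 80 GB / 45 minute mirror rebuild above.
gb, minutes = 80, 45
print(f"{gb * 1024 / (minutes * 60):.0f} MB/s")   # ~30 MB/s
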
You'd be able to send twice as many client images to a server
overnight if the drives were twice as fast.

But this dialog is silly. Are you arguing against faster drives
or what? Perhaps you'd like slower (less throughput) ones?


Even if you have notes, it will take a long time. Of course the
initial OS load is fast (about 15 mins for WinXP on my machine).
But the typical user won't even have slipstreamed XP or Office
disks and will have to reapply all those patches. Ouch! That could take
hours by itself (might get lucky and be at the patch "rollup" time).

Yes, it's a lot more than an hour - I just did a Win2K Server install last
week and it was more than 8 hours by the time I got past all the usual
glitches. The most aggravating was the Windows AutoUpdate after applying
SP4: "... failed to download due to regulation"<GRRR>. Then there was all
the software, including Roxio EMC<grrr> and finally when I got SQL Server
2000 installed, it picked up the slammer worm (surprised that thing is
still in the wild) within an hour -- took me longer to figure out what it was
-- and the flood killed my Internet connection.
 
tony said:
But this dialog is silly. Are you arguing against faster drives
or what? Perhaps you'd like slower (less throughput) ones?

No. I'm just saying that transfer rate isn't the only
performance factor, nor necessarily the most important.
Even if you have notes, it will take a long time. Of
course the initial OS load is fast (about 15 mins for
WinXP on my machine). But the typical user won't even
have slipstreamed XP or Office disks and will have to reapply
all those patches. Ouch! That could take hours by itself
(might get lucky and be at the patch "rollup" time).

I see your point, but Linux distros are upgraded far more
frequently (easy to catch a rollup), and after that there
may only be one or two patches to apply.

-- Robert
 
George Macdonald said:
One task which *would* benefit from a faster transfer rate
is rebuilding a disk mirror - we do a hot swap once a week
and the 45 mins to rebuild 80 GB is a royal PITA.

Breaktime! ... but yes, this is a case where xfr matters
if it is done image-wise.
Yes, it's a lot more than an hour - I just did a Win2K Server
install last week and it was more than 8 hours by the time I
got past all the usual glitches. The most aggravating was the
Windows AutoUpdate after applying SP4: "... failed to download
due to regulation"<GRRR>. Then there was all the software,
including Roxio EMC<grrr> and finally when I got SQL Server 2000
installed, it picked up the slammer worm (surprised that thing
is still in the wild) within an hour -- took me longer to figure out
what it was -- and the flood killed my Internet connection.

This is horrible. It seems most of the time was in the patching,
involving some "Catch-22". MS ought to be ashamed, or at least
release install disks with all patches included every quarter (month?).

-- Robert
 
George said:
Yes, it's a lot more than an hour - I just did a Win2K Server install last
week and it was more than 8 hours by the time I got past all the usual
glitches. The most aggravating was the Windows AutoUpdate after applying
SP4: "... failed to download due to regulation"<GRRR>. Then there was all
the software, including Roxio EMC<grrr> and finally when I got SQL Server
2000 installed, it picked up the slammer worm (surprised that thing is
still in the wild) within an hour -- took me longer to figure out what it was
-- and the flood killed my Internet connection.

Would you have been safer by rebuilding the system behind a firewall?
 
Robert Redelmeier said:
No. I'm just saying that transfer rate isn't the only
performance factor, nor necessarily the most important.

On the desktop (where the volume hard drive deployments are),
throughput is everything (especially now with GbE to the desktop
becoming ubiquitous).
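
(Rough numbers to put the GbE point in perspective: gigabit Ethernet is
about 125 MB/s of raw wire bandwidth, while the ~40 MB/s sustained figure
is the hypothetical one used earlier in the thread, not a measurement:)

# How much of a gigabit link one drive fills at the thread's
# hypothetical 40 MB/s sustained rate.
gbe_mbs  = 1000 / 8   # 1 Gbit/s = 125 MB/s of raw wire bandwidth
disk_mbs = 40.0
print(f"{disk_mbs / gbe_mbs:.0%} of the link")   # about 32%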

Tony
 
Would you have been safer by rebuilding the system behind a firewall?

It *is* behind a firewall - not a particularly great one, i.e. just the
filtering rules in the Netopia router, but I'd no idea that SQL Server was
such a stinking attractant for worms and other attacks.:-) No way I'd have
known to filter port 1434 before I found out about the bu... err,
vulnerability.

Oh, it wasn't really a rebuild - it was a system which had been running
Win2K Pro, but we needed to have an intranet SQL Server, so it was basically
a fresh install.
 
Breaktime! ... but yes, this is a case where xfr matters
if it is done image-wise.


This is horrible. It seems most of the time was in the patching,
involving some "Catch-22". MS ought to be ashamed, or at least
release install disks with all patches included every quarter (month?).

This failure "due to regulation" has been pestering Win2K/XP systems since
last August. If you leave the AutoUpdate turned on long enough -- several
days -- it usually starts working. I believe it's due to several reasons
but primarily to prevent unauthorized installs from *benefiting* from the
updates... the innocent get caught too.:-)

The slammer infection was kinda funny really: it was evening so everyone
had gone, and I took a glance at the network switch (24 port) in passing and
it was lit up like a Christmas tree with all its lights blinking in sync. I
thought: *WTF*... and then it was "courbons le dos" (brace yourself)... again!
 
George Macdonald said:
This failure "due to regulation" has been pestering Win2K/XP
systems since last August. If you leave the AutoUpdate turned
on long enough -- several days -- it usually starts working.
I believe it's due to several reasons but primarily to prevent
unauthorized installs from *benefiting* from the updates... the
innocent get caught too.:-)

I thought MS abandoned this spiteful policy. Partly from the
false positives, and partly from the reputation consequences.
IIRC, a very large number of unauthorized installs are done
by others [vendors], and the user is unaware.
The slammer infection was kinda funny really: it was
evening so everyone had gone, and I took a glance at the
network switch (24 port) in passing and it was lit up
like a Christmas tree with all its lights blinking in
sync. I thought: *WTF*... and then it was "courbons le
dos" (brace yourself)... again!

Interesting symptom. I'm a bit surprised the worm got
through your firewall. Even the cheapest block all inbound
TCP connects, ICMP and UDP traffic.

-- Robert
 
George Macdonald said:
This failure "due to regulation" has been pestering Win2K/XP
systems since last August. If you leave the AutoUpdate turned
on long enough -- several days -- it usually starts working.
I believe it's due to several reasons but primarily to prevent
unauthorized installs from *benefiting* from the updates... the
innocent get caught too.:-)

I thought MS abandoned this spiteful policy. Partly from the
false positives, and partly from the reputation consequences.
IIRC, a very large number of unauthorized installs are done
by others [vendors], and the user is unaware.

Only a few months ago I had a Thinkpad which would not pass their
"HowToTell" (www.howtotell.com) -- couldn't even DL the ActiveX Control for
it -- and it would not allow the AutoUpdate to proceed. After a manual
update things seemed to start working again on AutoUpdate but it will still
not DL the ActiveX at HowToTell. All the security settings are correct so
I dunno - there was a flakey user who took it home for a while so it's hard
to figure but it is a legal registered copy of WinXP. Anyway, it's
possible M$ has now relaxed things but failure "due to regulation" is still
quite common and reading the WindowsUpdate.log is just painful.
Interesting symptom. I'm a bit surprised the worm got
through your firewall. Even the cheapest block all inbound
TCP connects, ICMP and UDP traffic.

Well, on the infected computer, the worm just starts broadcasting so
everything gets hit on the LAN - it's been known to essentially shut down
large corporate LANs/WANs. Our LAN is not on NAT -- something I'm
considering changing -- so the firewall is set up for port blocking,
basically everything incoming <1024 gets blocked plus a few high ports.
The worm comes in on port 1434, and if I'd known about SQL Server's multiple
vulnerabilities beforehand I'd have known to block that port too.
 
Only a few months ago I had a Thinkpad which would not pass their
"HowToTell" (www.howtotell.com) -- couldn't even DL the ActiveX Control for
it -- and it would not allow the AutoUpdate to proceed. After a manual
update things seemed to start working again on AutoUpdate but it will still
not DL the ActiveX at HowToTell. All the security settings are correct so
I dunno - there was a flakey user who took it home for a while so it's hard
to figure but it is a legal registered copy of WinXP. Anyway, it's
possible M$ has now relaxed things but failure "due to regulation" is still
quite common and reading the WindowsUpdate.log is just painful.

It is not even as if this is stopping dodgy copies of XP from getting
updates. Just disable the Genuine Advantage ActiveX control
(http://home19.inet.tele.dk/jys05000/) and use the site fine with your
BitTorrent copy.
 
It is not even as if this is stopping dodgy copies of XP from getting
updates. Just disable the Genuine Advantage ActiveX control
(http://home19.inet.tele.dk/jys05000/) and use the site fine with your
BitTorrent copy.

<sigh> That procedure is for use of the Windows Update tool, not the
AutoUpdate. It also describes downloading the ActiveX Control, which the
machine in question could not do, and then disabling it with Manage Add-ons,
which is a feature only available in more recent versions of IE 6.x...
which that system did not have at the time.

That system is also a company computer and is not, umm, BitTorrent-enabled.
 