Speeding up Windows with a SSD

  • Thread starter: Daniel Prince
Hi!

David said:
This is, I think, the first time I have heard of someone "not trusting"
an SSD.

Hmmmm, I have to admit, I am seeing problems almost every day (or,
better to say, every week). Actually, if we're talking about MLC, there
is not one SSD I can say I trust. SLC is a bit different, and far more
expensive.
Do you have any evidence to back up fears of problems or lost
data,

Yes, a lot of them.
or are you just being conservative and paranoid about relatively
new technology (which is a fair enough attitude, of course)? I have
read of problems with slow-down over time on some SSDs, due to poor
garbage collection, but not of any data losses.

We have replaced a couple of them in a server environment - I don't even
want to think about all the problems I've seen or heard about in the
general market.
As a general point, there is seldom any good reason for having more than
two disks in a raid 1 combination. You get vastly more data security
for your money by using two disks as raid1, and the other disk for
backup copies (preferably attached to a different machine). Then you
have protection from human error, software faults and file system
corruption rather than just protection from the unlikely event of having
two disk failures.

There's no serious implementation of SSD without RAID - and RAID 5 is
the usual implementation for larger environments.
Raid1 write speed is limited by the slowest drive write speed - a write
is not completed until /all/ drives have finished writing (unless you
are using Linux mdadm raid with some extra options).

That depends on the environment.
And most raid
controllers will not optimise accesses for mixes of drive types - they
assume the drives are fairly similar. So you won't get reads from the
SSD - you'll get a mixture.

There's another possibility - Oracle's async implementation, and some
manufacturers (like LSI) are considering and testing those scenarios.

With best regards,

Iggy
 
DevilsPGD said:
In message said:
As both cases happen in practice, the current best bet is to
have a mix of SSD and HDD, even if that is mildly inelegant.

Inelegant in the same fashion the L1, L2 and L3 caches are. It would
be more elegant if all of the RAM was on the same die as the
processor, but it's not Cost Efficient.

SSDs are still a relatively late comer to the mainstream and are already
starting to be affordable enough to consider. A 32 GB SSD is more
expensive than a 32 GB USB thumb drive; this tells me that there is
still plenty of air in the prices even with current manufacturing
processes.

A few years down the road this discussion will be obsolete; everyone
will have an SSD as the primary drive because they will be
CHEAPER than HDDs up to a threshold. The rationale is that most
people buy what is offered to them. If the OEMs won't offer, say,
laptops with spinning disks, you will have to make an extra effort to
even get one to replace the SSD that the devices will ship with. Then
they will invent something else and we will buy it again, and so we
will keep consuming; happy end.
 
David Brown said:
This is, I think, the first time I have heard of someone "not trusting"
an SSD. Do you have any evidence to back up fears of problems or lost
data, or are you just being conservative and paranoid about relatively
new technology (which is a fair enough attitude, of course)? I have
read of problems with slow-down over time on some SSDs, due to poor
garbage collection, but not of any data losses.

Several reasons. One is certainly that it is new and unproven. A
second one is that Flash is basically unreliable. I torture-killed a
Kingston 2GB stick last year and it failed after about 3700 complete
overwrites. That would not be so bad, but what really made me wary
was the failure mode: Wrong data on read, but absolutely no error
message. This despite error detection and correction functionality
in the interface chip used.
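
An overwrite-and-verify test of the kind described above could look
roughly like the following. This is only a minimal sketch, not the
procedure that was actually used on the Kingston stick: the function
name overwrite_and_verify, the 1 MiB block size and the seeded
pseudo-random pattern are all assumptions made for illustration. It
needs Python 3.9 or later for randbytes. Note also that a read-back
straight after a write may be served from the operating system's page
cache rather than from the flash, so a real test would have to drop
caches or re-plug the device between writing and reading.

```python
import hashlib
import os
import random


def overwrite_and_verify(path, size_bytes, passes, block_size=1 << 20):
    """Overwrite `path` with fresh pseudo-random data `passes` times.

    Returns the 1-based number of the first pass whose read-back data
    does not match what was written, or None if every pass verified.
    """
    for pass_no in range(1, passes + 1):
        rng = random.Random(pass_no)  # new, reproducible pattern per pass
        written = hashlib.sha256()
        with open(path, "wb") as f:
            remaining = size_bytes
            while remaining > 0:
                chunk = rng.randbytes(min(block_size, remaining))
                f.write(chunk)
                written.update(chunk)
                remaining -= len(chunk)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device

        read_back = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(block_size), b""):
                read_back.update(chunk)

        # The read itself succeeded; only the checksum can reveal the
        # "wrong data, no error message" failure mode.
        if read_back.digest() != written.digest():
            return pass_no
    return None
```

The point of comparing checksums instead of relying on I/O errors is
exactly the failure mode mentioned above: the device reports success
while returning different data.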

The point of this is that not only are affordable SSDs consumer grade,
which means cheapest possible flash and cheapest possible firmware
development; in addition, the developers themselves have only limited
experience with the technology and there are new failure modes.
For example, with an SSD I can get corruption in a file resulting
from a write to an entirely different file.

I will trust SSD when it either has several years of observed usage
with no significant problems, or the main failure modes
are well published and understood. Of course, the manufacturers
have no interest in killing their new cash-cow, so this will have to
come from other sources. Given how long it took to get this
type of data for HDDs, this may take a while.
As a general point, there is seldom any good reason for having more than
two disks in a raid 1 combination. You get vastly more data security
for your money by using two disks as raid1, and the other disk for
backup copies (preferably attached to a different machine). Then you
have protection from human error, software faults and file system
corruption rather than just protection from the unlikely event of having
two disk failures.

These disks are used for archival storage and I may not be able to
diagnose and fix a problem very fast, hence 3-way. It is akin to RAID6,
but for 3 disks, RAID1 is better than RAID6. If you look at the
probability of a RAID5 or RAID1 failing during resync for a replacement
disk, these numbers are pretty scary for modern disk sizes. And if
that happens, the data is gone. Disks are cheap, me having to manually
reconstruct the machine from backups is not.

Of course all critical data also has an automated, multi-generation
offsite backup.

I also have one of these disks drop out of the RAID about once per
year, without any discernible reason. Not a hardware issue that I
could find, it happens with different MB/RAM/CPU/PSU and disks. They
re-add fine and then run without problem for another year. But when
such a drop happens, I lose RAID redundancy until the resync is
done. With 3-way RAID1 or RAID6, that risk is massively reduced
at very moderate one-time cost.
Raid1 write speed is limited by the slowest drive write speed - a write
is not completed until /all/ drives have finished writing (unless you
are using Linux mdadm raid with some extra options). And most raid
controllers will not optimise accesses for mixes of drive types - they
assume the drives are fairly similar. So you won't get reads from the
SSD - you'll get a mixture.

This is one of the areas where Linux software RAID shines.
The trick is to flag the HDDs as "write-mostly" (mdadm's
--write-mostly option) on array creation. Then you get full SSD read
speeds from a mixed RAID 1 array. Please believe me that I looked at
exactly this question before. With "write-mostly" set on the HDDs,
they are only read during consistency checks and if the SSD fails.
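
For reference, in mdadm this behaviour is requested with the
--write-mostly flag, which marks the devices listed after it as ones
the md driver should avoid reading from; in a mixed array that means
the HDDs. The sketch below only builds the command line and does not
run anything. The helper name mdadm_create_args and the device paths
are made up for illustration, and the exact invocation should be
checked against the mdadm man page before use.

```python
def mdadm_create_args(md_device, ssd, hdds):
    """Build (but do not run) an mdadm command for a mixed RAID 1 array.

    The SSD is listed first as a normal member; every device listed
    after --write-mostly (here: the HDDs) is flagged so that reads are
    served from the SSD whenever possible.
    """
    return [
        "mdadm", "--create", md_device,
        "--level=1",
        f"--raid-devices={1 + len(hdds)}",
        ssd,
        "--write-mostly", *hdds,
    ]


# Hypothetical device names, for illustration only.
print(" ".join(mdadm_create_args("/dev/md0", "/dev/sda1",
                                 ["/dev/sdb1", "/dev/sdc1"])))
```

Writes still go to every member, so write speed stays at HDD level as
noted in the thread.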

Write speeds are still HDD-level, quite correct, and I stated that.
For maildir data, an SSD is ideal - you are accessing a large number of
small files. I would drop the HDDs from this setup - use them for
backup. They'll only slow down the SSD.

They do not, see above.
Finally, maildir works fine for far larger boxes than 500 mails. Our
mailserver has many hundreds of thousands of mails in maildir format,
with some boxes being in the order of ten thousand emails. The trick is
to use a file system that is good at working with large directories, and
software that uses indexes (such as dovecot).

Any recommendations as to which filesystem? Ext3 with default options
definitely does not cut it. And I use mutt direct access, which also does
not cut it. With an SSD, it's fine though. ;-)

Arno
 
hanukas said:
Inelegant in the same fashion the L1, L2 and L3 caches are. It would
be more elegant if all of the RAM was on the same die as the
processor, but it's not Cost Efficient.

Good comparison.
SSDs are still a relatively late comer to the mainstream and are already
starting to be affordable enough to consider. A 32 GB SSD is more
expensive than a 32 GB USB thumb drive; this tells me that there is
still plenty of air in the prices even with current manufacturing
processes.
A few years down the road this discussion will be obsolete; everyone
will have an SSD as the primary drive because they will be
CHEAPER than HDDs up to a threshold. The rationale is that most
people buy what is offered to them. If the OEMs won't offer, say,
laptops with spinning disks, you will have to make an extra effort to
even get one to replace the SSD that the devices will ship with. Then
they will invent something else and we will buy it again, and so we
will keep consuming; happy end.

Quite possibly. Although I guess magnetic storage will be around
for a long time yet. It is just dirt-cheap when you need lots
of space for movies, backups, and the like, where access is mostly
linear and size is what matters.

Actually, even today SSDs are closer than HDDs to the size range
you need for fast storage. If the price comes
down sufficiently, I can even imagine things like mainboards
coming with a 128GB or so SSD on-board as primary drive.
That would bring down cost even more.

Arno
 
David Brown said:
If the SSD is losing data, then it is either a bad disk, or a bad
manufacturer.

Or a new technology. You need about a decade of widespread
practical use to really understand its characteristics and
failure modes. Often they are surprising, hence not part of
the model you have been using for the old technology that
gets replaced.

Historical example: Steam-engines. People thought, "oh, great,
no more moody, unreliable horses or low-powered water-wheels."
All true, but these things can explode! And spew hot steam!
It took quite a while and many, many deaths to work this
kink out.

So until SSD has that decade, I will be wary. I will
probably start to trust it a bit more at the 5 year mark,
which will be something like 4.5 years in the future.
It is true that SSDs will wear out (though you would be /very/ hard
pushed to wear out a modern SLC drive in real life usage). In
particular, a very heavily written-to MLC drive will begin to wear out
after a few years.
But wear-out will not cause data losses, assuming the controller is
working correctly. It will simply lead to "bad blocks" that can no
longer be used, and thus less space on the drive. The controller will
mark the blocks as unreliable long before they become unreadable, and
will move the data to a different block.

If everything works. Remember this is new technology. The main
reason magnetic disks are so reliable is that they are in many
respects only incremental improvements.
Failures do happen, but if you are getting a lot of disk failures, then
I would start looking for a new SSD supplier - they should be a lot more
reliable than HDDs.

"Should" is a risky assumption with new technology. Even the
manufacturers currently do not know how far they can cut cost
and retain reasonable quality. As for trust in reputable
manufacturers, I think there are enough examples within and
outside of the storage industry (BP comes to mind) that show this
is not a reliable criterion.

Arno
 
David Brown said:
On 19/11/2010 14:59, Arno wrote: [...]
"Should" is a risky assumption with new technology. Even the
manufacturers currently do not know how far they can cut cost
and retain reasonable quality. As for trust in reputable
manufacturers, I think there are enough examples within and
outside of the storage industry (BP comes to mind) that show this
is not a reliable criterion.
There are ways for manufacturers to get some reasonable ideas about the
reliability of components, though I agree that you can't be sure about
long-term reliability without long-term testing. For flash disks,
testing includes high-temperature environments and continuous
erase/write cycles on the same block as a way of learning about
endurance. What that gives you is points on a graph, that can be
interpolated and extrapolated to estimate real-world results.

I agree.
However, the question of data loss here is not really an issue with the
flash endurance or reliability - it's a software issue in the
controller. Either the controller correctly handles bad blocks when
they occur, or it does not - standard software testing and quality
control procedures should give the manufacturers a very good idea of the
software's reliability. Then it's just a matter of whether you trust
the manufacturer to do its job properly. Pick a manufacturer who aims
to sell SSDs to enterprise and high reliability customers - their
reputation will be so important to them that they will make the required
efforts even for non-enterprise drives.

I don't think they can produce high-quality SSD firmware
at this time, no matter the effort.
The BP example is very different. They did /not/ have a good reputation
- they were well known for taking risky shortcuts and putting profits
and costs before safety or environmental issues. And people at all
levels - from the lowest level worker on the rig to the top executives -
were aware that corners were cut in the interests of time and money. BP
knew what they were doing, and knew it was risky. The big failure here
is that no one stopped them, or publicised the problems.

And here we have the issue. If the problem does not get publicised,
then their reputation is not an adequate reflection of their
production quality standards.

Anyways, past experience shows that HDD manufacturers tend to lie
about problems with their products. That is why I do not trust
current consumer grade SSDs. Server grade SLC is a bit different,
but even they sometimes use consumer-grade controller chips, so
they are not that trustworthy either.

Arno
 
Mike Tomlinson said:
Have we not had that with flash memory (Compact Flash, Memory Stick, et
al?)

On the cell technology side for SLC and 4 value MLC, yes.
On the flash-chip-market side (people putting in spot-market
chips and partially defective chips relying on ECC), on the
interface-chip side, on the >4 value MLC side and on the
firmware side, no.

Embedded SLC flash with IDE interface and no wear-leveling
I do trust to some extent. There are of course limits to
what you can expect, but I do trust these are now reasonably
well understood.

Still, maintaining the same level of distrust that current HDDs
deserve _should_ be good enough to protect you from SSD
disasters. I just think they do not (yet) deserve a higher
level of trust.

Arno
 
USB flash sticks are very different from SSDs. They are not made with
anything like the levels of reliability you get from even the cheapest
SSDs. A flash stick is a low-usage device that is not made for primary
storage - it's for copying files around. It is expected that long
before you get wear-out issues, you will have either lost the stick or
replaced it with a bigger and cheaper model.

That sounds like "hopeful engineering" to me.
I agree that Raid 1 is a solid choice for a reliable drive set. Raid 6
is really just a bug-fix for Raid 5 because of the risks of a second
failure during a rebuild - you don't need that bug-fix for Raid 1.

Actually, you may. Run the numbers. For a best-case scenario, you
do not really. But if anything else goes wrong, and something already
has gone wrong when a disk falls out of a RAID so it is not best-case,
it may prove insufficient even for RAID1. I ran some concrete numbers
with students last semester, and they were scary even for RAID 1.
Keep in mind that you also have to model the time until the
RAID rebuild starts.
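
A back-of-the-envelope version of "running the numbers" might look
like the sketch below. It is not the calculation done with the
students; it is a deliberately crude model under loudly stated
assumptions: an unrecoverable read error (URE) rate of 1e-14 per bit
(a figure often quoted for consumer HDDs, assumed here), independent
exponentially distributed disk failures, and the pessimistic rule
that a surviving copy counts as lost if it either dies or hits any
URE during the exposure window. All function names are made up. The
exposure window includes the wait before the rebuild starts, as
suggested above.

```python
import math


def p_ure(disk_bytes, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error when reading
    a whole disk once (assumed independent per-bit errors)."""
    bits = disk_bytes * 8
    return -math.expm1(bits * math.log1p(-ure_per_bit))


def p_disk_failure(hours, mtbf_hours):
    """Chance a single disk dies within `hours` (exponential model)."""
    return -math.expm1(-hours / mtbf_hours)


def p_rebuild_loss(survivors, disk_bytes, wait_hours, rebuild_hours,
                   mtbf_hours, ure_per_bit=1e-14):
    """Rough chance that a degraded RAID 1 with `survivors` remaining
    copies loses data before the replacement disk is synced.

    Pessimistic simplification: data counts as lost only if every
    surviving copy is 'bad', where bad means the disk fails outright
    during wait + rebuild or hits a URE anywhere during the read.
    """
    exposure = wait_hours + rebuild_hours
    p_fail = p_disk_failure(exposure, mtbf_hours)
    p_bad = 1 - (1 - p_fail) * (1 - p_ure(disk_bytes, ure_per_bit))
    return p_bad ** survivors
```

Under these assumptions, reading a 2 TB disk end to end gives a URE
probability of roughly 0.15, which is where the trouble for a
two-disk mirror comes from; with two surviving copies (3-way RAID 1)
the same model squares that figure. The inputs are illustrative, so
the shape of the result matters more than the exact values.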

But design as you like. For me the added reliability and lack of
time pressure in fixing a degraded array is well worth the really
low cost of an additional disk. And the set-up effort is increased
basically by the effort to physically mount the additional disk.
If you are getting that sort of level of failures, then I can well see
the point of 3-way Raid 1. And if you are happy with the costs of the
drives, then 3-way Raid 1 is much better than Raid 6.
Agreed.
Yes, you can do exactly this with Linux mdadm raid - I had sort of
assumed you had a boring "standard" raid 1 setup!

Hehe, no. That would not have made a lot of sense, as you
rightfully pointed out.
I get the feeling that there was a thread on exactly this setup not too
long ago in this group.

I might have posted this here before.
You can in fact improve on that too, using a write bitmap, but then you
introduce a slight risk if there is a power-failure or other crash
during a write and a disk failure at the same time.

Quite possible. For my application "slow" writes are entirely fine
though.
A key issue is finding the files when there are lots of files in a
directory. You want a file system that has indexes and hashes.
Old-style ext3 does not keep such indexes - you need to tune it with the
dir_index option. ext4 always uses indexes, as do many other file
systems such as Reiserfs3, XFS and btrfs.
You also want to mount it with the noatime option (or at the very least,
relatime) to avoid timestamping on every file access. mutt has
apparently some issues with noatime, but it should work.
Probably btrfs is the fastest choice, but I doubt if you would consider
it as tried and tested technology :-)

Not at this time ;-)
Personally, I dislike the idea of a mail store on a client machine
anyway - I keep the mail on a server and use imap to access it. Then I
can let dovecot handle the file access, using its indexes.

This is sort-of a server. The entire set-up is not too clean,
I am ready to admit that. I could also have gone back to old
mbox format, but wanted to see whether this newfangled SSD
thing could actually help without potentially compromising
reliability. Turns out for speeding up reads dramatically
with a relatively simple change, it really shines here.

Arno
 
Arno said:
I torture-killed a Kingston 2GB stick last year
and it failed after about 3700 complete overwrites.

I think all flash memory is rated at 100,000 (or more) write cycles.
How many write cycles did Kingston rate their product at? If you
contacted Kingston, what did they say about the failure?
 
Mark F said:
On Tue, 23 Nov 2010 16:11:19 -0800, Daniel Prince

Multi-level stuff is nowhere near that level, closer to 10,000.
In fact, I think the newer stuff is 5,000, but perhaps this is for
3 bits per cell stuff.

In addition, most wear-leveling causes write amplification (more
writes to the chips than are done to the device) anywhere
in the 1.2 ... 5.0 range. You can buy SLC up to 1'000'000 cycles,
but the price is impressive.
For 10 to 20 years the expected number of cycles per cell went up,
but for the last 4 or so years the number of cycles seems to have
decreased for both 1 and 2 bits per cell chips (3 bits per cell
haven't been around long). There may be some trade-offs, and perhaps
the RANGE of expected number of cycles has gone up, so with error
correction and wear leveling perhaps the average number of cycles per
cell has increased, but I haven't seen numbers directly from the
manufacturers that state that. (There is some chance that the
manufacturers are requiring lower probability of failure in the
stated lifetimes, but I haven't seen anything that says this.)
Note that at least 3 things are being talked about:
. how many times any single cell can be cycled with very high
probability of it working
. the average number of times a cell can be cycled, with very
close to 0 probability that the average is not reached for the chip as
a whole
. the average number of times things can be cycled and there is
still a very close to 0 probability that the data cannot be
recovered using ECC.

And what happens if ECC capabilities are exceeded? The really
worrisome bit in my test is that there was wrong data but no
error message. I had the same experience with a 1.5 year old
Knoppix image on a PQI stick: The data read varied like
crazy (on average 50% of the whole drive changed values between
reads) but no error message at all. It was also not a hardware
defect, because after one overwrite the stick was fine again
and is still in use, albeit not trusted a lot.
For comparison, SanDisk gives a number for some of their products
which is the bytes written to a device. There is also a factor
for how long the data is retained after the last rewrite. SanDisk
uses 1 year retention (they don't state how long data is retained
early on in the life of cells.) A typical number for a SanDisk
high end product is given in:
http://www.sandisk.com/media/769368/SSD_G4_FINAL_Web.pdf
160TB/240GB, which is about 700 cycles. This should not be taken to
mean that an individual cell is only good for 700 cycles.

Hence the total overwrite test. The lack of hard data is
another thing that makes me mistrust SSDs at this time.

Also note that HDDs have 5 years data retention (and I have had
one PQI stick that indeed had lost all data after 1.5 years,
without being defective) and HDDs can be written as much as you
like. In specific situations, those 160TB can be spent in as
little as a year as well (data buffer, journalling device,
heavy swapping,...). It is an additional factor that has to be
taken into account, and both write endurance and data retention
are nowhere near as good as what HDDs provide. So it can lead to
failures, even if the vendors are honest and give you the
figures.
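
The arithmetic behind the 160TB/240GB figure, and behind "spent in as
little as a year", can be sketched as follows. The function names are
made up, decimal units (1 TB = 1e12 bytes) are assumed, and the 160TB
rating is assumed to count host writes rather than writes to the
flash itself. The write-amplification factor is an input taken from
the 1.2 to 5.0 range mentioned earlier in the thread, not a measured
value.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def full_drive_writes(endurance_bytes, capacity_bytes):
    """How many times the whole drive can be overwritten by the host."""
    return endurance_bytes / capacity_bytes


def flash_cycles(endurance_bytes, capacity_bytes, write_amplification=1.0):
    """Average erase/write cycles per cell once the host has written
    `endurance_bytes`, given an assumed write-amplification factor."""
    return write_amplification * endurance_bytes / capacity_bytes


def years_to_exhaust(endurance_bytes, host_bytes_per_second):
    """Years of sustained host writes until the rating is used up."""
    return endurance_bytes / host_bytes_per_second / SECONDS_PER_YEAR


TB, GB, MB = 1e12, 1e9, 1e6

print(full_drive_writes(160 * TB, 240 * GB))    # about 667 drive overwrites
print(flash_cycles(160 * TB, 240 * GB, 5.0))    # about 3333 cycles at WA 5.0
print(years_to_exhaust(160 * TB, 5 * MB))       # about 1 year at 5 MB/s
```

So a sustained average of only about 5 MB/s of writes is enough to
use up the rated 160TB within a year, which fits the workloads listed
above (data buffer, journalling device, heavy swapping).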

Arno
 