smartmontools wrecked my SATA RAID array

  • Thread starter: Tom Del Rosso

Tom Del Rosso

This happened last year, but I want to ask a related question. Here's the
story and my theory of the cause:

I ran smartmontools with the -d sat option to test USB drives. I got the
letter wrong at the end of /dev/sdx but that's normally harmless. If that
happens when testing an internal SATA drive and I point it at the SATA RAID
drive, then it just says the drive doesn't support SMART.

But this time I was using the USB switch and I pointed it at the RAID drive
by mistake. The SATA RAID driver emulates SCSI, and USB drives use SCSI
commands, so the RAID driver got confused. I got an error message, but
maybe not the same one. Everything worked until I had to shut down the next
day, after which it wouldn't boot. The BIOS wouldn't recognize either drive
alone or together. Obviously the RAID driver is buggy. The BIOS is as
well, since it wouldn't even allow me to go into setup with the drives
connected.

Yes it was backed up, but not that day. After trying some other things I
used Clonezilla to copy one RAID drive to a new drive. It booted as a
single drive. To my surprise the original booted as a single drive too. On
one hand it's good that CZ fixed it, but on the other I would rather it
didn't touch an original drive generally. There are a couple of CZ options
that seem related but I probably used them wrong, which was a good thing.

So my question is, how can you determine the device letter for /dev/sdx
without guessing? Maybe Sysinternals Winobj.exe but it gives numbers for
the drives. This would also be a great convenience when testing drives that
are connected temporarily.
 
Christian said:
You could report the USB ID of the device to smartmontools-database at
sourceforge.net or add it here:
http://sourceforge.net/apps/trac/smartmontools/wiki/Supported_USB-Devices

Then it will be added to smartmontools drive database and option -d
sat is no longer needed. This also reduces the risk that a buggy driver
accidentally receives a SAT command.

That's good. I usually don't run the latest though.

Which Motherboard and SATA RAID driver is this?

Asus A8N-SLI Premium. I don't know what driver version it was, as I'm not
using it any more.

So my question is, how can you determine the device letter for
/dev/sdx without guessing? Maybe Sysinternals Winobj.exe but it
gives numbers for the drives. This would also be a great
convenience when testing drives that are connected temporarily.

The pseudo device names [/dev/]sda, sdb, ..., and pd0, pd1, ... map to
Windows device names \\.\PhysicalDrive0, ...1, ... (see smartctl man
page).
You can see the assigned numbers in Windows disk management. Run
diskmgmt.msc from console to start it directly.

Thanks. I wasn't sure if the letters corresponded directly to the numbers.
But what about sda vs hda in the case of both PATA and SATA?
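
To make the letter-to-number correspondence concrete, here is a minimal Python sketch of the mapping described above. It assumes only the simple case from the smartctl man page, where sda, sdb, ... and pd0, pd1, ... correspond to \\.\PhysicalDrive0, \\.\PhysicalDrive1, ... The function name is made up for illustration and is not part of smartmontools.

```python
import re


def smartctl_name_to_windows_path(name: str) -> str:
    """Map a smartctl pseudo device name to a Windows physical drive path.

    Hypothetical helper for illustration; hdX names are not covered here.
    """
    # Accept the optional /dev/ prefix, as in "[/dev/]sda".
    short = name[5:] if name.startswith("/dev/") else name

    # pdN already carries the drive number.
    m = re.fullmatch(r"pd(\d+)", short)
    if m:
        return rf"\\.\PhysicalDrive{int(m.group(1))}"

    # sdX: a single letter counted from 'a' (sda -> 0, sdb -> 1, ...).
    m = re.fullmatch(r"sd([a-z])", short)
    if m:
        return rf"\\.\PhysicalDrive{ord(m.group(1)) - ord('a')}"

    raise ValueError(f"not a recognized pseudo device name: {name!r}")
```

So /dev/sdd would be the fourth disk, the one shown as Disk 3 in diskmgmt.msc. Checking that number before adding -d sat is one way to avoid pointing the command at the wrong drive.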
 

For something low-risk, use the -i command ("identify").
I do that. I have to say that the problem you observed should
only happen with a very broken RAID controller. There are
some out there, no doubt.

Arno
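
A rough Python sketch of the "identify first" habit, for anyone who wants to script it. It only uses the -i and -d options mentioned in this thread; the function names are made up for illustration, and the parsing is a loose guess at "Key: value" lines rather than anything smartmontools defines.

```python
import subprocess


def identify_command(device, dev_type=None):
    """Build a smartctl command line that only asks for identity info (-i)."""
    cmd = ["smartctl", "-i"]
    if dev_type is not None:
        # e.g. dev_type="sat" for a USB bridge, as in "-d sat"
        cmd += ["-d", dev_type]
    cmd.append(device)
    return cmd


def parse_info(text):
    """Collect 'Key: value' lines into a dict (loose parse, splits on the first colon)."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            info[key.strip()] = value.strip()
    return info


def identify(device, dev_type=None):
    """Run the identify-only command and return whatever fields it printed."""
    done = subprocess.run(
        identify_command(device, dev_type), capture_output=True, text=True
    )
    return parse_info(done.stdout)
```

The idea would be to call identify("/dev/sdd", "sat") and look at the model it reports before running anything heavier such as -a. Which keys show up depends on the drive and the smartctl version, so treat the dict as a hint, not a guarantee.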
 
Arno said:
For something low-risk, use the -i command ("identify").

I tried "smartctl -i /dev/sda"

Short INQUIRY response, skip product id
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

Now "smartctl -a /dev/sda" gives the same thing. What's going on?

Everest works on sda.

"smartctl -a -d sat /dev/sdd" works, but it can't read my SATA drives now.


I do that. I have to say that the problem you observed should
only happen with a very broken RAID controller. There are
some out there, no doubt.

I always suspected that a company like NVidia wouldn't know jack about RAID.
 
I tried "smartctl -i /dev/sda"
Short INQUIRY response, skip product id
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.

That points to a very, very broken implementation indeed.
Usually (admittedly I use smartctl primarily on Linux),
-i even succeeds on devices without SMART support, as
it uses very basic ATA commands.
Now "smartctl -a /dev/sda" gives the same thing. What's going on?
Everest works on sda.
"smartctl -a -d sat /dev/sdd" works, but it can't read my SATA drives now.
I always suspected that a company like NVidia wouldn't know jack
about RAID.

More like they do not know jack about ATA, which is worse.

Have you tried googling this?

Arno
 
Arno said:
Have you tried googling this?

It just happened. Smartctl worked a day or 2 ago.

I just installed the latest smartmontools, and now it works. And Everest
worked even when I was getting those error messages. On the other hand,
smartctl worked on the USB drive during the problem period. It's the same
mobo but I'm not using the RAID controller any more.
 
Christian Franke said:
The pseudo device names [/dev/]sda, sdb, ..., and pd0, pd1, ... map to
Windows device names \\.\PhysicalDrive0, ...1, ... (see smartctl man page).

You can see the assigned numbers in Windows disk management.

wmic is handy for this too:

C:\>wmic diskdrive list brief

Caption                  DeviceID            Model                    Partitions  Size
Hitachi HUA722020ALA330  \\.\PHYSICALDRIVE1  Hitachi HUA722020ALA330  1           2000396321280
M4-CT128M4SSD2           \\.\PHYSICALDRIVE0  M4-CT128M4SSD2           1           128034708480

There's also an option to show which drive letter is mapped to which \\.\PhysicalDrive.
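
If you want the smartctl names straight from that listing, something like the following Python sketch could do it. It just scans text for \\.\PHYSICALDRIVEn and turns each number into a name; the function name is made up, and the single-letter sda..sdz range with a pdN fallback is my assumption about what is convenient, not something wmic or smartmontools prescribes.

```python
import re


def smartctl_names_from_wmic(text: str) -> dict:
    """Return {drive number: smartctl-style name} for each DeviceID in the text."""
    names = {}
    # Match \\.\PHYSICALDRIVE<n> regardless of letter case.
    for match in re.finditer(r"\\\\\.\\PHYSICALDRIVE(\d+)", text, re.IGNORECASE):
        number = int(match.group(1))
        if number < 26:
            # 0 -> /dev/sda, 1 -> /dev/sdb, ...
            names[number] = "/dev/sd" + chr(ord("a") + number)
        else:
            # Beyond 'z', fall back to the numeric pdN form.
            names[number] = f"/dev/pd{number}"
    return names
```

Fed the two lines of output above, this would pair PHYSICALDRIVE1 with /dev/sdb and PHYSICALDRIVE0 with /dev/sda. The text could come from subprocess.run(["wmic", "diskdrive", "list", "brief"], capture_output=True, text=True).stdout on a Windows box that still ships wmic.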
 
Mike said:
Christian Franke said:
The pseudo device names [/dev/]sda, sdb, ..., and pd0, pd1, ... map
to Windows device names \\.\PhysicalDrive0, ...1, ... (see smartctl
man page).

You can see the assigned numbers in Windows disk management.

wmic is handy for this too:

C:\>wmic diskdrive list brief

Caption                  DeviceID            Model                    Partitions  Size
Hitachi HUA722020ALA330  \\.\PHYSICALDRIVE1  Hitachi HUA722020ALA330  1           2000396321280
M4-CT128M4SSD2           \\.\PHYSICALDRIVE0  M4-CT128M4SSD2           1           128034708480

There's also an option to show which drive letter is mapped to which
\\.\PhysicalDrive.


Ok, but what about hdx and sdx, especially when both PATA and SATA are
present?
 
Christian said:
hdX only exists for backward compatibility with early versions of
smartmontools (for Win9x/ME, which did not provide \\.\PhysicalDriveN).
With hdX, the ATA protocol is assumed; sdX first checks for ATA or
SCSI/SAS protocol unless a -d TYPE option is specified.

PATA and SATA transport do not make much difference from
smartctl's point of view. Both use the same ATA commands and the same
pass-through I/O controls.
Some SATA specific logs like Phy Event Counters ("smartctl -l
sataphy") do not exist on PATA drives, of course.
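
Read as a decision rule, the naming scheme just described might look like this in Python. It is only a paraphrase of the paragraph above, not smartmontools code; the function name and the returned strings are invented for illustration.

```python
def protocol_choice(name, dev_type=None):
    """Paraphrase of the rule above: which protocol gets tried for a device name."""
    short = name[5:] if name.startswith("/dev/") else name
    if dev_type is not None:
        # An explicit -d TYPE overrides any guessing.
        return dev_type
    if short.startswith("hd"):
        # hdX: ATA is assumed.
        return "ata"
    if short.startswith("sd"):
        # sdX: probe for ATA or SCSI/SAS.
        return "autodetect"
    raise ValueError(f"unexpected device name: {name!r}")
```

On this reading the hd/sd prefix says nothing about PATA versus SATA; it only decides whether the protocol is assumed or probed.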

I thought PATA drives are always specified by hdx. I'm pretty sure /dev/sdx
has always failed when I was looking at a PATA drive. Clonezilla also
refers to PATA drives as hdx and SATA as sdx.
 
Christian said:
Under Linux /dev/hdX is (AFAIK) used for the traditional IDE drivers.
These are typically used for older PATA controllers or for SATA
controllers set to IDE mode.
/dev/sdX is traditionally used for the SCSI stack which now also
includes the newer "libata" drivers which support AHCI etc.

If Clonezilla follows that convention then isn't it possible that older
versions of smartmontools did also? I usually keep a version for a long
time.
 