Mike Tomlinson
Having some trouble with Linux software RAID after an OS update, and
would be grateful for any insights.
Machine is an AMD 64-bit PC running 32-bit Linux. The machine was
previously running Fedora Core 4 with no problems. Two 500GB hard
drives were added to the onboard Promise controller and the Promise
section of the machine's BIOS configured for JBOD.
On boot, as expected, two new SCSI disk devices could be seen - sda and
sdb. These were partitioned using fdisk: a single partition occupying
the entire disk was created on each, and the partition type set to 0xfd
(Linux RAID autodetect).
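For reference, the partitioning was done interactively; roughly (from
memory) the sequence on each disk was something like:

fdisk /dev/sda    # n, p, 1, accept the default start/end, then t, fd, w
fdisk /dev/sdb    # the same sequence on the second disk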
mdadm was used to create a RAID1 (mirror) using /dev/sda and /dev/sdb.
I can't remember for certain if I used the raw devices (/dev/sda) or the
partitions (/dev/sda1) to create the array, and my notes aren't clear.
The resulting RAID device, /dev/md0, had an ext3 filesystem created on
it and was mounted on a mount point. /etc/fstab was edited to mount
/dev/md0 on boot.
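From memory, the setup was something close to the following - the mount
point name here is only illustrative, and as noted above I may have used
the whole disks (/dev/sda, /dev/sdb) rather than the partitions:

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
mkfs.ext3 /dev/md0
mkdir /mnt/array    # illustrative mount point, not the real name
echo '/dev/md0 /mnt/array ext3 defaults 1 2' >> /etc/fstab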
This arrangement worked well until recently, when the root partition on
the (separate) boot drive was trashed and Fedora Core 6 installed by
someone else, so I have only their version of events to go by. The
array did not reappear after FC6 was installed. The /etc/raidtab and/or
/etc/mdadm.conf files were not preserved, so I am working blind to
reassemble and remount the array.
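My understanding is that a config file can be regenerated from the
on-disk superblocks with something like:

mdadm --examine --scan > /etc/mdadm.conf

so the missing file should not in itself be fatal.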
Now things are confused. The way Linux software RAID works seems to
have changed in FC6. On boot, dmraid is run by rc.sysinit; it discovers
the two members of the array OK and maps the set to
/dev/mapper/pdc_eejidjjag, where pdc_eejidjjag is the array's name:
[root@linuxbox root]# dmraid -r
/dev/sda: pdc, "pdc_eejidjjag", mirror, ok, 976562500 sectors, data@ 0
/dev/sdb: pdc, "pdc_eejidjjag", mirror, ok, 976562500 sectors, data@ 0
[root@linuxbox root]# dmraid -ay -v
INFO: Activating mirror RAID set "pdc_eejidjjag"
ERROR: dos: partition address past end of RAID device
[root@linuxbox root]# ls -l /dev/mapper/
total 0
crw------- 1 root root 10, 63 Jul 5 16:59 control
brw-rw---- 1 root disk 253, 0 Jul 6 03:11 pdc_eejidjjag
[root@linuxbox root]# fdisk -l /dev/mapper/pdc_eejidjjag
Disk /dev/mapper/pdc_eejidjjag: 500.0 GB, 500000000000 bytes
255 heads, 63 sectors/track, 60788 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
                    Device Boot      Start         End      Blocks   Id  System
/dev/mapper/pdc_eejidjjag1               1       60801   488384001   fd  Linux raid autodetect
I cannot mount /dev/mapper/pdc_eejidjjag1:
[root@linuxbox root]# mount -v -t auto /dev/mapper/pdc_eejidjjag1
/mnt/test
mount: you didn't specify a filesystem type for
/dev/mapper/pdc_eejidjjag1
I will try all types mentioned in /etc/filesystems or
/proc/filesystems
Trying hfsplus
mount: special device /dev/mapper/pdc_eejidjjag1 does not exist
'fdisk -l /dev/mapper/pdc_eejidjjag' shows that one partition of type
0xfd (Linux raid autodetect) is filling the disk. Surely this should be
type 0x83, since the device is the RAIDed disk as presented to the user?
And why does mount say the device /dev/mapper/pdc_eejidjjag1 does not
exist?
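One thought: perhaps the partition mapping is simply never being created
for the device-mapper device. I believe kpartx (from the device-mapper
multipath tools) can add partition mappings by hand, though I have not
tried it yet:

kpartx -av /dev/mapper/pdc_eejidjjag    # should create /dev/mapper/pdc_eejidjjag1 if the partition table is readable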
This may be due to my unfamiliarity with dmraid. I can find little
about it on the internet. I'm uncertain if it is meant to be used in
conjunction with mdadm, or whether it's either/or. In the past, Linux
software RAID has Just Worked for me using mdadm.
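As far as I can tell, 'dmraid -s' lists any active dmraid sets and
'dmsetup table' shows the current device-mapper mappings, so those seem
like the right tools for checking what state things are in:

dmraid -s
dmsetup table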
If I disregard dmraid, deactivating the set with 'dmraid -an
pdc_eejidjjag', and use the more familiar mdadm instead, first checking
with fdisk that the disks have the correct RAID autodetect partitions:
[root@linuxbox root]# fdisk -l /dev/sda
Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1       60801   488384001   fd  Linux raid autodetect
[root@linuxbox root]# fdisk -l /dev/sdb
Disk /dev/sdb: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1       60801   488384001   fd  Linux raid autodetect
and then try to assemble the RAID from those partitions, it fails:
[root@linuxbox root]# mdadm -v --assemble /dev/md0 /dev/sda1 /dev/sdb1
mdadm: looking for devices for /dev/md0
mdadm: cannot open device /dev/sda1: No such device or address
mdadm: /dev/sda1 has no superblock - assembly aborted
Perhaps I should be using the raw devices?
[root@linuxbox root]# mdadm -v --assemble /dev/md0 /dev/sda /dev/sdb
mdadm: looking for devices for /dev/md0
mdadm: /dev/sda is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdb is identified as a member of /dev/md0, slot 1.
mdadm: added /dev/sdb to /dev/md0 as 1
mdadm: added /dev/sda to /dev/md0 as 0
mdadm: /dev/md0 has been started with 2 drives.
[root@linuxbox root]# mdadm -E /dev/sda
/dev/sda:
Magic : a92b4efc
Version : 00.90.01
UUID : c4344083:a8d8cf32:3f00e0db:8765b21b
Creation Time : Thu Mar 22 15:26:52 2007
Raid Level : raid1
Device Size : 488386496 (465.76 GiB 500.11 GB)
Array Size : 488386496 (465.76 GiB 500.11 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Update Time : Thu Jul 5 16:58:02 2007
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : 864ad759 - correct
Events : 0.4
      Number   Major   Minor   RaidDevice State
this     0       8        0        0      active sync   /dev/sda
   0     0       8        0        0      active sync   /dev/sda
   1     1       8       16        1      active sync   /dev/sdb
[root@linuxbox root]# mdadm -E /dev/sdb
/dev/sdb:
Magic : a92b4efc
Version : 00.90.01
UUID : c4344083:a8d8cf32:3f00e0db:8765b21b
Creation Time : Thu Mar 22 15:26:52 2007
Raid Level : raid1
Device Size : 488386496 (465.76 GiB 500.11 GB)
Array Size : 488386496 (465.76 GiB 500.11 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Update Time : Thu Jul 5 16:58:02 2007
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : 864ad76b - correct
Events : 0.4
      Number   Major   Minor   RaidDevice State
this     1       8       16        1      active sync   /dev/sdb
   0     0       8        0        0      active sync   /dev/sda
   1     1       8       16        1      active sync   /dev/sdb
so that looks OK. Let's see what /dev/md0 looks like:
[root@linuxbox root]# fdisk -l /dev/md0
Disk /dev/md0: 500.1 GB, 500107771904 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
   Device Boot      Start         End      Blocks   Id  System
/dev/md0p1              1       60801   488384001   fd  Linux raid autodetect
That doesn't look right; I would have expected to see a partition of
type 0x83, since /dev/md0p1 is the RAID as presented to the user
according to fdisk. Trying to mount it anyway:
[root@linuxbox root]# mount -v -t auto /dev/md0 /mnt/test
mount: you didn't specify a filesystem type for /dev/md0
I will try all types mentioned in /etc/filesystems or
/proc/filesystems
Trying hfsplus
mount: you must specify the filesystem type
[root@linuxbox root]# mount -v -t auto /dev/md0p1 /mnt/test
mount: you didn't specify a filesystem type for /dev/md0p1
I will try all types mentioned in /etc/filesystems or
/proc/filesystems
Trying hfsplus
mount: special device /dev/md0p1 does not exist
mdadm --examine /dev/sd* shows both members of the array as correct,
with the same array UUID, and /proc/mdstat shows the array as complete
and OK with two members, as expected:
[root@linuxbox root]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sda[0] sdb[1]
488386496 blocks [2/2] [UU]
unused devices: <none>
I'm confused. I can't find much information on dmraid; the man page
seems to imply that it's for use with hardware RAID controllers, and I
don't know if I should be using that or mdadm, or both. Previously I
just used mdadm and everything Just Worked.
I don't know why assembling and starting the array doesn't present the
contents of the md device as expected, and why fdisk shows special
devices in /dev which the mount command says don't exist.
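One further check I intend to make before doing anything drastic is to
look at what is actually at the start of the assembled device, in case
the ext3 superblock is present but sitting behind a partition table:

file -s /dev/md0    # should report an ext2/ext3 superblock if the filesystem starts at sector 0
blkid /dev/md0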
The user of the machine is getting worried as there's a lot of data on
this array, and of course, he has no backup.
I'm at the point of taking the disks out and trying them in a machine
running FC4. Any ideas or suggestions please before I do that?