Re: Windows RAID



On 11/07/2011 07:05, Yousuf Khan wrote:
On 10/07/2011 6:46 PM, David Brown wrote:
On 10/07/11 22:30, Yousuf Khan wrote:
On 09/07/2011 8:47 AM, David Brown wrote:
On 08/07/11 23:24, Yousuf Khan wrote:
I don't think it's even too useful to experts, whether it's an odd
occasion or not. Unix experts find it useful because that's the way
things were done on Unix for years, and it was necessary there. A lot of
repair on Unix is done manually after the basic automatic repairs
fail (like using alternative superblocks to fix a badly munched
filesystem, which requires human intelligence). On Windows that sort of
manual repair work isn't necessary; the utilities are a bit more
intelligent.


Again, I can't answer for UNIX, only for Linux. The days of manual
repair are long gone. Just as in the DOS/Windows world, there used
to be
a time when filesystems were simpler, disks smaller, and file formats
clearer - then manual disk repair or file recovery was practical.

To me, Linux and Unix are the same thing. Just get used to it, when I
say Unix, I'm also talking about Linux.

OK. Most of the time, it's fine to talk about *nix covering Unix, Linux,
the BSD's, and other related OS's. But given your experience with
Solaris (which really is UNIX), and your references to older limitations
of Unix, I thought you were making a distinction.

Most of the advanced software or hardware RAID setups that I've ever
seen were on Solaris & HP-UX systems, attached to SANs. Linux boxes were
mainly self-sufficient quick-setup boxes used for specific purposes.

Linux (and probably *nix :-) won't make repairs to a filesystem without
asking first, although it will happily tidy things up based on the
journal if there's been an unclean shutdown. And if it is not happy with
the root filesystem, then that means running in single-user mode when
doing the repair.

/Any/ filesystem check which does repairs is a bit disconcerting. But
when talking about "manual" or advanced repairs, I've been thinking
about things like specifying a different superblock for ext2/3/4, or
rebuilding reiserfs trees.
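
For anyone following along, the ext2/3/4 case looks something like this
(the device name is just an example):

    # list the backup superblock locations
    dumpe2fs /dev/sdb1 | grep -i superblock
    # then point e2fsck at one of them, e.g. the one at block 32768
    e2fsck -b 32768 /dev/sdb1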

This is the sort of thing that Windows' own chkdsk handles without too
many questions. That's not to say really bad filesystem problems never
happen to NTFS and require extra attention, but somehow chkdsk can ask
one or two questions at the start of the operation and carry on from
that point. It may run for hours, but it does the repairs on its own
without any further input from you.
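
For what it's worth, the whole interaction is usually no more than this
(the drive letter is just an example):

    chkdsk D: /f /r

/f fixes the errors it finds and /r also scans for bad sectors; if the
volume is in use, about the only question you get is whether to dismount
it or schedule the check for the next reboot.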

I'm not sure if that's because NTFS's design allows the repair utility
to ask simpler questions than other types of filesystems do, or because
the repair utility itself is just designed not to ask you much beyond
an initial few questions.


I think it is perhaps just that Windows chkdsk will do the best it can without bothering the user, while the Linux utilities will sometimes ask the user as they /might/ know something of help. It's a difference of philosophy, not of filesystem design.

I have, on a W2K machine, had an NTFS filesystem that chkdsk found faulty but was unable to repair - but was happy to continue using. I never had any problems with using the partition, but chkdsk always reported faults.

I would suspect it's the latter, as Windows chkdsk only has two
filesystems to be geared for, NTFS or FAT, whereas Unix fsck has to be
made generic enough to handle several dozen filesystems, plus any that
may be added at some future point without warning. They usually
implement fsck as simply a front-end app for several filesystem-specific
background utilities.


fsck is just a front-end that identifies the filesystem, then calls fsck.ext3, fsck.xfs, fsck.reiserfs, etc., as appropriate. Most of these have few choices, but some (such as fsck.reiserfs) offer many options.
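
In other words, something like this (device names are only placeholders):

    fsck /dev/sdb1               # ends up running, say, fsck.ext4 /dev/sdb1
    fsck -t reiserfs /dev/sdc1   # ends up running fsck.reiserfs /dev/sdc1

The front-end just works out the filesystem type (or is told it with -t)
and hands everything over to the matching fsck.* helper.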

I haven't tried the Linux NTFS repair programs - I have only heard that
they have some limitations compared to the Windows one.

Well, I just haven't personally encountered any major issues with them
yet. I'll likely encounter something soon and totally change my mind
about it.

I think the over-complicated RAID schemes that emerged in software RAID
were a result of performance and reliability problems with
server-controlled disks, i.e. JBODs. Hardware raid controllers could
dedicate themselves to monitoring every component inside them,
continuously. But a server has other work to do, so it can't dedicate
itself to constantly monitoring its disks. A lot of issues could
therefore arise inside a server that result in a false diagnosis of a
failure, so they came up with complex schemes that would minimize
downtime.


I don't quite agree here. There are different reasons for having
different RAID schemes, and there are advantages and disadvantages of
each. Certainly there are a few things that are useful with software
raid but not hardware raid, such as having a raid1 partition for
booting. But the ability to add or remove extra disks in a fast and
convenient way to improve the safety of a disk replacement is not an
unnecessary complication - it is a very useful feature that Linux
software raid can provide, and hardware raid systems cannot. And layered
raid setups are not over-complicated either - large scale systems
usually have layered raid, whether it be hardware, software, or a
combination.

It's not necessary: all forms of RAID (except RAID 0 striping) are
redundant by definition. Any disk should be replaceable whether it's
hardware or software RAID, and nowadays most are hot-swappable. In
software RAID, you usually have to bring up the software RAID manager
app and go through various procedures to quiesce the failed drive
before removing it.


All forms of RAID (except RAID0) are redundant - when everything is working. But if you have one-disk redundancy, such as RAID5 or a two-disk RAID1, then you are vulnerable when a disk fails or is being replaced. The nature of a RAID5 rebuild is that you stress the disks a great deal during the rebuild - getting a second failure is a real risk, and will bring down the whole array. There are countless articles and statistics available online about the risks with large RAID5 arrays. The most obvious way to avoid that is to use RAID6 - then you still have one disk redundancy while another disk is being replaced. With software raid (at least on Linux), you can add extra redundancy temporarily when you know you are going to do a replacement (due to a drive that is failing but not dead, or a size upgrade).
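
As a sketch of what I mean, for the simple raid1 case (device names made up):

    # grow a two-way mirror to three-way before pulling the suspect drive
    mdadm /dev/md0 --add /dev/sdd1
    mdadm --grow /dev/md0 --raid-devices=3
    # once the new mirror has synced, fail and remove the suspect drive
    # and drop back to a two-way mirror
    mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1
    mdadm --grow /dev/md0 --raid-devices=2

At no point during the swap are you down to a single copy of the data.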

Usually inside hardware RAID arrays, in the worst case, you'd have to
bring up a hardware RAID manager app and send a command to the disk
array to quiesce the failed drive, so it's not much different from
software RAID. But in the best case, with hardware RAID, all you have
to do is go to a front control panel on the array itself to quiesce the
failed drive - or, even better, there might be a stop button beside the
failed drive, right next to a blinking light telling you which drive
has failed.

When you replace the failed drive with a new drive, the same button
might be used to resync the new drive with the rest of its volume.


In the best case, with the best setup, replacing disks on a hardware array may be as easy as you say - take out the bad disk, put in a new one. If the system is configured to automatically treat the new disk as a spare and automatically rebuild, then it does the job straight away. In the worst case, you have to reboot your system to get to the raid bios setup (maybe the card's raid manager software doesn't run on your choice of operating system) and fix things.

With Linux mdadm, in the worst case you have to use a few mdadm commands to remove the failed drive from the array, add the new drive (hot plugging is fine), and resync. It's not hard. But if you want to have automatic no-brainer replacements, you can arrange for that too - you can set up scripts to automatically remove failed drives from the array, and to detect a new drive and automatically add it to the list of spares.
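
Concretely, the manual version is roughly this (device names made up):

    mdadm /dev/md0 --fail /dev/sdc1     # mark the dying drive failed, if md hasn't already
    mdadm /dev/md0 --remove /dev/sdc1   # take it out of the array
    # swap the physical disk, partition it to match, then:
    mdadm /dev/md0 --add /dev/sdd1      # the resync starts by itself

And mdadm's monitor mode can run a program of your choice when it sees
events like a drive failure, which is the hook for the automatic scripts
mentioned above.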

With Linux software raid, these sorts of things may involve a bit more learning, and a bit more trial and error (but you will want to do trial replacements of drives anyway, with hardware or software raid). But you can do as much or as little as you want - you are not constrained by whatever the hardware raid manufacturer thinks you should have.

Other advanced features of software raid are there if people want them,
or not if they don't want them. If you want to temporarily add an extra
disk and change your raid5 into a raid6, software raid lets you do that
in a fast and efficient manner with an asymmetrical layout. Hardware
raid requires a full rebuild to the standard raid6 layout - and another
rebuild to go back to raid5.
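
Something along these lines, assuming a 4-disk raid5 and a spare disk to
hand (device names made up; depending on the mdadm version it may also
insist on a --backup-file for the reshape):

    mdadm /dev/md0 --add /dev/sde1
    mdadm --grow /dev/md0 --level=6 --raid-devices=5

IIRC newer mdadm versions can keep the quick asymmetric layout (the
raid5 layout plus a dedicated Q-parity disk) instead of restriping to
the standard raid6 layout, which is what makes the conversion fast - and
it can be converted back to raid5 again afterwards.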

I see absolutely /no/ reason to suppose that software raid should be
less reliable, or show more false failures than hardware raid.

Well, there are issues of I/O communication breakdowns, as well as
processors that are busy servicing other hardware interrupts or just
busy with general computing tasks. Something like that might be enough
for the software RAID to think a disk has gone offline and assume it's
bad.

Nonsense. If you are likely to get IO breakdowns or trouble from too many hardware interrupts, then you are going to get those regardless of whether your raid is hardware or software. It doesn't make any difference whether the disk controller is signalling the cpu to say it's read the data, or if it is the raid controller - you get mostly the same data transfers. You have some more transfers with software raid as the host must handle the raid parts, but the overhead is certainly not going to push an otherwise stable system over to failure.

And hardware raids have vastly more problems with falsely marking disks as offline due to delays - that's one of the reasons why many hardware raid card manufacturers specify special expensive "raid" harddisks instead of ordinary disks. The main difference with these is that a "raid" disk will give up reading a damaged sector after only a few seconds, while a "normal" disk will try a lot harder. So if a "normal" disk is having trouble with reading a sector, a hardware raid card will typically drop the whole drive as bad. Linux mdadm raid, on the other hand, will give you your data from the other drives while waiting, and if the drive recovers then it will continue to be used (the drive firmware will automatically re-locate the failing sector).

<http://www.smallnetbuilder.com/nas/nas-features/31202-should-you-use-tler-drives-in-your-raid-nas>
<http://en.wikipedia.org/wiki/Time-Limited_Error_Recovery>
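
Incidentally, many ordinary drives will let you cap the error recovery
time yourself if their firmware supports SCT ERC, which gets you the same
behaviour as the expensive "raid" models (the device name is just an
example, and the values are in tenths of a second):

    # limit read and write error recovery to 7 seconds
    smartctl -l scterc,70,70 /dev/sda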

It's getting less of a problem with multi-core processors, but
there are certain issues that can cause even all of the cores to break
down and give up, such as a Triple Fault. The computer just core dumps
and restarts at that point.


Have you ever seen, heard of or read of such an event being caused by a system accessing too many disks too quickly? Barring hardware faults or astoundingly bad design, no system is going to ask for data from disks faster than it can receive it!

There was a time when hardware raid meant much faster systems than
software raid, especially with raid5 - but those days are long past as
cpu power has increased much faster than disk throughput (especially
since software raid makes good use of multiple processors).

Not really: hardware raid arrays are still several orders of magnitude
faster than anything you can do inside a server. If this wasn't the
case, then companies like EMC wouldn't have any business. The storage
arrays they sell can service several dozen servers simultaneously over a
SAN. Internally, they have communication channels (often optical) that
are faster than any PCI-express bus and fully redundant. The redundancy
is used to both increase performance by load-balancing data over
multiple channels, and as fail-over. A busy server missing some of the
array's i/o interrupts won't result in the volume being falsely marked
as bad.


There are several reasons why hardware raid is still popular, and is sometimes the best choice. Speed, compared to a software raid solution, is not one of them.

First, consider small systems - up to say a dozen drives, connected to a card in the server. The most common reasons for using hardware raid are that that's what the system manufacturer provides and supports, that's what the system administrator knows about, and that the system will run Windows and has no choice. The system is slower and more expensive than software raid, but the system administrator either doesn't know any better, or he has no practical choice.

Then look at big systems. There you have external boxes connected to your server by iSCSI, Fibre Channel, etc. As far as the host server is concerned, these boxes are "hardware raid". But what's inside these boxes? There is often a lot of hardware to improve redundancy and speed, and to make life more convenient for the administrator (such as LCD panels). At the heart of the system, there are one or more chips running the raid system. Sometimes these are dedicated raid processors - "hardware raid". But more often than not they are general purpose processors running software raid - it's cheaper, faster, and more flexible. Sometimes they will be running a dedicated system, other times they will be running Linux (that's /always/ the case for low-end SAN/NAS boxes).

From the server administrator's viewpoint, he doesn't care what's inside the box - as long as he can put data in and get the data out, quickly and reliably, he is happy. So EMC (or whoever) make a big box that does exactly that. And inside that box is what can only be described as software raid.


This is absolutely not needed with hardware raid. The processors inside
hardware raid units are doing nothing else but monitoring disks. So it
made more sense for them to simplify the raid schemes and go for greater
throughput.


The processors inside hardware raid units simplify the raid schemes
because it's easier to make accelerated hardware for simple schemes, and
because the processors are wimps compared to the server's main cpu. The
server's cpu will have perhaps 4 cores running at several GHz each - the
raid card will be running at a few hundred MHz but with dedicated
hardware for doing raid5 and raid6 calculations.

I'm not talking about a RAID card, I'm talking about real storage arrays.

Yousuf Khan
