Re: building on your own a large data storage ...



In comp.sys.ibm.pc.hardware.storage Andre Majorel <cheney@xxxxxxxxxxxxxxx> wrote:
["Followup-To:" header set to comp.sys.ibm.pc.hardware.storage.]
On 2007-07-04, lbrtchx@xxxxxxxxxxx <lbrtchx@xxxxxxxxxxx> wrote:

I need to store a really large number of texts and I (could) have a
number of ATA100 S.M.A.R.T.-compliant hard drives, which I would like
to use to somehow build a large and safe (RAID-5?) data store
~
* I will have to use standard (and commercially available (meaning
cheap ;-))) x86-based hardware and open source software
~
* AFAIK you could maximally use 4 hard drives in such boxes

On a motherboard with 2 IDE ports, you cannot make a 4-disk
RAID-5 array because doing I/O on two devices on the same IDE
port gives poor performance.

Well, you can, but expect no more than, say, 10MB/s.

You could make two RAID-1 arrays each having one disk on the
primary IDE port and one on the secondary IDE port. Performance
will still suck when you do I/O on both arrays at the same time
but when one array is idle, the other will work OK.

This is of course not as good as RAID-5 from a disk space/euro
POV.
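
To put rough numbers on the disk space trade-off above, here is a minimal sketch. The 250 GB disk size is purely an assumption for illustration; the arithmetic is just the usual rule that a mirror keeps one disk's worth of data and RAID-5 loses one disk's worth to parity.

```python
def usable_capacity_gb(level, disk_gb, n_disks):
    """Usable capacity of one array built from n identical disks."""
    if level == "raid1":
        # Every member holds a full copy, so only one disk's worth is usable.
        return disk_gb
    if level == "raid5":
        if n_disks < 3:
            raise ValueError("RAID-5 needs at least 3 disks")
        # One disk's worth of space is consumed by parity.
        return disk_gb * (n_disks - 1)
    raise ValueError(f"unsupported level: {level}")


DISK_GB = 250  # hypothetical disk size, not from the thread

# Four disks as two 2-disk RAID-1 arrays vs. one 4-disk RAID-5 array.
two_mirrors = 2 * usable_capacity_gb("raid1", DISK_GB, 2)
one_raid5 = usable_capacity_gb("raid5", DISK_GB, 4)

print(f"2 x RAID-1: {two_mirrors} GB usable")
print(f"1 x RAID-5: {one_raid5} GB usable")
```

With four disks, the two mirrors give half the raw space and the RAID-5 gives three quarters of it.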

I added a Promise IDE PCI (Ultra 100, I believe) controller and used
one disk per IDE channel. That works fine.

Should I go for ATA or SATA drives and why?

SATA is better because 1) it doesn't have the master/slave
issues of IDE, i.e. if you have 4 SATA ports on your
motherboard, you *can* do a 4-disk RAID-5 array and 2)
motherboards with 8 SATA ports are easy to find.

And the cables are better and you can get 4 port and 8 port
SATA controller cards.

* heat dissipation could become a problem with so many hard drives

I would not want to do it without adequate ventilation.

Airflow from the outside to each disk is needed, unless
you do a very careful cooling design.

* I need a reliable and stable power supply

Fortron FSP-400-60GLN works for me. We have had issues with
Antec.

Fortron, Antec, both not the best. I had a 500W Fortron die on me with
60% of the maximum load after a power outage, which 24 El-Cheapo PCs
on the same power-rail survived. I recommend Enermax. Very well
engineered and with good reserves.

People in the know use software based RAID. Could you give me links
to these kinds of discussions?

The archives of the linux-raid mailing list (the administration
tool is called mdadm).

There is also a HOWTO, I believe. Anyway, Linux software RAID is very
reliable and relatively easy to administer (if you know what you are
doing). The HOWTO is a bit outdated, so you may also need the
mdadm man page, but it will basically tell you most things you need:

http://tldp.org/HOWTO/Software-RAID-HOWTO.html

What would be my weak/hotspot points in my kind of design?

For me, the time was spent on
- understanding mdadm,
- understanding the trade-offs (partitioning an array of disks vs.
making an array of partitions, using LVM or not, optimum
granularity) and

I strongly suggest using partitions of type 0xfd, because then the
kernel will auto-assemble the array on system start. For complete
disks you need some start-script or other, which also means you cannot
have the root partition on the array. In addition, these start-scripts
are sometimes unreliable. I found that LVM just adds unnecessary
complexity.

Also, you may want to have different RAID-sets on your disks.
One thing I used for a long time was the following:

Disk 1: 10GB 1/2 RAID 1 for system, rest 1/4 RAID 5
Disk 2: 10GB 2/2 RAID 1 for system, rest 2/4 RAID 5
Disk 3: 9.9GB 1/2 RAID 1 for home, rest 3/4 RAID 5
Disk 4: 9.9GB 2/2 RAID 1 for home, rest 4/4 RAID 5

The RAID 5 is a shared data partition. I also put a
swap partition on disks 3 and 4 (100MB each). You
cannot do this when using full disks.
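
A layout like the one above could be written down as a small table and turned into mdadm create commands. This is only a sketch: the device names (/dev/hda, /dev/hdc, /dev/hde, /dev/hdg, i.e. one disk per IDE channel), the partition numbers and the md device names are assumptions, and the script only prints the commands rather than running them.

```python
# Hypothetical mapping of md devices to (RAID level, member partitions).
# Partition 1 on each disk is the small RAID 1 member, partition 2 the
# RAID 5 member; swap partitions are left out of this sketch.
ARRAYS = {
    "/dev/md0": (1, ["/dev/hda1", "/dev/hdc1"]),  # system
    "/dev/md1": (1, ["/dev/hde1", "/dev/hdg1"]),  # home
    "/dev/md2": (5, ["/dev/hda2", "/dev/hdc2", "/dev/hde2", "/dev/hdg2"]),  # shared data
}


def mdadm_create_command(md_device, level, members):
    """Build the argument list for creating one array with mdadm."""
    return [
        "mdadm", "--create", md_device,
        f"--level={level}",
        f"--raid-devices={len(members)}",
        *members,
    ]


commands = [
    mdadm_create_command(md, level, members)
    for md, (level, members) in ARRAYS.items()
]

# Print only; review the commands before running anything as root.
for cmd in commands:
    print(" ".join(cmd))
```

The member partitions would need to be set to type 0xfd beforehand for the kernel auto-assembly mentioned above.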

- hardware (how to fit 8 or more disks in a PC case with decent
ventilation).

Any suggestions of the type of boxes/racks I should use?

3ware make 3-disks-in-2-5.25"-spaces trays. They are expensive
and the fans they use die after about a year. When a fan goes
bad, the tray helpfully warns you about it by beeping loudly and
constantly. The fans are not the easiest to find (60 mm or some
such). Be prepared to hear a lot of beeping.

I made my own trays. It was a lot of work and they look ugly but
it was cheap and they do the job. The ventilation is superior to
commercial trays (a 120 mm fan that moves a lot of air quietly and
reliably).

There is a 4-in-3 mounting with a 120mm fan made by Coolermaster.
Quite cheap, about 22 Swiss Francs, which is something like
15 USD/EUR. Ugly, but cools well and perfect for making disk-packs
that fit in standard PC cases. I have one of these in a server
that has been running for something like 3 years now and the fan is
still fine. It does fit in 3 standard 5 1/4" bays if you remove the
plastic side-rails. They are just screwed on. Mounting holes also match
any standard case. The fan should blow in outside air directly.

Here is a link (the coolermaster website is broken both
in Opera and Firefox....). Sorry, it is German:

http://www.pcp.ch/Cooler-Master-4-in-3-Device-Modul-1a12170083.htm


One very important thing about RAID that too many people
overlook: don't make an N-disk array from N disks of the same
make and model bought the same day. Our sysadmin at work did and
both drives on a RAID-1 array failed within days of each
other...

Hehe. For high-reliability applications that is very good advice.
You may even want a 3- or 4- way RAID 1 (which Linux can do)
for these with 3 or 4 different disks. For ordinary use, I
would say it is enough to have a cold spare ready and rebuild
the array immediately. You can also take the array down until
you have the spare. But don't continue to use it in degraded
state.
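
To notice a degraded array in the first place, one option is to look at /proc/mdstat, where each array reports expected vs. active members as something like [4/3]. The sketch below parses such text and lists arrays with missing members. The sample text is made up for illustration, and mdadm's own monitor mode can do this job for you; the code only shows what "degraded" looks like.

```python
import re

# Made-up sample in the usual /proc/mdstat layout: md2 has lost one member.
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid5]
md2 : active raid5 hde2[2] hdc2[1] hda2[0]
      702938880 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

md0 : active raid1 hdc1[1] hda1[0]
      10482304 blocks [2/2] [UU]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return names of arrays with fewer active members than expected."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line carries "[expected/active]", e.g. "[4/3]".
        status = re.search(r"\[(\d+)/(\d+)\]", line)
        if current and status:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                degraded.append(current)
            current = None
    return degraded


print(degraded_arrays(SAMPLE_MDSTAT))
```

On a real system the text would come from reading /proc/mdstat instead of the sample string.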

I also recommend running a full SMART selftest on the disks every 14
days or so and to have email or text-message alerting both on RAID and
SMART monitor events.
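
The 14-day self-test schedule can be sketched as follows. The device names and dates are invented for illustration, and the script only builds the smartctl command lines (a long self-test is started with "smartctl -t long"); it does not run them or send any alerts.

```python
from datetime import date, timedelta

SELFTEST_INTERVAL = timedelta(days=14)


def selftest_due(last_test, today, interval=SELFTEST_INTERVAL):
    """A disk is due if it was never tested or the interval has passed."""
    return last_test is None or today - last_test >= interval


def selftest_commands(last_tests, today):
    """Build one long self-test command per disk that is due."""
    return [
        ["smartctl", "-t", "long", device]
        for device, last in sorted(last_tests.items())
        if selftest_due(last, today)
    ]


# Hypothetical record of when each disk last ran a full self-test.
last_tests = {
    "/dev/hda": date(2007, 6, 18),
    "/dev/hdc": date(2007, 7, 1),
    "/dev/hde": None,
}

commands = selftest_commands(last_tests, today=date(2007, 7, 4))
for cmd in commands:
    print(" ".join(cmd))
```

In practice smartd can schedule self-tests and send mail by itself; the sketch is only meant to make the schedule logic explicit.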

And RAID is no substitute for backup, of course. It just makes
the event when you need your backup less likely.

Arno


