Date: Fri, 20 Mar 1998 14:26:44 +0000
From: "Jonathan F. Dill"
To: tpo (others DELETED!)
Subject: a good, cheap RAID

Hi guys,

Here are the numbers (all from Linux 2.0.32 custom kernels, RH 5.0+updates systems):

             -------Sequential Output-------- ---Sequential Input-- --Random--
             -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine   MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU  /sec %CPU
IDE      100  2425 93.6  6169 24.2  2997 21.5  2454 91.8 10872 24.5  73.5  1.7
RAID-5   100  2252 87.5 10720 25.6  4346 21.9  2550 94.4 18052 39.7 136.8  3.1
Cheetah  100  2890 95.6 11156 50.4  3847 29.1  2622 91.0 12845 31.2 253.7  6.6

The 4 GB Cheetah runs at 20 MHz sync on my dual-PPro. The software RAID-5 has similar performance, but is composed of 4x Maxtor Ultra-IDE disks running on a single AMD K6 system. The IDE numbers in the first row are for a single Ultra-IDE disk on the same AMD K6 system.

I have 2x 8 GB Maxtors and 2x 7 GB Maxtors; the 1 GB left over on the 8 GB disks is currently used for swap and a cold-mirrored root partition, and the remaining 4x 7 GB makes a 20 GB RAID-5.

The kicker is that the AMD K6 system cost less than $5,000. I plan to put in a second IDE controller and 4 more disks plus a spare disk and removable cartridge, which will bring the grand total to about $8,500.

I have tested shutting down the system, pulling out one of the disks, and then starting up the system. It worked surprisingly well: performance seemed sufficient that I could leave it running in degraded mode for a while if necessary. I should have done some FS benchmarks, but I didn't! I have also tested a couple of times what would happen if the system crashed without a clean shutdown of the RAID; mkraid --only-superblock did the trick. I could really use "hot" reconstruction though: running ckraid took about 2.5 hours!
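
As background on why the array keeps running with one disk pulled: RAID-5 keeps an XOR parity chunk for every stripe, so any single missing chunk can be recomputed from the surviving ones. Below is a toy Python sketch of that idea for one stripe. The chunk contents and the helper name xor_blocks are made up for illustration, and this is not meant to show how the kernel driver or ckraid is actually implemented.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized byte blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(b) != size for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# One stripe on a hypothetical 4-disk array: 3 data chunks plus 1 parity chunk.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate pulling the disk that held data[1], then rebuild its chunk
# from the two surviving data chunks and the parity chunk.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])  # prints True
```

In degraded mode every read that touches the missing disk needs this extra work across all the surviving disks, and a full reconstruction has to do it for every stripe on the array.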

When I have 8 disks in the system, I will probably leave at least 1 of the disks for hot reconstruction, or I may keep 2 separate RAID-5 arrays of 4 disks each, so the total capacity will be around 40 GB. Of course, it remains to be seen how the system will behave with multiple clients and/or heavy load. I think the next step would be to upgrade the motherboard and processor, perhaps to a dual Pentium II.

One hot tip: tuning the disk drives with hdparm made a significant improvement over the default parameters. I turned on 32-bit I/O and DMA, and then maxed out multcount and turned on unmaskirq to get a handle on all the interrupts that would occur with heavy access. This gave me an improvement of about 3 MB/sec per disk. Before this tuning, the bonnie results for the RAID-5 were more comparable to those of the single IDE drive.

A few more specs on the hardware:

CPU:         AMD K6 233 MMX
Motherboard: Asus TX97E
Memory:      2x 32 MB 60ns EDO SIMM (could use DIMMs instead, max of 256 MB)
Case:        10x 5.25" exposed bays, 2x 250 W redundant hot-swap power supply
Net:         2x 3C905 Boomerang 100baseTx running in split RX/TX mode
Video:       cheapo generic 1 MB Trident VGA
Disks:       2x 8 GB Maxtor 88455D8, 2x 7 GB Maxtor 87000D8

The disks are jumpered for C-SEL and installed in removable metal cartridges, and I'm using C-SEL cables to connect to the brackets, so if/when I swap out a disk all I have to do is jumper the new one for C-SEL and plug it in. Of course, since they're IDE, I'll have to power down for a minute to make the swap.

Other goodies:

1400 VA UPS
Fiber SHM tied to another workstation for "Out Of Band Management" (I also have lilo set up with serial options over this link, so I could log in to the other workstation, start up minicom, and e.g. bring up the RAID system in single-user mode if necessary)

Why bother with all of this? I have a low budget and a lot of systems to maintain: we're up to about 60 unix workstations and 100 PC/Mac maintained by 2 people.
This RAID will serve out IRIX share trees over NFS, shared software and utilities, software patches, TCP/IP to SMB sharing of disks and printers, etc. The RAID will be mounted over cachefs on the IRIX systems so that frequently accessed files will be cached on the local disk for faster access and decreased network load.

Also, I plan to set it up as an IMAP server and locate people's login directories on the server. The idea is that the server will be very reliable, so no matter what else is going on on the network, people will still be able to log in, check their mail, and get some work done. The login space on the RAID will be for mail and web browsing, but to encourage people to use the disk space on their own workstation, there will be a directory "work" with special links to the directory on their own workstation. When they log in, the system cshrc will attempt to relocate to their local disk via these links; otherwise the space on the RAID will act as a "failsafe", e.g. if their workstation is down for any reason.

-- 
"Jonathan F. Dill" (jonathan@carb.nist.gov)
http://www.umbi.umd.edu/~dill