Archive for the ‘Servers’ Category.

HP DL320 G5

[Image: DL320 G5 front]
The DL320 is HP’s low-end rack server, similar to IBM’s System x3250, which I reviewed earlier.

This machine was purchased for use in a colocation center as a general web and mail server for non-profit purposes, running Debian GNU/Linux 4.0. There were serious budget constraints involved, which is why the machine chosen is on the lowest end of available rack hardware. Because of that, it doesn’t have a diagnostic panel (like HP’s DL360 G5 or IBM’s x3550 and x3650 do).

[Image: DL320 G5 disks]
The machine was ordered using the new ALSO IVIS HP TopConfig configurator and was shipped assembled (the delivery time was 9 days). Most in-stock articles have built-in CD or diskette drives, which add to the price. With configured machines you can save a few bucks by leaving these out, and also get a better memory configuration. This is very important in low-end machines, because they usually have only 4 memory slots, compared to the 8 or 12 that better machines have.

Here’s the configuration:

[Image: DL320 G5 trays]

  • HP DL320 G5 with 2.13 GHz dual-core Xeon CPU
  • 2×1 GB Memory
  • 2×160GB 7.2kRPM SATA Disks

Unpacking and opening

This machine ships with the usual features known from HP: iLO 2, though without any licenses for advanced use (you can use it for a serial console that can be accessed with a Java client), and the slide-out serial number tag (an awesome idea in my opinion).

[Image: DL320 G5 mainboard]
The machine also shipped prebuilt (which is nice, as I’m probably the laziest person on earth). The hard disks in the machine look like they’re hot-pluggable, and in theory they are, but the onboard SATA controller does not support hot plugging. If you want that, you’ll need to buy a RAID controller from HP. I like this setup a bit more than IBM’s Simple Swap hard disks, but there’s no real advantage in daily use.

The components were assembled nicely, and the build quality is generally good.

Interiors

The general interior of this machine is very much the same as the x3250, with one rather interesting difference: the machine has a built-in internal USB port, with lots of free space around it. I really have no idea what you’d want to do with this (if it were a desktop machine, you could probably use it with a flash drive for Vista’s ReadyBoost technology).

[Image: DL320 G5 internal USB port]
The fans in the machine are not hot-pluggable, but they are redundant. The cabling is on par with the x3250, but there’s a slight disadvantage in extensibility. While the x3250 has a separate slot for installing specific IBM RAID controllers, there’s no such thing in the HP machine: you’ll have to use one of the PCI-E x8 slots for an HP RAID controller. I do not consider this to be a problem in daily use either, because there’s usually not much need for expansion adapters in 1U machines.

Booting the server

The onboard controller is a standard Intel AHCI SATA controller (you will need to enable RAID functionality in the BIOS and leave the RAID BIOS unconfigured; this will expose both disks to Linux using the AHCI driver). As of Debian GNU/Linux 4.0, this controller is natively supported in the default configuration. There are many options in the BIOS to save a variety of contact information. They may be useful in enterprise environments, but I didn’t play with them much.

[Image: DL320 G5 field manual]
iLO configuration works fine as usual, but netbooting the machine was awkward: it refused to load our WDS bootloader several times before finally succeeding (WDS then boots RIS, which then boots PXELinux - here’s how to do this). I have no idea whether it is just an old switch acting up or an actual problem.

Conclusion

Is this machine better than the x3250? Make up your own mind. The only real-world advantage I’ve seen is that iLO 2 is included in the machine and allows you to access a serial console for free. The iLO Advanced option, which allows KVM access (and a variety of enterprise integration features), is quite expensive at about 450 CHF. The x3250 does not ship with an RSA II, but the RSA II is only 250 CHF, a lot less than the KVM access license.

[Image: DL320 G5 cabling]
When you’re using Windows, you can use EMS to make use of the serial port; on Linux or other Unix-based OSes you can usually redirect both the login console and the kernel messages to a serial port (and of course you can redirect the BIOS).
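On Linux, redirecting the kernel messages usually comes down to a couple of kernel command-line parameters. Here is a minimal sketch, assuming the serial port shows up as ttyS0 at 115200 baud (both values are assumptions; the actual port depends on the BIOS and iLO settings). The helper name `serial_console_args` is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: prints kernel parameters that send kernel messages
# to both the local screen and a serial port.
# $1 = serial device (assumed default: ttyS0)
# $2 = baud rate     (assumed default: 115200)
serial_console_args() {
    port="${1:-ttyS0}"
    baud="${2:-115200}"
    # The last console= entry becomes /dev/console, so the serial port goes last.
    printf 'console=tty0 console=%s,%sn8\n' "$port" "$baud"
}

serial_console_args
```

The printed string would be appended to the kernel line in the boot loader configuration. For a login prompt you additionally need a getty on the same port (on a sysvinit system, for example, via /etc/inittab).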

I liked the disks, which seem to handle much more nicely than the Simple Swap SATA disks from IBM.

IBM’s ServeRAID Manager may send spurious messages after an IP change

[Image: ServeRAID Manager 8.40]
IBM’s ServeRAID Manager version 8.0 does not handle IP changes of the host machine cleanly. In my case, it continued to send information messages to the ServeRAID port (34571) on the old IP address. See the screenshot to the right for where to change this.

On this topic, I’ve found a very interesting link, IBM’s ServeRAID Reference, with lots of pictures and detailed specifications of each controller.

Layer One sucks - they still have power outages

Layer One sucks. Big time.

They’ve had power outages before, and now again. However, it seems that they didn’t change anything. This is the fifth power outage, and we’ve been there for 1.5 years at most.

Today, there was a smaller power outage, taking down only one of the two power lines we had. But it still lasted for several hours, and both the recovery and the information provided were incompetent and slow. Don’t go to Layer One. Their power grid sucks as much as their service and their information policy.

USB floppy drives during Windows Server 2003 R2 setup

So you’ve bought a new rack server, like the IBM System x3250. But your boss or your customer was too cheap to buy an RSA II card. And now you need to install Windows Server 2003.

This is usually the part where the fun begins. Newer servers do not have a floppy drive, but besides RIS or remastering CDs, a floppy is the only way to load drivers into Windows Server 2003 setup.

Getting a USB floppy drive is no big deal: you connect it to the machine, it boots, you press F6, select the storage adapter driver, format your hard disks, and then setup asks for the floppy again and again. Bummer.

The problem is that the first part of setup (loading the mass storage driver) is not handled by Windows, but by the BIOS’s floppy emulation. The later part, after formatting the hard drive, is handled by Windows, and some USB floppy drives are not recognized by its built-in USB storage drivers.

In my case, I had an Iomega USB floppy with a built-in card reader (don’t ask). I used Device Manager to find out the vendor and product ID of this USB floppy.

I opened the txtsetup.oem supplied with my mass storage driver and modified the section belonging to that driver.

I added the following line directly after the line for the SCSI adapter itself:
id = "USB\VID_08BD&PID_1100", "usbstor"

I had no idea if this would work at all, but it did.

For your reference, I’ve included my txtsetup.oem, which works with Iomega USB floppy drives.
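For other USB floppy drives, the same kind of line can be built from the vendor and product ID shown in Device Manager. The following is only a sketch: the helper name `usb_floppy_id_line` is made up, and I have only tried the resulting line with the Iomega drive mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: builds a txtsetup.oem id line for a USB floppy drive.
# $1 = USB vendor ID, $2 = USB product ID (hex digits, as shown in Device Manager)
usb_floppy_id_line() {
    # Hardware IDs are conventionally written in upper case.
    vid=$(printf '%s' "$1" | tr 'a-f' 'A-F')
    pid=$(printf '%s' "$2" | tr 'a-f' 'A-F')
    # '\\' in the printf format produces a single backslash.
    printf 'id = "USB\\VID_%s&PID_%s", "usbstor"\n' "$vid" "$pid"
}

usb_floppy_id_line 08bd 1100
```

Called with the IDs of my drive, this prints the same line that I added to txtsetup.oem by hand.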

SERR/PERR errors on IBM’s System x3650

After updating all the firmware on a newly delivered IBM System x3650, I installed Windows Server 2003 R2. The machine worked fine, but crashed mysteriously about 3 hours into operation, logging a RAID failure in the RSA.

When looking further through the RSA error logs, I found this error occurring multiple times:

Unknown SERR/PERR detected on PCI bus Chassis#=NA Slot#=0 Bus#=0 Dev.ID=0x25e3 Vend.ID=0x8086 Status=0x0 DevFun#=0xff

I called IBM support, and they told me that I should power cycle the machine after a firmware update. I did that and then continued to set up the machine. It’s been working flawlessly under heavy load for the past 3 months.

I’m going to remember this for the future: after a firmware upgrade on a server, do a power cycle.

Disk IO performance is dependent on the number of disk arms

If you already know where I’m going with this after reading the subject, you can stop reading now.

As hard disks get bigger and bigger, servers in small business environments are usually set up with too few disk arms to satisfy performance needs.

The problem is quite simple: a standard 36GB 2.5″ SAS hard drive can read data at rate x, and can do y IOPS.

A standard 72GB 2.5″ SAS hard drive can read data at a rate only slightly higher than x, and can still do only y IOPS.

As you can see, disks get bigger, but they do not really get faster. If you need more IOPS, you need more disks.

If you have a legacy system with a considerable number of disk arms (more than 10), each at 4 or 8GB capacity, and migrate this setup to a new system with a RAID1 over two 147GB disks, you will get _WORSE_ performance than the old system.
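A quick back-of-the-envelope calculation makes the point. The per-disk IOPS figures below are assumed, illustrative values and not measurements, and the model simply multiplies the number of disks by the IOPS per disk.

```shell
#!/bin/sh
# Rough estimate: aggregate random IOPS scales with the number of disk arms.
# $1 = number of disks, $2 = IOPS per disk (assumed figure, not a measurement)
total_iops() {
    echo $(( $1 * $2 ))
}

# Illustrative, assumed figures only:
old=$(total_iops 12 80)    # twelve small legacy disks at 80 IOPS each
new=$(total_iops 2 150)    # two large modern disks at 150 IOPS each
echo "legacy array: $old IOPS, new RAID1 pair: $new IOPS"
```

Even if each new disk is assumed to be almost twice as fast as an old one, the two-disk mirror ends up far behind the old array. Real numbers depend on the drives, the RAID level and the workload, so treat this as an order-of-magnitude check only.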

And if we look at consumer hard drives, with 750GB, 1TB per disk, the performance gets even worse.

This is usually not a problem in more professional environments where systems are purchased by requirements, but in Small Businesses systems are usually purchased by the amount of money that is around for them.

Never forget about the need for disk arms.

Buying tape drives for small businesses

Backups are very important, and the media and technology used for them are even more important. While Disk to Disk is the best form of Backups for home users, tapes still make a lot of sense in companies, because they make it a lot easier to get something off-site for disaster recovery.

However, the tape technologies available for x86 servers are numerous. On the other hand, choosing tape drives is as easy as it gets: the more expensive they are, the better they are.

I only recommend one type of tapes to customers - LTO. LTO tapes and drives are among the fastest and most reliable on the market. They are more expensive than cheaper alternatives like the VXA drives, but they are trouble free, which can’t be said about VXA drives.

LTO2 (200GB) half-height external drives can be had for about 2’500 CHF from Tandberg. Bought directly from HP/IBM, they are a bit more expensive, about 3000-3500 CHF. Do not buy internal tape drives that are not from your server manufacturer, as this could cause trouble down the road.

LTO3 drives are a bit more expensive, but pack 400GB instead of 200GB. If that’s still not enough, you should consider purchasing a small tape library. LTO2 libraries with 8 tapes can be had from about 8000 CHF, which is quite a bargain.

Remember, you can’t extend the capacity of your tape drive unless you have a library. So if you are considering an LTO2 drive but need more than 200GB of storage, you should buy LTO3. If you think you need more than 400GB of storage, buy an LTO3 tape library.
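The rule of thumb above fits in a few lines of shell. This is just a sketch of my own recommendation using the capacities mentioned in the text (200GB for LTO2, 400GB for LTO3); the function name `pick_tape` is made up.

```shell
#!/bin/sh
# Sketch of the rule of thumb above.
# $1 = amount of data to back up, in GB (whole number)
pick_tape() {
    if [ "$1" -le 200 ]; then
        echo "LTO2 drive"
    elif [ "$1" -le 400 ]; then
        echo "LTO3 drive"
    else
        echo "LTO3 tape library"
    fi
}

pick_tape 150
pick_tape 350
pick_tape 900
```

When in doubt, size for the data you expect to have in a few years rather than for what you have today.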

I’ve had experience with DLT (usually too small), VXA (unreliable), and 4mm tapes (unreliable). What I’ve never worked with are Sony’s AIT tape drives; I would be interested to hear some experiences with those drives.

Redundant equipment has to be monitored

While this might sound pretty obvious, I’ve seen it neglected more than once.

The problem with redundant equipment is that nobody notices when it fails. Okay, this is pretty much the point of having redundant equipment, but if nobody replaces the failed component, you’ve just lost that redundancy.

Better servers with multiple fans, power supplies, etc. usually offer integrated diagnostics with an audible alert, which is usually enough for a small business (running MOM on your only server has limited usefulness). But smaller machines lacking redundant PSUs or fans usually don’t have embedded diagnostics. These won’t sound any audible alert when a disk in a RAID set fails.

On IBM servers with ServeRAID adapters, you can install the ServeRAID management program from the ServeRAID application CD (not the drivers CD; there are two of them). The ServeRAID management program is backward compatible with almost all ServeRAID controllers, as long as you have the IBM driver installed (for the 7e or similar controllers, there is also an Adaptec driver which works fine, but ServeRAID management doesn’t recognize it).

ServeRAID management can be configured to send mails automatically in case of a disk failure.

Virtual Server 2005 R2, Windows Server 2003 and Broadcom NetXtreme II cards

Interesting issues with Microsoft’s Virtual Server.

A new IBM x3650 with two Broadcom NetXtreme II cards had been running Windows Server 2003 flawlessly for a few months.

After installing Virtual Server 2005, everything went haywire. Some machines were still able to contact the server, some were not. It looked like something was horribly broken, and at first I had no idea why something like this could happen.

After searching the web, I found a few references to this and similar problems with newer NICs and Virtual Server.

The Broadcom NetXtreme II seems to have a specific problem with Virtual Server 2005 that is related to IPMI. There is a fix available from Broadcom:

IPMI disabling tool [Mirror]

Just a short network interruption, no restart necessary.

But there are other problems with modern network cards and Virtual Server 2005 (and possibly VMware’s offerings, but I don’t know about that).

There’s a KB entry which talks about disabling checksum/segmentation offloading when using Virtual Server 2005.

Creating simple graphs using rrdtool

[Image: rrdtool graph for temperatures]
As discussed in the previous post, you can gather temperature data from RSA II or iLO cards using SNMP quite easily.

While the data itself can be good enough to make a decision, executives in a company always like nice diagrams. So my first try was to load the CSV-like datafile generated using said script into Excel, and make a diagram out of it. But Excel is restricted to 255 parameters per axis, which was severely limiting.

I’ve been using Cacti for quite some time, but wasn’t willing to implement it here because we’re mostly a Windows shop, and my plan was to integrate the Linux boxes into Operations Manager 2007. Cacti uses Tobi Oetiker’s rrdtool to create the graphs.

Creating graphs using rrdtool is quite easy, actually. I wrote a simple script that handled this:

makerrd

Creates the appropriate rrd file. Replace the unix timestamp as appropriate. The last value on the RRA lines is the number of values saved into the data file.

#!/bin/sh
rrdtool create test.rrd           \
           --start 1176465000     \
           -s 300                 \
           DS:temp:GAUGE:600:U:U  \
           RRA:AVERAGE:0:1:5000
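To see how much history such an RRA line keeps, multiply the step (300 seconds) by the steps per row (1) and the number of rows (5000). A small sketch of that arithmetic, with a made-up helper name `rra_seconds`:

```shell
#!/bin/sh
# How much history does one RRA hold?
# $1 = step in seconds, $2 = steps per row, $3 = number of rows
rra_seconds() {
    echo $(( $1 * $2 * $3 ))
}

secs=$(rra_seconds 300 1 5000)
# Integer division, so the number of days is rounded down.
echo "$secs seconds, about $(( secs / 86400 )) days"
```

With the values used here, the file holds a bit more than two weeks of 5-minute samples. Increase the last value on the RRA line if you need a longer history.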

inputrrd

Loads the data from the simple CSV-like file into the RRD file. The more elegant approach would be to load the data directly from SNMP into the RRD database, but I’m no programmer.

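#!/bin/zsh
# Read "timestamp;temperature" lines from the file "machine"
while IFS=';' read -r timestamp temp ; do
        # Strip the fractional part of the temperature
        temp=`echo $temp | sed 's/\..*//'`
        rrdtool update test.rrd ${timestamp}:${temp}
        if [ $? != 0 ] ; then
                echo "rrdtool failed" >&2
        fi
done < machine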
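The conversion of one input line can also be done with plain shell parameter expansion instead of sed. This is only a sketch, assuming each line looks like `timestamp;temperature`; the helper name `csv_to_update` is made up.

```shell
#!/bin/sh
# Hypothetical helper: turns one "timestamp;temperature" line into the
# "timestamp:value" argument that rrdtool update expects, dropping decimals.
csv_to_update() {
    ts=${1%%;*}       # everything before the first ';'
    temp=${1#*;}      # everything after the first ';'
    temp=${temp%%.*}  # strip the fractional part, like the sed call does
    printf '%s:%s\n' "$ts" "$temp"
}

csv_to_update '1176465300;23.50'
```

The result could be passed straight to rrdtool update, which saves starting one sed process per line.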

makegraph

Creates a graph from the data in the rrd file. The HRULE lines draw horizontal threshold lines, in this case at 35C and 30C.

#!/bin/sh
rrdtool graph temp.png                       \
        --start 1176465385 --end `date +%s`  \
        DEF:mytemp=test.rrd:temp:AVERAGE     \
        LINE2:mytemp#0000FF                  \
        HRULE:35#FF0000                      \
        HRULE:30#FFA500

See the created graph to the right. Of course, rrdtool has many more options and can create much nicer graphs.