Redundant equipment has to be monitored

While this might sound obvious, I’ve seen it neglected more than once.

The problem with redundant equipment is that nobody notices when it fails. Okay, that is pretty much the point of having redundant equipment, but if nobody replaces the failed component, you’ve just lost that redundancy.

Better servers with multiple fans, power supplies, etc. usually offer integrated diagnostics with an audible alert, which is usually enough for a small business (running MOM on your only server has limited usefulness). But smaller machines that lack redundant PSUs or fans usually don’t have embedded diagnostics. These won’t sound any alert when a disk in a RAID set fails.

On IBM servers with ServeRAID adapters, you can install the ServeRAID management program from the ServeRAID application CD (not the drivers CD; there are two separate CDs). The ServeRAID management program is backward compatible with almost all ServeRAID controllers, as long as you have the IBM driver installed (for the 7e or similar controllers, there is also an Adaptec driver which works fine, but ServeRAID management doesn’t recognize it).

ServeRAID management can be configured to send e-mail automatically when a disk fails.
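
For machines without such a tool, the same idea can be approximated with a small script run on a schedule. What follows is only a rough sketch in Python, under loud assumptions: it supposes some vendor utility can print one line per component in the form `name: state`. That report format, the list of healthy states, the function names, and the SMTP host are all hypothetical; none of this is ServeRAID output or a ServeRAID API.

```python
import smtplib
from email.message import EmailMessage

# Assumed vocabulary for "healthy"; adjust to whatever your tool prints.
HEALTHY_STATES = {"ok", "online", "optimal"}


def failed_components(report):
    """Return (name, state) pairs for every component not reported healthy."""
    failed = []
    for line in report.splitlines():
        name, sep, state = line.partition(":")
        if not sep:
            continue  # skip lines that are not "name: state"
        if state.strip().lower() not in HEALTHY_STATES:
            failed.append((name.strip(), state.strip()))
    return failed


def build_alert(host, failed, sender, recipient):
    """Build an alert mail, or return None when nothing has failed."""
    if not failed:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"[{host}] redundancy lost: {len(failed)} component(s) failed"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(f"{name}: {state}" for name, state in failed))
    return msg


def send_alert(msg, smtp_host="localhost"):
    """Hand the message to an SMTP relay (the host name is an assumption)."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
```

Keeping the parsing and the mail building separate from the actual sending means the check can be tried against a saved report before it is wired to a real mail relay.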

One Comment

  1. Nick:

    Hi,
    I was wondering: since ServeRAID v8.4 does not connect with the SNMP subsystem directly, what needs to be modified to get notifications sent to an IBM Director server? I can’t use the Director agents to monitor the hardware, but traps sent through SNMP would work. We used to use Tivoli, and this worked with v6 of ServeRAID.

    Any insight would be appreciated.
    Thank You,
    -Nick
