IBM System x3650

A few weeks ago, I deployed my first IBM x3650. In general, I think the x3650 is a very nicely made machine. I’m not going to talk about features that are pretty standard in all mid-range IBM machines, like light path diagnostics.
Here’s the configuration I ordered:

  • System x3650 Xeon 2GHz DC, with 2x512MB Base Memory, 2.5″ SAS Open Bay
  • 2GB additional Memory
  • RSA II slimline
  • 4x 73GB 2.5″ SAS Disks
  • Redundant PSU Kit
  • PCI-X Riser Card
  • PCI-X U320 SCSI Adapter

Unpacking and opening

IBM System x3650 Box
The server arrived on a standard wooden pallet, nicely secured in the box, with styrofoam padding around it. The box also contained a UPS power cable, a detailed manual for installing options, and of course the rack mount kit with cable tray.

After removing the server from the packaging, which was rather easy to do, I inspected the machine for damage or faults. There were none to be found. One interesting thing was that the 2.5″ SAS disk blanks don’t sit very firmly in the system. It didn’t look good, but it’s not a real problem.

One of the first things you will notice is that this server no longer has PS/2 ports. This might be a problem if you have an older KVM switch in operation.

Installing options

IBM System x3650 Open, Board
After opening the server, I found the usual IBM color codes for hot-plug components and touch points everywhere. There is a detailed diagram of the server on the inside of the upper lid. Near the CPU/RAM is a table that shows valid memory configurations, which is very nice when you have to upgrade memory.

It’s very nice to see that IBM keeps the in-server documentation at a very high level. For almost every part, there is a short, concise description of how to maintain, configure or remove it. This is especially helpful if you experience a problem and have to swap out a component.

IBM System x3650 Memory
Installing memory into the system was a breeze. Went without any problems. As you can see, the x3650 can take up to 12 sticks of memory, also allowing for highly redundant memory with features like Memory Mirroring and Memory Spares. I’m not going to waste my memory on such a feature, though.

I also had to replace the internal PCI-E riser card with a PCI-X riser card, because IBM doesn’t have any PCI-E SCSI adapters. This wasn’t a problem to do either.

IBM System x3650 Fans
Together with the redundant power supply, there also comes a supply of redundant fans. The system ships with 5 hot plug fans installed, and the redundant power supply kit comes with 5 extra fans, for a total of 10. They aren’t even that loud, since they usually idle at around 40%. The fans have a very slick spring mechanism, which makes them pop up in their casing to allow for easy removal.

Installing the RSA card was a bit tricky. It didn’t click into the board as easily as I thought it would. I had to apply way more force than I had to on previous servers. It went in fine though, and is now working without a hitch.

Booting the server

The hassles begin when booting the server for the first time. Luckily IBM has redesigned its server support pages, allowing quick and easy access to all firmware updates at once. As usual, you need to flash a custom firmware to the RSA card to get proper support. This time around, I also had to update the firmware on all the SAS disks, which was quite a hassle: the update didn’t play well with the RSA II remote CD drive, so I had to use the local one.

Installing Windows Server 2003 R2 was uneventful and went fine without any trouble.

Summary

Another good x86 server from IBM that I can only recommend. If you’re interested in more server reviews like this one, please leave a comment. Currently, the only other machine I could write about is the IBM System x3250.

Oh, and if you’re interested in buying an IBM System x3650, give my current employer DATALINE AG a call – we’re an IBM Premier Business Partner.

Why you should buy a service processor card

Every major server manufacturer offers some sort of service processor for their servers. I’ve got experience with IBM’s RSA II cards, and with HP’s iLO. Other server manufacturers like Dell have similar solutions, but I’m not familiar with them.

These things aren’t usually that costly (around 500 CHF), and allow you to power the server on and off remotely – this comes in quite handy if the server is at a remote site. You can also access the local keyboard and mouse remotely; this usually requires USB support in the OS (not a problem anymore, I think). Both iLO and RSA II support remote KVM and remote floppy (for upgrading the BIOS or similar tasks). They also both support secure remote access through SSH and HTTPS.

Another very important thing is access to the system error log, which is otherwise usually only visible through the BIOS or the monster management applications (Insight and Director).

There are some key differences between iLO and RSA II:

IBM’s RSA II

IBM’s RSA II slimline is an add-in card, usually located directly on the motherboard and connected through a proprietary connector. It allows remote alerts through SMTP, and comes with an outdated-looking web interface, but it offers all the usual features like remote KVM, system log display, remote storage, integration into LDAP directories, etc. The ethernet connector is usually integrated into the main system board, but only active with an appropriate RSA card installed. There’s also a non-slimline version of the RSA II card, which offers a bit more functionality, but I’ve never used one of them. RSA II slimline is a bit cheaper than the iLO advanced license.

HP’s iLO

HP’s iLO, integrated into all better HP servers, doesn’t cost a penny in the standard version, and is sold with the server. No need to install hardware. You can activate features like remote KVM and LDAP directory integration using license keys. This approach is in my opinion a lot better than IBM’s, because you do not need to purchase and install additional hardware. What iLO can’t do is send notifications by email. Of course, SNMP is supported, but smaller businesses might lack the infrastructure for SNMP traps.

As you can see, both products have their own advantages and disadvantages. I think IBM should polish the look of its RSA web interface a bit, and HP should add alert support through email. Both products lack time sync through SNTP, for some reason. Maybe they use the system’s internal clock, but I wasn’t able to find much about this topic (I didn’t look very far, either).

Both HP and IBM make excellent servers – IBM seems to be a bit slower in technology adoption than HP, though they are leading in other fields like blade servers.

Backing up your small business

Backups are nothing new – everyone, even if they’re not directly affiliated with IT, knows this word.

Unfortunately, the term “Backup” doesn’t actually describe any concrete measure. One of the important things is to know for sure what you want to protect yourself against with a backup – this will heavily influence your choice of available solutions.

So, what are possible things that could happen to or in your business, which would require a restore?

  • Accidental deletion
  • Accidental modification
  • Disk crash
  • Hardware crash
  • Software crash
  • Malicious deletion
  • Malicious modification
  • Environmental disaster, destroying the whole building

I’ve ordered the items by how often I’ve seen them happen. Note that this might be different for bigger businesses.

So how do we protect our business against accidental deletion or accidental modification? When you’re using Windows, the easiest way to guard against this is a feature called Shadow Copies. Shadow copies are nothing more than volume snapshots with a nice GUI around them, and a way to access them over the network. Once you have this implemented, you won’t want to be without it. It allows users to recover deleted files on their own, quickly. It also allows you to recover modified files. You can set the timing for snapshot creation through the usual Windows scheduler. Shadow copies just use disk space – implement them now if you haven’t already.

The next step on the list is a disk crash. This one is easy – just read this post about hardware redundancy. However, what do you do if two disks fail at once, or if the RAID controller fails?

Please note that neither Shadow copies nor RAID replace a backup – but they are part of your backup strategy.

Complete hardware crashes rarely happen. And if they do, they usually don’t destroy your data. The best safeguard against hardware crashes is a maintenance contract or service pack on the machines – no matter what is broken or missing, someone will come over and replace the part. No backup necessary.

The worst that could possibly happen is a software crash – a bug in the filesystem driver destroying all your data, or leaving your SQL database in limbo. You will need a complete backup for that, and this is where it starts to get expensive. You can do complete backups to disk (usually to a SAN or NAS), which is nice for a bare metal restore, but has other problems.

What about malicious deletion or malicious modification? They could have happened months before somebody else works on the data and notices the problem. This is why you need backups that go back quite some time. You can buy a lot of disk space for your SAN or NAS, but that would get expensive.

And what about an environmental disaster, destroying the company building and everything that was in it? You will need some way to keep your data off site for that. You can move hard disks off site, but they are quite fragile. Replicating the data off site could be a solution, but has other problems.

The easiest way to deal with the latter three issues is tape backup – I’ve seen many people laugh at tape backups, but they are a great solution to many problems. LTO3 tape drives are fast, really fast. They also pack quite an amount of storage, with 400GB native capacity and a theoretical maximum of 800GB with compression. You can get 500-650GB on them without problems. This allows you to do daily full saves and to store some of these full saves at an off site location. If you need more than one tape to save all your data, you can use tape changers. They were expensive once, but now they come in at about 8000 CHF.
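To get a feeling for whether a single tape is enough, the capacity figures above can be turned into a quick calculation. This is only a rough sketch: the function names and the example data sizes are my own assumptions, and the effective capacity per tape depends entirely on how well your data compresses.

```python
import math

# Capacity figures taken from the text above.
LTO3_NATIVE_GB = 400       # uncompressed capacity per tape
LTO3_REALISTIC_GB = 500    # lower end of the 500-650GB range mentioned above


def tapes_needed(data_gb: float, effective_gb_per_tape: float) -> int:
    """Return how many tapes one full save of data_gb needs."""
    if effective_gb_per_tape <= 0:
        raise ValueError("tape capacity must be positive")
    if data_gb <= 0:
        return 0
    return math.ceil(data_gb / effective_gb_per_tape)


def tapes_for_rotation(data_gb: float, effective_gb_per_tape: float,
                       full_saves_kept: int) -> int:
    """Return the total number of tapes if several full saves are kept."""
    return tapes_needed(data_gb, effective_gb_per_tape) * full_saves_kept
```

With a hypothetical 450GB of data, `tapes_needed(450, LTO3_REALISTIC_GB)` gives one tape, while planning with the native capacity (`LTO3_NATIVE_GB`) gives two, which is the point where a tape changer starts to make sense.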

Tape backups are also an easy concept to explain to a non-IT person, because they involve physical objects.

You could also implement a multi-tiered infrastructure, backing up to disk first, and to tape later, which is what many enterprises do. But usually the complexity involved in such a setup by far outweighs its advantages.

Hardware redundancy in Small Businesses

When talking about hardware, the main difference between a “PC” and a “Server” is the amount of hardware redundancy the manufacturer has incorporated into its design.

  • Disk redundancy
    Also called RAID (almost everywhere) or Disk Protection (in the System i world). Disk redundancy ensures that the loss of a single disk drive doesn’t result in loss of data. There are many ways in which RAID can be implemented – from purely software solutions provided by the operating system (like in Windows Server and all Linux distributions), through part BIOS/part driver solutions (SOHO/consumer equipment, because Windows clients lack software RAID), to full blown hardware solutions incorporating a co-processor for checksum calculation (for RAID5/6) and a battery backed write cache.
    I would go as far as to say “if it doesn’t have some form of disk protection, it’s not a server”. With software RAID, all you need are a few more disks.
  • Memory protection
    Mostly called ECC / Checksum Memory / ChipKill memory. ECC ensures that defective memory can’t cause silent data corruption or system crashes. I don’t know of any server manufacturer which doesn’t ship their servers with ECC memory – I consider it an absolute must. ECC can usually recover from single bit errors (and write to the logfiles) and it can halt the system in case of a multiple bit error (and write to the management CPU log).
    There are newer technologies out like Memory Mirroring, which allows whole banks of memory to fail and the system to recover without any downtime. This latter feature usually needs twice the memory, and is thus prohibitively expensive.
  • Power redundancy
    Multiple power supplies are available as soon as you leave the lowest priced server segment. Power redundancy is a good thing for a variety of reasons. Having a day of downtime because of a blown power supply is no fun – a second power supply can help. A second power supply also helps you if your UPS has a problem – this is actually the most common situation where a second PSU has helped me. With broken UPSs happening more often than power outages (at least here), a second PSU is insurance that a defective UPS can’t bring down your production server. Of course this doesn’t work if you plug the second PSU into the same UPS – plug it into the wall, or into another UPS.
  • Cooling redundancy
    Some server manufacturers ship their secondary PSU together with a redundant cooling kit. Redundant cooling is as important as a second power supply, because downtime caused by a single blown fan is embarrassing. Most fans are hot pluggable, allowing you to keep the server up and running even while replacing the broken one with a spare.
  • CPU redundancy
    This is a nice add-on feature. Most 2 CPU machines support an automatic reboot to 1 CPU when one of the CPUs fails – of course you don’t buy a second CPU just for this, but it’s really nice to have if you have 2 CPUs anyway.
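What disk redundancy costs you in usable space depends on the RAID level. The following is a small sketch using the textbook formulas for the common levels; the function name is made up for this example, and real controllers may reserve a bit of extra space or support only some of these levels.

```python
# Minimum number of disks per RAID level (standard textbook values).
MIN_DISKS = {"RAID0": 2, "RAID1": 2, "RAID5": 3, "RAID6": 4, "RAID10": 4}


def usable_capacity_gb(level: str, disks: int, disk_gb: float) -> float:
    """Return the usable capacity of an array of identical disks."""
    if level not in MIN_DISKS:
        raise ValueError(f"unknown RAID level: {level}")
    if disks < MIN_DISKS[level]:
        raise ValueError(f"{level} needs at least {MIN_DISKS[level]} disks")
    if level == "RAID0":
        return disks * disk_gb          # striping only, no redundancy
    if level == "RAID1":
        return disk_gb                  # every disk holds the same data
    if level == "RAID5":
        return (disks - 1) * disk_gb    # one disk's worth of parity
    if level == "RAID6":
        return (disks - 2) * disk_gb    # two disks' worth of parity
    if disks % 2 != 0:
        raise ValueError("RAID10 needs an even number of disks")
    return disks / 2 * disk_gb          # striped mirror pairs
```

For four 73GB disks this gives 219GB with RAID5 and 146GB with either RAID6 or RAID10.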

There are many more hardware redundancy techniques, but most of them are not meant to be used in a small business. Things like Multipath IO, fail over blades, etc. are just far too expensive.

Application redundancy in Small Businesses

Application redundancy is the best, and most expensive way to make your infrastructure resilient against problems. Application redundancy requires at least two machines, which then both serve the same application. Application redundancy is also called clustering, replication, multi master replication, etc.

Basically, there are two different architectures to achieve application redundancy:

  • Shared Storage
    Shared Storage means that the storage is shared between the two machines. Note that a shared storage does not prevent an Active/Active configuration, with both machines active – however, care must be taken by the programmers of the application to support this mode of operation. The good side of Shared Storage is that you can make Active/Passive configurations with any software – the downside of Shared Storage is that if the storage is down, nothing works.
  • Shared Nothing
    Shared Nothing means that there are no shared components between the two or more machines. There is no longer a single point of failure. Shared Nothing is almost always implemented in an Active/Active configuration, ensuring that you don’t waste energy on a machine doing mostly nothing.
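The difference between the two architectures can be put into numbers with a simple availability model. This is only a sketch under strong assumptions: failures are treated as independent, one surviving machine is assumed to be enough to run the application, and the availability figures in the usage note are invented for illustration.

```python
def parallel(*availabilities: float) -> float:
    """Availability of redundant components where any single one suffices."""
    p_all_down = 1.0
    for a in availabilities:
        p_all_down *= (1.0 - a)
    return 1.0 - p_all_down


def series(*availabilities: float) -> float:
    """Availability of a chain where every component must be up."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def shared_storage(node: float, storage: float) -> float:
    """Two redundant machines that both depend on one shared storage."""
    return series(parallel(node, node), storage)


def shared_nothing(node: float) -> float:
    """Two fully independent machines, each with its own copy of the data."""
    return parallel(node, node)
```

With hypothetical values of 99% per machine and 99.9% for the storage, `shared_nothing(0.99)` comes out at roughly 99.99%, while `shared_storage(0.99, 0.999)` is roughly 99.89%: the shared storage caps the availability of the whole setup, no matter how many machines sit in front of it.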

Shared Nothing is usually the more elegant approach, but it’s not always supported in the application itself. Or it might require special licenses. But so much for theory – how do these things look in practice?

Here’s a list of services that can be made to support redundancy in a simple fashion.

  • Active Directory
    Active Directory is designed for multi master operation. Install a second domain controller, and you’re set. This is easy. Except if you’re using Microsoft’s Small Business Server product: in that case, upgrade SBS to a full blown Windows Server. SBS is designed for really, really small companies which can’t afford more than one server. Multiple DCs are allowed by all Windows Server editions, except SBS and Web. Active Directory’s multi master replication gives you a Shared Mostly Nothing configuration – read up on FSMO roles.
  • Exchange
    Exchange 2007 offers a feature called Cluster Continuous Replication. CCR requires the enterprise edition of Exchange 2007. This allows you to create a Shared Nothing configuration easily.
  • File Serving
    DFS is the way to go. Upgrade to R2 if you’re thinking about implementing DFS. DFS allows true multi master replication with binary diffing (for WAN connections). DFS also allows you to implement a unified naming convention for your shares. You should implement DFS even if you’re not using the included replication. DFS has been supported since Windows 2000, but not on Web and SBS editions.
  • SQL Server
    I have never worked with SQL Server, and don’t know much about it. Our ERP software uses the System i (where they focus on hardware redundancy, and make application redundancy unaffordable for smaller companies). If you know about SQL Server, write a comment.

Making a small business server reliable

Servers for small businesses are never purchased according to need, but according to budget. This is a (sad) reality, that can’t be changed on a technical level. Most small businesses don’t have the amount of resources necessary to purchase an IT infrastructure according to their needs.

It doesn’t matter if you’re a service provider, or if you’re the one who handles internal IT as a side job in a small business: the main goal is still to get an infrastructure which doesn’t cause trouble. Even small businesses depend on their IT infrastructure – while a bank like UBS would probably be dead in the water if its infrastructure were down for a day, it isn’t that bad for a small business. But without an infrastructure, they will still run into problems, missed deadlines, and missed business.

There are multiple ways to make infrastructure more resilient to problems. Usually a combination of these methods is used:

  • Application redundancy
    Application redundancy has many names – clustering, replication, you name it. The trick here is that software and data can be spread across multiple machines. This makes it possible to provide high reliability with cheap hardware. The problem here is that application redundancy is usually expensive (with commercial software), support for it might not exist for SMB software, and that it increases administrative overhead.
  • Hardware redundancy
    Almost every piece of hardware can be implemented to be redundant. The most common form of hardware redundancy is RAID – usually implemented in even the cheapest servers. Hardware redundancy is not a fix for software problems, and it can’t save you from every disaster. The plus side is that it usually is completely abstracted from the end user, and thus only costs money.
  • Backups
    Backups exist in many forms. The simplest is a single tape drive, which stores a full copy of the complete server on a single tape that can be stored off site. Another form of backup is the volume snapshot, called “Shadow copies” in the Windows world. They aren’t suited for disaster recovery, but allow easy recovery of data deleted by end users.

I will write about each of these topics in the coming week, and how to use them in a small business.

Purchasing servers with SSCT and ALSO IVIS

I really like purchasing new hardware – it’s generally a fun thing to spend someone else’s money on shiny new toys in a 19″ form factor.

While people in bigger companies usually purchase their hardware by the sizing done on the requirements, in SMB environments you purchase by budget, and hope the budget is big enough to at least get near the requirements.

I usually deal with HP and IBM servers – I think they’re the same in pricing, features and problems. When trying to get the most out of your budget, it’s usually necessary to draw up several different configurations, in order to get as many requirements fulfilled as possible while staying within your spending limit. If you have a hardware supplier, they can make you a quote and tell you what it’s going to cost, but it’s usually easier to make a draft config on your own, and then ask your hardware supplier to give you a quote and delivery time.
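Comparing several draft configurations against a budget can be sketched in a few lines. Everything in this example is hypothetical: the configuration names, the prices in CHF and the feature labels are invented, and the rule of “most requirements covered, then cheapest” is just one reasonable way to rank the candidates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str
    price_chf: int
    features: frozenset


def best_within_budget(configs, requirements, budget_chf):
    """Pick the affordable config covering the most requirements.

    Ties are broken in favour of the cheaper configuration.
    Returns None if nothing fits the budget.
    """
    affordable = [c for c in configs if c.price_chf <= budget_chf]
    if not affordable:
        return None
    return max(
        affordable,
        key=lambda c: (len(c.features & requirements), -c.price_chf),
    )


# Hypothetical requirements and draft configurations.
REQUIREMENTS = frozenset(
    {"raid_cache", "service_processor", "second_psu", "tape_drive"}
)
DRAFTS = [
    Config("A: basic", 6000, frozenset({"raid_cache"})),
    Config("B: redundant", 9000,
           frozenset({"raid_cache", "service_processor", "second_psu"})),
    Config("C: fully loaded", 12000, REQUIREMENTS),
]
```

With a budget of 10000 CHF, `best_within_budget(DRAFTS, REQUIREMENTS, 10000)` picks draft B, which covers three of the four requirements. Draft C covers everything but is over the limit.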

In case you’re not an IT company, and you don’t have a direct ALSO IVIS account, I suggest you get one (you can get one easily even if you’re not directly an IT company, you just need an “HR Auszug” (commercial register extract) and a letter signed by an “Unterschriftsberechtigter” (authorized signatory)). This will also help you to see if your hardware supplier is trying to make too much profit on you (they will have much more hardware turnover than you, and thus even lower prices).

The easiest way to get a server configuration is to use the vendor-supplied tools. HP offers a web interface, which you can access through your ALSO IVIS login, and get direct quotes. For IBM equipment, it’s a little bit more complicated (but IMHO nicer). You can download SSCT, the Standalone Solutions Configurator Tool. This tool can help you to configure servers, but you will only be able to see the official list price – you will need to cross check with ALSO to see the “real” prices.

Please note that neither configurator has any facility to see whether something is in stock or not. If you need a server fast, it’s usually best to print out a PDF list of a server with additional equipment directly from the ALSO IVIS web interface; this way you can configure the server “by hand” with a marker pen, and see what is directly available in stock.

A generally interesting observation I made is that the professional 1U servers with all the necessary options (iLO or RSA II, second PSU, RAID controller with cache) have almost the same price as a 2U server with the same options. As space is usually not at a premium in the SMB datacenter (or “The room with AC”), you will almost always be better off with 2U servers in the long term. It’s a different thing of course when we’re talking about co-location.