Revolutionary. Cutting edge. State of the art. These phrases are bandied about so many products in the IT field that they have become useless, bland, expected. The truth is that truly revolutionary products are few and far between. That said, Cisco's Unified Computing System fits the bill.
To fully understand what Cisco has done requires that you dispense with preconceived notions of blade servers and blade chassis. Rewire your concepts of KVM, console access, and network and storage interfaces. Reorganize how you think of your datacenter as islands of servers surrounded by storage arrays and networks. Cisco had the advantage of starting from scratch with a blade-based server platform, and it's made the most of it.
UCS building blocks
A Cisco UCS chassis provides eight slots for half-width blades, each equipped with two Intel Nehalem processors, up to 96GB of RAM with 8GB DIMMs, two SAS drive slots, an LSI Logic SAS RAID controller, and a connection to the blade backplane. In addition, each blade is outfitted with a Cisco Converged Network Adapter, or CNA. The CNA is essentially the heart of the system, the component that makes UCS unlike traditional blade systems.
The CNA is a mezzanine board that fits a QLogic 4Gb Fibre Channel HBA and an Intel 10Gb Ethernet interface on a single board, connecting directly to the chassis network fabric. The presentation to the blade is two 10Gb NICs and two 4Gb FC ports, with two 10Gb connections to the backplane on the other side. The initial release does not support multiple CNAs per blade, nor does it really need to. The CNA is integral to how the entire UCS platform operates, as it essentially decouples the blade from traditional I/O by pushing storage and network through two 10Gb pipes. This is accomplished through the use of FCoE (Fibre Channel over Ethernet). Everything leaving the blade is thus Ethernet, with the FC traffic broken out by the brains of the operation, the Fabric Interconnects (FI).
So we have some number of CNA-equipped blades in a chassis. We also have two four-port 10Gb fiber interface cards in the same chassis and two FIs downstream that drive everything. It's not technically accurate to call the FIs switches, since the chassis function more like remote line cards populated with blades. No switching occurs in the chassis themselves; they are simply backplanes for blades that have direct connections to the FIs. Physically, the FIs are identical in appearance to Cisco Nexus 5000 switches, but they have more horsepower and storage to handle the FCoE to FC breakout tasks. They offer 20 10Gb ports, and they support a single expansion card each.
The expansion cards come in a few different flavors, supporting either four 4Gb FC ports and four 10Gb Ethernet ports, or six 10Gb Ethernet ports, or eight 4Gb FC ports. This is in addition to the twenty 10Gb ports built into each FI. There are also three copper management and clustering ports, as well as the expected serial console port. The FI is wholly responsible for the management and orchestration of the UCS solution, running both the CLI and the GUI natively; no outside server-based component is required.
Connecting the dots
Perhaps a mental picture is in order. A baseline UCS configuration would have two FIs run in active/passive mode, with all network communication run in active/active mode across both FIs and each chassis. (Think of a Cisco Catalyst 6509 switch chassis with redundant supervisors — even if one supervisor is standby, the Ethernet ports on that supervisor are usable. The two FIs work basically the same way.) They are connected to each other with a pair of 1Gb Ethernet ports, and they have out-of-band management ports connected to the larger LAN. The blade chassis is connected by two or four 10Gb links from each FEX (Fabric Extender) in the chassis, a set to each FI. That's it. A fully configured chassis with 80Gb uplinks will have four power cords and eight SFP+ cables coming out of it — nothing more. Conceivably, an entire rack of UCS chassis running 56 blades could be driven with only 56 data cables, 28 if only four 10Gb links are required on each chassis.
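The cabling arithmetic above can be sketched out directly from the review's figures — eight half-width blades per chassis, two FEXes per chassis, and two or four SFP+ links per FEX:

```python
# Back-of-the-envelope cabling math for a rack of UCS chassis, using
# the figures from this review: 8 half-width blades per chassis, two
# FEXes per chassis, and 2 or 4 10Gb SFP+ links per FEX.

BLADES_PER_CHASSIS = 8
FEX_PER_CHASSIS = 2

def rack_cabling(chassis_count, links_per_fex):
    """Return (blades, data_cables) for a rack of UCS chassis.

    Each FEX runs `links_per_fex` SFP+ links to its Fabric
    Interconnect; power cords (4 per chassis) are not counted here.
    """
    blades = chassis_count * BLADES_PER_CHASSIS
    data_cables = chassis_count * FEX_PER_CHASSIS * links_per_fex
    return blades, data_cables

# A 7-chassis rack (56 blades) at the full 4 links per FEX (80Gb per
# chassis) needs 56 data cables; at 2 links per FEX, only 28.
print(rack_cabling(7, 4))  # (56, 56)
print(rack_cabling(7, 2))  # (56, 28)
```

That 1:1 ratio of blades to cables at full bandwidth — and 2:1 at half — is what makes the "entire rack, 56 cables" claim work out.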
From there, the pair of FIs are connected to the LAN with some number of 10Gb uplinks, and the remainder of the ports on the FI are used to connect to the chassis. A pair of FIs can drive 18 chassis at 40Gb per chassis with two 10Gb uplinks to the datacenter LAN, allowing for eight 4Gb FC connections to a SAN from an eight-port FC expansion card.
The basis of the UCS configuration is the DME (Data Management Engine), a memory-based relational database that controls all aspects of the solution. It is itself driven by an XML API that is wide open. Everything revolves around this API, and it's quite simple to script interactions with the API to monitor or perform every function of UCS. In fact, the GUI and the CLI are basically shells around the XML configuration, so there's no real disparity between what can and can't be done with the CLI and GUI, or even external scripts. UCS is a surprisingly open and accessible system. In keeping with that openness, backing up the entirety of a UCS configuration is simple: The whole config can be sent to a server via SCP, FTP, SFTP, or TFTP, although this action cannot be scheduled through the GUI or CLI.
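To give a flavor of what scripting against the XML API looks like: the protocol is plain XML posted over HTTPS to the Fabric Interconnect, and a session begins with the API's aaaLogin call, which returns a cookie used on subsequent requests. The sketch below builds a login request and parses a response of the shape UCS Manager returns; the credentials and the canned response are placeholders, not output from a real system:

```python
# Minimal sketch of the UCS Manager XML API exchange. A client POSTs
# an XML request document to the Fabric Interconnect over HTTPS and
# receives an XML document back. aaaLogin is the API's login method;
# the username, password, and canned response here are placeholders.
import xml.etree.ElementTree as ET

def build_login_request(user, password):
    """Build the aaaLogin request body a client would POST to the FI."""
    req = ET.Element("aaaLogin", inName=user, inPassword=password)
    return ET.tostring(req, encoding="unicode")

def parse_login_response(xml_text):
    """Extract the session cookie from an aaaLogin response."""
    root = ET.fromstring(xml_text)
    return root.get("outCookie")

request = build_login_request("admin", "secret")
print(request)  # <aaaLogin inName="admin" inPassword="secret" />

# A canned response of the shape returned on a successful login:
response = '<aaaLogin response="yes" outCookie="1234/abcd" outRefreshPeriod="600" />'
print(parse_login_response(response))  # 1234/abcd
```

Because the GUI and CLI are themselves shells around this same XML configuration, anything either of them can do, a script holding that cookie can do too.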
The initial setup of a UCS installation takes about a minute. Through the console, an IP is assigned to the out-of-band management interface on the initial FI, and a cluster IP is assigned within the same subnet. A name is given to the cluster, admin passwords are set, and that's about it. The secondary FI will detect the primary and require only an IP address to join the party. Following that, pointing a browser at the cluster will provide a link to the Java GUI, and the UCS installation is ready for configuration.
To be frank, the features, scope, and breadth of the UCS offering are quite impressive for a 1.0 release. That's not to say there aren't problems. For one thing, it's not terribly clear when changes made to service profiles will cause a blade to reboot. In some instances, warnings are issued when configuration changes may cause a blade to reboot, but otherwise the state of a blade is somewhat opaque.
We encountered a few minor GUI problems and one more significant glitch: During one service profile push, the PXE blade prep boot didn't happen. A manual reboot of the blade through the KVM console got everything back on the right track, however. Throughout all the buildups and teardowns of the blades, this was the only time that happened.
Of some concern are the fault-monitoring aspects of UCS. For instance, when a drive was pulled from a RAID 1 array on a running host, UCS failed to raise a fault showing that the drive had failed. It did, however, produce a notification that the server was now in violation of the assigned profile because it only had one disk. Further, re-inserting the disk cleared the profile violation, but produced no indication of the RAID set rebuild status. Indeed, there doesn't seem to be a way to get that information anywhere aside from a reboot and entry into the RAID controller BIOS, which is somewhat troubling. Cisco has filed a bug related to this problem and expects it to be fixed in an upcoming release.
A minor consideration is that, while Cisco is agnostic as to the make of the FC SAN attached to UCS, the SAN must support NPIV (N_Port ID Virtualization). Most modern FC SANs shouldn't have a problem with this, but it is an absolute requirement.
Finally, there's the matter of cost. In keeping with all things Cisco, UCS isn't terribly cheap. Unless you're planning on deploying at least three chassis, it may not be worth it. The reason for this is that the chassis are relatively affordable, but the FIs and associated licenses are not. However, the scalability inherent in the UCS design means that you can fit a whole lot of blades on two FIs, so as you expand with chassis and blades, the investment comes back in spades. A well-equipped redundant UCS configuration with 32 dual-CPU Nehalem E5540-based blades with local SAS drives and 48GB of RAM each costs roughly $338,000. But adding another fully equipped chassis costs only $78,000, nearly half the price of a traditional blade chassis with similar specs.
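The scaling economics above can be reduced to a simple model built on the review's own price estimates (roughly $338,000 for the redundant four-chassis, 32-blade configuration, and about $78,000 per additional fully equipped chassis — these are estimates from our testing, not official Cisco pricing):

```python
# Rough cost-scaling model using the figures quoted in this review:
# a redundant 4-chassis (32-blade) configuration at roughly $338,000,
# and about $78,000 per additional fully equipped chassis. These are
# the review's estimates, not official Cisco list pricing.

BASE_CHASSIS = 4
BASE_COST = 338_000       # 4 chassis, 32 blades, FIs and licenses included
CHASSIS_COST = 78_000     # each additional fully equipped chassis
BLADES_PER_CHASSIS = 8

def ucs_cost(chassis):
    """Estimated total cost for `chassis` chassis (chassis >= 4)."""
    return BASE_COST + (chassis - BASE_CHASSIS) * CHASSIS_COST

for n in (4, 8, 12):
    blades = n * BLADES_PER_CHASSIS
    total = ucs_cost(n)
    print(f"{n} chassis / {blades} blades: ${total:,} "
          f"(about ${total // blades:,} per blade)")
```

The per-blade figure falls as chassis are added, since the fixed cost of the FIs and licenses is amortized over more and more blades — which is exactly why UCS makes less sense below about three chassis and more sense above it.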
We certainly found some problems with UCS, but they float well above the foundation, which is equally impressive for its manageability, scalability, and relative simplicity. There's a whole lot to like about UCS, and the statement it makes just might cause that revolution.