Thursday, May 24, 2007

Service Level Automation Deconstructed: Respond

For the third and last in my series breaking down the three key assumptions behind Service Level Automation, I would like to focus on how SLA environments can control data center configuration in response to service level goal violations. These goal violations, and the high-level actions to be taken, are determined by the analysis capability of the environment. The details of how to accomplish those high-level actions, however, are decided and executed by the response function.

Essentially, the response function of an SLA environment is very much like the driver set that your operating system uses to translate high-level actions (e.g. "store this file") into device-specific actions ("Move head 32 steps to center, find block 4D5EF, etc."). Its responsibility is to provide the interface between the SLA analysis engine and the specific standard or proprietary interfaces to everything from server hardware to network switches to operating systems and middleware.
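To make the driver analogy concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `ResponseDriver` interface, the two driver classes, the action names and the `respond` function are hypothetical and do not come from any real SLA product. The drivers return descriptive strings rather than talking to real hardware.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ResponseDriver(ABC):
    """Hypothetical interface: one driver per class of managed resource."""

    @abstractmethod
    def translate(self, action: str, target: str, **params) -> List[str]:
        """Return the device-specific steps for one high-level action."""


class PowerControllerDriver(ResponseDriver):
    """Illustrative driver for a power controller or MPDU."""

    def translate(self, action: str, target: str, **params) -> List[str]:
        if action == "power_on":
            return [f"power controller: switch on outlet feeding {target}"]
        if action == "power_off":
            return [f"power controller: switch off outlet feeding {target}"]
        raise ValueError(f"unsupported action: {action}")


class SwitchDriver(ResponseDriver):
    """Illustrative driver for a Layer 2 switch."""

    def translate(self, action: str, target: str, **params) -> List[str]:
        if action == "join_vlan":
            return [f"switch: set {target} to access VLAN {params['vlan']}"]
        raise ValueError(f"unsupported action: {action}")


def respond(drivers: Dict[str, ResponseDriver], resource_type: str,
            action: str, target: str, **params) -> List[str]:
    """Route a high-level action from the analysis engine to the right driver."""
    if resource_type not in drivers:
        raise LookupError(f"no driver registered for {resource_type}")
    return drivers[resource_type].translate(action, target, **params)


# Usage: the analysis engine only names the action; the driver fills in the details.
drivers = {"power": PowerControllerDriver(), "switch": SwitchDriver()}
for step in respond(drivers, "switch", "join_vlan", "port-12", vlan=110):
    print(step)
```

The point of the design is that the analysis engine never needs to know which vendor's device sits behind a resource type. Supporting a new device would mean registering another driver, not changing the engine.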

I see the following key interface points in today's environments:

  • Power Controllers/MPDUs: Job 1 of a service level automation environment is providing the resources required to meet the needs of the software environment, and only those resources. Turn those servers on when they are needed, and off when they are not. This includes virtual server hosts. (Examples: DRAC, iLO, RSA II, MPDUs)
  • Operating Systems: Before you shut off that server, make sure you've "gently" shut down its software payload. Well-written server payloads for automated environments will both start up and acquire initial state (if any), and shut down while preserving any necessary state, without human intervention. However, from a communications perspective, each action starts with the OS. (Examples: Red Hat, SuSE, MS Windows, Sun Solaris)
  • Middleware/Virtualization: It is interesting to note that many software payload components (e.g. an application server or a hypervisor) are both software to be managed and computing capacity in their own right. For example, an application server should be managed to specific service levels relating to its relationship with its host server (e.g. CPU utilization, thread counts, etc.), while also being treated as a capacity resource for Java EE applications and services. As such, these software containers should be managed for their own guest payloads much as a physical server is managed for the overall server payload. (Examples: BEA WebLogic, VMware ESX, XenSource XenEnterprise)
  • Layer 2 Networking: In order to use a server to meet an application's needs, that server must have access to the required networks. True automation requires that switch ports be reconfigured as necessary so that each server can reach exactly the VLANs required by the payload it will host. (Examples: Cisco 3750, Extreme Summit400)
  • Network Attached Storage (NAS): The beauty of NAS devices is that they can be dynamically attached to a software payload at startup, without requiring any hardware configuration beyond the Layer 2 configuration described above. SAN is also useful (and common), but requires hardware configuration to work, which complicates the role of automation. Part of the problem is the inconsistent remote configurability of Fibre Channel switches, which may be mitigated somewhat with iSCSI. However, NAS is quickly becoming the preferred storage mechanism in large data centers. (Examples: NetApp FAS, Adaptec Snap Server)
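The interface points above also imply an order of operations: network access and power come before the OS, storage is attached at startup, and on the way down the payload is stopped "gently" before the power is cut. The sketch below shows one way that ordering might be written down. The function names, the step labels and the exact sequence are my own assumptions for illustration, and the plans are plain data rather than calls to real devices.

```python
from typing import List, Tuple

# A step is (interface point, human-readable action). Both are illustrative.
Step = Tuple[str, str]


def plan_startup(server: str, vlans: List[int], nas_export: str,
                 payload: str) -> List[Step]:
    """Assumed bring-up order: network, power, OS, storage, then payload."""
    steps: List[Step] = [
        ("layer2", f"allow VLAN {vlan} on the switch port for {server}")
        for vlan in vlans
    ]
    steps.append(("power", f"power on {server}"))
    steps.append(("os", f"wait for the OS on {server} to boot"))
    steps.append(("nas", f"mount {nas_export} on {server}"))
    steps.append(("payload", f"start {payload} and let it acquire initial state"))
    return steps


def plan_shutdown(server: str, payload: str) -> List[Step]:
    """Assumed tear-down order: payload first, power last."""
    return [
        ("payload", f"stop {payload} after it has saved any necessary state"),
        ("os", f"shut down the OS on {server}"),
        ("power", f"power off {server}"),
    ]


# Usage: print the plan for retiring a server that is no longer needed.
for interface, action in plan_shutdown("server-07", "order-service"):
    print(f"{interface}: {action}")
```

Keeping the plan as data separates the decision about what to do, and in what order, from the device-specific drivers that carry out each step.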

Over time, I see the industry adding more and more "drivers" to manage more and more data center (and perhaps desktop) resources. Imagine a world in which each software and/or hardware vendor produced standard SLA drivers for each individual component that makes up your data center environment. Every switch, disk and server; every service, container and OS; even every light bulb and air conditioner is connected to a single service level policy engine in which business policy (including cost of operations) drives automated decisions about their use and disuse.

It's not here yet, but you won't have to wait long...

I will use the label "respond" to tag posts related to response interfaces.
