Friday, November 21, 2008

Do Your Cloud Applications Need To Be Elastic?

I got to spend a few hours at Sys-Con's Cloud Computing Expo yesterday, and I have to say it was a most intellectually stimulating day. Not only was just about every US cloud startup represented in one way or another, but the day also included an unusual conference session and a meetup of CloudCamp fans.

While listening in on a session, I overheard one participant ask how the cloud would scale their application if they couldn't replicate it. This triggered a strong response in me, as I really feel for those who confuse autonomic infrastructures with magic that can scale unscalable applications. Let me be clear: the cloud can't scale your application (much, at least) if you didn't design it to be scaled. Period.
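To illustrate what "designed to be scaled" means in practice, here is a minimal sketch of my own (not from the session); the dict-based "shared store" is just a stand-in for something like memcached or a shared database:

# Why replication fails for an app that keeps state in local memory.

# Unscalable version: sessions live in this process. Put a second copy
# behind a load balancer and requests that land on the other instance
# can't see these sessions -- no cloud "magic" fixes that.
local_sessions = {}

def handle_request_unscalable(user_id, data):
    local_sessions.setdefault(user_id, []).append(data)
    return local_sessions[user_id]

# Scalable version: state is externalized to a shared store (a plain dict
# here, standing in for memcached or a database). Any number of identical
# instances can now serve any request, so replication actually helps.
shared_store = {}

def handle_request_scalable(user_id, data, store=shared_store):
    store.setdefault(user_id, []).append(data)
    return store[user_id]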

However, that exchange caused me to ask myself whether an application has to be horizontally scalable in order to gain economically from running in an Infrastructure as a Service (IaaS) cloud. The answer, I think, is that it depends.

Chris Fleck of Citrix wrote up a pretty decent two-part explanation of this on his blog a few weeks ago. He starts out with the basic costs of acquiring and running 5 quad-core servers--either on-premises (amortized over 3 years at 5%) or in a colocation data center--against the cost of running equivalent "High-CPU" instances 24x7 on Amazon's EC2. The short of his initial post is that it is much more expensive to run full time on EC2 than it is to run on-premises or in the colo facility.

How much more expensive?
  • On-premises: $7,800/year
  • Colocation: $13,800/year
  • Amazon EC2: $35,040/year
I tend to believe this reflects reality, even if it's not 100% accurate. First, while you may think "ah, Amazon...that's 10¢ a CPU hour," in point of fact most production applications you read about in the cloud-o-sphere use the larger instances, so Chris is right to price High-CPU instances at 80¢/hour in his comparison. Second, while it's tempting to think in terms of upfront costs, your accounting department will in fact spread the capital costs out over several years--usually 3 years for a server.
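To make the arithmetic concrete, here is a minimal sketch that reproduces the EC2 figure above. The on-premises and colocation numbers are taken as given from Chris's post, since the underlying hardware and facility prices aren't broken out here:

# Reproduce the full-time EC2 figure from the comparison above.
# Assumptions: 5 High-CPU instances at $0.80/instance-hour, running 24x7.
INSTANCES = 5
RATE_PER_HOUR = 0.80           # $/instance-hour (2008 High-CPU pricing)
HOURS_PER_YEAR = 24 * 365      # 8,760 hours

ec2_annual = INSTANCES * RATE_PER_HOUR * HOURS_PER_YEAR
print(f"EC2 24x7: ${ec2_annual:,.0f}/year")    # -> $35,040/year

# The on-prem and colo figures are taken as given from Chris's Part 1.
annual_costs = {
    "On-premises": 7_800,      # 3-year amortization at 5%
    "Colocation": 13_800,
    "Amazon EC2": ec2_annual,
}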

In the second part of his analysis, however, Chris notes that the cost of those same Amazon instances varies with the amount of time they are actually used, as opposed to the physical infrastructure, which must be paid for whether it is used or not (with the possible exception of power and cooling costs). This comes into play in a big way if cloud instances are used judiciously for varying workloads, such as in the hybrid fixed/cloud approach he uses as an example.

In other words, if you have an elastic load, plan for "standard" variances on-premises but allow "excessive" spikes in load to trigger instances on EC2, and you suddenly have a very compelling case relative to buying enough physical infrastructure to handle those peaks yourself. As Chris notes:
"To put some simple numbers to it based on the original example, let's assume that the constant workload is roughly equal to 5 Quadcore server capacity. The variable workload on the other hand peaks at 160% of the base requirement, however it is required only about 400 hours per year, which could translate to 12 hours a day for the month of December or 33 hours per month for peak loads such as test or batch loads. The cost for a premise only solution for this situation comes to roughly 2X or $ 15,600 per year assuming existing space and a 20% factor of safety above peak load. If on the other hand you were able to utilize a Cloud for only the peak loads the incremental cost would be only $1,000. ( Based on Amazon EC2 )
Premise Only
$ 15,600 Annual cost ( 2 x 7,800 from Part 1 )
Premise Plus Cloud
$ 7,800 Annual cost from Part 1
$ 1,000 Cloud EC2 - ( 400 x .8 x 3 )
$ 8,800 Annual Cost Premise Plus Cloud "
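Working the quoted numbers through in the same way (a sketch based on my reading that a peak of 160% of the 5-server base means 3 extra instances during spikes):

# Reproduce the hybrid "premise plus cloud" arithmetic from the quote.
BASE_ANNUAL = 7_800        # 5 on-premises quad-core servers (Part 1)
PEAK_FACTOR = 1.6          # peak load = 160% of the base requirement
PEAK_HOURS = 400           # hours per year the peak actually occurs
RATE_PER_HOUR = 0.80       # $/instance-hour

# 160% of 5 servers is 8, so the burst needs 3 instances beyond the base.
extra_instances = round(5 * (PEAK_FACTOR - 1.0))            # -> 3

premise_only = 2 * BASE_ANNUAL                              # $15,600
burst_cost = PEAK_HOURS * RATE_PER_HOUR * extra_instances   # $960
hybrid = BASE_ANNUAL + burst_cost                           # $8,760

# Chris rounds the burst cost up to $1,000, giving his $8,800 figure.
print(f"Premise only:       ${premise_only:,}")
print(f"Premise plus cloud: ${hybrid:,.0f}")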
The lesson of our story? Using the cloud makes the most sense when you have an elastic load. I would postulate that another good fit is a load that is not powered on at full strength 100% of the time (the sketch after this list shows the kind of trigger logic involved). Some examples might include:
  • Dev/test lab server instances
  • Scale-out applications, especially web application architectures
  • Seasonal load applications, such as personal income tax processing systems or retail accounting systems
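For the scale-out case, the trigger logic the hybrid approach implies can be surprisingly simple. Here's a hypothetical sketch; launch_instance and terminate_instance are placeholders for whatever your provider's API offers, not real EC2 calls:

# Hypothetical burst trigger: serve "standard" load on fixed capacity,
# and rent cloud instances only when load spikes past what the base
# can handle. The launch/terminate functions are placeholders.
BASE_CAPACITY = 5          # fixed on-premises servers
SPIKE_THRESHOLD = 0.8      # utilization that counts as an excessive spike

cloud_instances = []

def launch_instance():
    # Placeholder: call your provider's API (e.g., EC2) here.
    return "i-hypothetical"

def terminate_instance(instance):
    # Placeholder: release the rented instance.
    pass

def rebalance(current_load):
    """current_load is measured in server-equivalents of work."""
    utilization = current_load / (BASE_CAPACITY + len(cloud_instances))
    if utilization > SPIKE_THRESHOLD:
        cloud_instances.append(launch_instance())
    elif utilization < SPIKE_THRESHOLD / 2 and cloud_instances:
        terminate_instance(cloud_instances.pop())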
On the other hand, you probably would not use Infrastructure as a Service today for:
  • That little accounting application that has to run at all times, but has at most 20 concurrent users
  • The MS Exchange server for your 10 person company. (Microsoft's multi-tenant Exchange online offering is different--I'm talking hosting your own instance in EC2)
  • Your network monitoring infrastructure
Now, the managed hosting guys are probably going to jump down my throat with counterarguments about the level of service provided by (at least their) hosting clouds, but my experience is that all of these clouds treat self-service as exactly that, and there really is very little difference between do-it-yourself on-premises and do-it-yourself in the cloud.

What would change these economics to the point that it made sense to run any or all of your applications in an IaaS cloud? Well, I personally think you need to see a real commodity market for compute and storage capacity before pricing reflects economies in favor of running fixed loads in the cloud. There have been a wide variety of posts in the past about what it would take [pdf] to establish a cloud market, so I won't go back over that subject here. However, if you are considering "moving my data center to the cloud," please keep these simple economics in mind.