Monday, August 04, 2008

Is a Grid a Cloud? Probably not, but...

Sam Johnston has recently been writing some very provocative posts (provocative as in "thought-provoking" as well as, at times, "controversial"). One of his latest is his missive on cloud computing, and the confusion created by vendors who pitch their grid platforms as the definition of cloud computing.

He has some good points, and I recommend reading the post. However, very early on he makes a statement that I think clearly demonstrates his own flawed logic when it comes to the term "private cloud". In the first paragraph, he says:
"Some of this confusion is understandable given issues get complex quickly when you start peeling off the layers, however much of it comes from the very same opportunistic hardware & software vendors who somehow convinced us years ago that clusters had become grids. These same people are now trying to convince us that grids have become clouds in order to sell us their version of a 'private cloud' (which is apparently any large, intelligent and/or reliable cluster)."

[Emphasis mine.]
There's the root problem, right there. By equating a "private cloud" with "any large, intelligent and/or reliable cluster", he misses much of what the private cloud is--and biases his definition toward the point of view of traditional job-based grid computing (which does act very much like a cluster).

Let's use my alma mater as an example of a private cloud infrastructure vendor that does not sell a clustering platform--at least not in the traditional sense of the word, as it relates to software. Cassatt does not tie a bunch of servers into a single, interconnected unit for a workload to run on top of. In fact, that remains the job of the software platform deployed into Cassatt, if it is indeed desired. There is no software coordination intelligence in Cassatt today (other than some dependency management to control startup and shutdown).

Cassatt works purely at the server and OS level. No, it doesn't create an OS cluster, because the OS isn't aware that it is being managed. All that Cassatt does is gather server resources into a general pool that can be assigned as needed to meet capacity (and reliability) demands as defined by the service levels applied to the software payloads. If Cassatt sees that application A needs more capacity, it grabs another server. If an instance of server B goes down, Cassatt creates a new instance with the same IP address and hostname (if safe) as the original.
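
To make the pooling idea concrete, here is a minimal Python sketch of the behavior just described. It is purely illustrative and is not Cassatt's software or API: the names ResourcePool, Payload, Instance, reconcile and handle_failure are hypothetical, the IP numbering is made up, and a simple minimum instance count stands in for a real service level.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    hostname: str
    ip: str
    server: str  # physical server currently backing this instance


@dataclass
class Payload:
    name: str
    min_instances: int  # hypothetical stand-in for a service level
    instances: list = field(default_factory=list)


class ResourcePool:
    """A general pool of servers handed out on demand (illustrative only)."""

    def __init__(self, servers):
        self.free = list(servers)
        self.payloads = {}
        self._next_host = 10  # made-up IP numbering for the sketch

    def register(self, payload):
        self.payloads[payload.name] = payload

    def _take_server(self):
        if not self.free:
            raise RuntimeError("resource pool exhausted")
        return self.free.pop(0)

    def reconcile(self):
        """Grab servers until every payload meets its service level."""
        for payload in self.payloads.values():
            while len(payload.instances) < payload.min_instances:
                server = self._take_server()
                number = len(payload.instances) + 1
                ip = f"10.0.0.{self._next_host}"
                self._next_host += 1
                payload.instances.append(
                    Instance(f"{payload.name}-{number}", ip, server)
                )

    def handle_failure(self, failed_server):
        """Rebuild the instance from a failed server on a fresh one,
        keeping its hostname and IP address."""
        for payload in self.payloads.values():
            for inst in payload.instances:
                if inst.server == failed_server:
                    inst.server = self._take_server()
                    return inst
        return None  # nothing was running on that server


if __name__ == "__main__":
    pool = ResourcePool(["s1", "s2", "s3", "s4"])
    web = Payload("web", min_instances=2)
    pool.register(web)
    pool.reconcile()            # web-1 and web-2 come up
    web.min_instances = 3       # demand rises
    pool.reconcile()            # the pool grabs another server
    print(pool.handle_failure("s1"))
```

Note that nothing in the sketch knows what the payload does or coordinates work between instances. The pool only decides which server backs which instance, which is the distinction from a cluster that I am drawing here.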

Cassatt is not job-based. Any running server payload, including web applications, enterprise applications or "always on" monitoring and feed reading processes, can be hosted in exactly the same manner as batch jobs. Cassatt doesn't queue jobs; it just provisions servers as needed to meet the service levels defined for business workloads.

Read Cassatt's web site for more. They say it much better than I am expressing it now.

The point is, though, that Cassatt is not a cluster; it is a resource pool, and as such acts much more like a cloud than a grid. Sam may say "well, that's just autonomic computing" and he's right, but the cloud is autonomic. So calling an autonomic system running behind an enterprise firewall a "private cloud" is not much of a stretch at all.

By the way, ksankar of http://doubleclix.wordpress.com notes nine great differences between a grid and a cloud. I think he captured more of my own thinking about this subject in that one post than I've been able to express in the last three years. Worth a read as well.

Finally, subscribe to Sam's blog. He's asking some important questions, and deserves your attention.