Thursday, February 14, 2008

Latency: Obstacle to cloud computing, or opportunity?

I was challenged in the comments to my Cloud Computing Heats Up post regarding my criticism of pupwhines' post that in turn criticized cloud computing. The anonymous author of the comment thought I was too hard on pupwhines, and wanted to know specifically how I would respond to the challenge that latency presents to distributed computing. I responded there, but I want to expand a little on the topic, as it is indeed important to understand, and it backs my contention that some software architectural changes will be needed to leverage the cloud.

(Quick note: I've alluded to this before, but I strongly believe there is no one cloud, but a bunch of siloed clouds today with *some* limited integration between them. More of a frontal system, really.)

Latency is an issue in most IT application environments today. There is no question that "traditional" tiered application design scales well at the processing layer, but it has real issues at the data layer. There is simply no easy way to manage a traditional relational database architecture over a widely distributed environment. Pupwhines' contention that joining a table between two SaaS vendor implementations would be a disaster is right on. In modern technology terms, it would be insane.
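
To make the latency math concrete, here is a minimal Python sketch of what such a cross-vendor "join" would actually cost. The latency figure and the stand-in for vendor B's API are assumptions of mine, not anything a real vendor exposes:

    import time

    WAN_ROUND_TRIP = 0.08  # assume roughly 80 ms per cross-vendor call

    def fetch_account_from_vendor_b(account_id):
        """Stand-in for a remote lookup against SaaS vendor B's API."""
        time.sleep(WAN_ROUND_TRIP)  # every call pays the full WAN latency
        return {"id": account_id, "status": "active"}

    # 100 rows from vendor A, each needing its match from vendor B: a
    # nested-loop join executed one round trip at a time.
    orders = [{"order_id": n, "account_id": n % 10} for n in range(100)]

    start = time.time()
    joined = [(order, fetch_account_from_vendor_b(order["account_id"]))
              for order in orders]
    print(f"{len(joined)} rows joined in {time.time() - start:.1f} s")
    # Roughly 8 seconds for 100 rows; at a million rows, nearly a day
    # for a "query" the local database would answer in milliseconds.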

However, this is the disruptive aspect of cloud computing: the architectures you know and love are no longer necessarily best practices in a world where your functionality and capacity are:
  1. not necessarily your own,
  2. not necessarily integrated, and
  3. splayed out across this 5.1×10⁸ km² rock we live on
There are new technical advances being made today in the companies that already rely on cloud principles (think Google, Amazon, Microsoft, etc.). These advances will change the way you design and deploy software, but they will enable a world where proximity of data means less and less.

In fact, you probably already leverage one of these technical changes: increased bandwidth. Indulge me in some autobiographical narrative to illustrate.

Back in the late '90s, while I was a Senior Principal Consultant with Forte Software, Inc., the legendary(?) distributed application development platform vendor, one of my key roles was advising clients on how best to architect for high performance, high scalability, and high availability. Forte was an early service-oriented architecture platform, but it ran on the 10Mb/100Mb networks of the time. Thus, the rules for message passing between components (UI<->service or service<->service) were, in order of priority:
  1. Send as few messages over the network as possible
  2. Send the smallest messages possible

Thus, it was better to send one large message than many small ones, but you still wanted to make each message as small as possible.
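
Here is a toy Python sketch of what those two rules meant in practice. The service and its methods are illustrative inventions of mine, not a real Forte API:

    class CustomerService:
        """Pretend every method call here crosses the network."""

        def get_customer(self, customer_id):
            # Chatty style: one round trip per customer.
            return {"id": customer_id, "name": f"Customer {customer_id}"}

        def get_customers(self, customer_ids):
            # Batched style: one round trip total, carrying only the
            # ids the caller actually needs.
            return [self.get_customer(c) for c in customer_ids]

    service = CustomerService()
    ids = [1, 2, 3, 4, 5]

    chatty = [service.get_customer(c) for c in ids]  # N messages: breaks rule 1
    batched = service.get_customers(ids)             # 1 small message: rules 1 and 2
    assert chatty == batched                         # same data, very different latency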

To this end, best practice was to create data services and to actually deploy those services directly on the database server hardware. It was more important to map relational data into objects as needed, in a timely fashion--thus avoiding unnecessary network traffic--than to divide processing responsibility so that no custom application components ran on the RDBMS hardware.

Fast forward to the 1G/10G networks of today. From what I am seeing, it is actually considered bad practice to do what I described above. While at Sun, I was admonished by a (very competent) manager for suggesting that the way around Sun Access Manager's horrible performance was to deploy the identity server and database on the same box (with our custom login and registration UIs deployed on separate, horizontally scalable servers). Pure architectural heresy. He was right in many ways: doing so would have put the business logic tier into a horizontally locked architecture, but that wasn't his point. "We don't deploy our software on our database servers" was the gist of his argument.

So, faster networks have already changed the so-called "laws of physics" that software architects must design around. Given this, it seems easy to postulate that additional advances in network bandwidth will open additional opportunities for architectural change. In fact, they already have; check out GigaSpaces for a cool (though controversial) alternative to horizontally replicated service architectures.

Will bandwidth really grow at a rate that will make a difference to the current IT generation(s)? Many postulate it has to, even if the core network operators resist. As I noted in my response, Cisco's new Nexus 7000 series is a sign of things to come. Does anyone deny that 40G and 100G networks have the potential to change the laws of physics? (Disclaimer: I know just enough about networking to be dangerous, so I may be overstating the case...but change is still clearly on the horizon.)

Even if network bandwidth doesn't change at all, or any additional bandwidth is chewed up by demand at existing rates, there are other software architectural advances that will revolutionize certain kinds of computing. I spoke in my response about MapReduce and its open source implementation, Hadoop. For processing large, distributed data loads, this architecture eliminates boundaries imposed by traditional scale-out RDBMS-based approaches. Google has used this approach to tie data from every one of its properties (including acquired properties, such as Blogger) into a single user identity and profile. Talk about a distributed join problem...
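
Stripped to its essence, that kind of join has roughly the following shape. This is a toy, in-memory Python sketch of the MapReduce pattern; the property names and record fields are hypothetical, and a real MapReduce run distributes each phase across thousands of machines:

    from collections import defaultdict

    def map_record(source, record):
        # Map phase: key every record by account id, whatever property
        # (mail, blog, etc.) it came from.
        yield record["account_id"], (source, record)

    def reduce_profiles(account_id, fragments):
        # Reduce phase: merge all of one user's fragments into a profile.
        profile = {"account_id": account_id}
        for source, record in fragments:
            profile[source] = {k: v for k, v in record.items() if k != "account_id"}
        return profile

    mail = [{"account_id": "u1", "address": "u1@example.com"}]
    blogger = [{"account_id": "u1", "blog": "pupwhines.example.org"}]

    # Shuffle: group all mapped pairs by key (the framework does this).
    shuffled = defaultdict(list)
    for source, records in (("mail", mail), ("blogger", blogger)):
        for record in records:
            for key, value in map_record(source, record):
                shuffled[key].append(value)

    # One merged profile per user, with no RDBMS join anywhere.
    print([reduce_profiles(k, v) for k, v in shuffled.items()])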

(Another quick note: hats off to both Google and Yahoo for their work in this space. I know from my past life at Sun what a pain in the whatever this is, and I love the seamlessness I experience on these sites.)

One other major advancement is the increasing sophistication of business integration technologies, from traditional application integration (force.com, boomi.com, BizTalk, Lombardi, etc.) to data integration options (Informatica, Business Objects, etc.) to subscription-based data propagation techniques. These integration options let you go back to some of the basics I spoke of before: do as much processing as possible on SaaS vendor A's infrastructure before sharing the relevant data with vendor B. Not as clean as a join in many cases, but in a service-oriented world, a common--and often required--approach.
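
In code, that "process locally, then share the result" pattern might look something like this minimal Python sketch; the two functions standing in for the vendors' APIs are hypothetical:

    from collections import Counter

    def summarize_on_vendor_a(raw_orders):
        """Runs inside vendor A's infrastructure: collapse raw rows to totals."""
        totals = Counter()
        for order in raw_orders:
            totals[order["account_id"]] += order["amount"]
        return dict(totals)

    def push_to_vendor_b(summary):
        """One small integration message instead of thousands of raw rows."""
        print(f"sending {len(summary)} account totals to vendor B: {summary}")

    raw_orders = [
        {"account_id": "a1", "amount": 120.0},
        {"account_id": "a1", "amount": 80.0},
        {"account_id": "a2", "amount": 42.5},
    ]
    push_to_vendor_b(summarize_on_vendor_a(raw_orders))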

Perhaps the most important point I want to make today, however, is that many of these technologies are either future tech or not what existing applications were built with. Given that, what does an existing datacenter do? Stick with my recommendation: convert your own datacenter into a utility/cloud today, and begin to leverage the maturing compute grid/cloud computing ecosystem as it and your applications mature.
