The Wisdom of Clouds: August 2008

Saturday, August 30, 2008

Elements of a Cloud Oriented Architecture

In my post, The Principles of Cloud Oriented Architectures, I introduced you to the concept of a software system architecture designed with "the cloud" in mind:

"...I offer you a series of posts...describing in depth my research into what it takes to deliver a systems architecture with the following traits:
It partially or entirely incorporates the clouds for at least one layer of the Infrastructure/Platform/Application stack.
Is focused on consumers of cloud technologies, not the requirements of those delivering cloud infrastructures, either public or private (or even dark).
Takes into account a variety of technical, economic and even political factors that systems running in the "cloud" must take into account.
Is focused at least as much on the operational aspects of the system as the design and development aspects
The idea here is not to introduce an entirely new paradigm--that's the last thing we need given the complexity of the task ahead of us. Nor is it to replace the basic principles of SOA or any other software architecture. Rather, the focus of this series is on how to best prepare for the new set of requirements before us."

I followed that up with a post (well, two really) that set out to define what our expectations of "the cloud" ought to be. The idea behind the Cloud Computing Bill of Rights was not to lay out a policy platform--though I am flattered that some would like to use it as the basis of one--but rather to set out some guidelines about what cloud computing customers should anticipate in their architectures. In this continuing "COA principles" series, I intend to lay out what can be done to leverage what vendors deliver, and design around what they fail to deliver.

With that basic framework laid out, the next step is to break down what technology elements need to be considered when engineering for the cloud. This post will cover only the list of some such elements as I understand them today (and feel free to use the comments below to add your own insights), and future posts will provide a more thorough analysis of individual elements and/or related groups of elements. The series is really very "stream of consciousness", so don't expect too much structure or continuity.

When considering what elements matter in a Cloud Oriented Architecture, we consider first that we are talking about distributed systems. Simply utilizing Salesforce.com to do your Customer Relationship Management doesn't require an architecture; integrating it with your SAP billing systems does. As your SAP systems most likely don't run in Salesforce.com data centers, the latter is a distributed systems problem.

Most distributed systems problems have just a few basic elements. For example:

Distribution of responsibilities among component parts
Dependency management between those component parts
Scalability and reliability
- Of the system as a whole
- Of each component
Data Access and Management
Communication and Networking
Monitoring and Systems Management

However, because cloud computing involves leveraging services and systems entirely outside of the architect's control, several additional issues must be considered. Again, for example:

How are the responsibilities of a complex distributed system best managed when the services being consumed are relatively fixed in the tasks they can perform?
How are the cloud customer's own SLA commitments best addressed when the ability to monitor and manage components of the system may be below the standards required for the task?
How are the economics of the cloud best leveraged?
- How can a company gain the most work for the least amount of money?
- How can a company leverage the cloud marketplace for not just cost savings, but also increased availability and system performance?
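
As a rough illustration of the "most work for the least amount of money" question, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the vendor names, prices, throughput figures and availability numbers are invented, and `Offer` and `cheapest_offer` are hypothetical helpers, not part of any vendor's API.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Offer:
    """One hypothetical cloud offer; all figures are made up for illustration."""
    vendor: str
    price_per_hour: float   # dollars per instance-hour
    jobs_per_hour: float    # throughput you measured on one instance
    availability: float     # advertised availability, between 0.0 and 1.0

    @property
    def cost_per_job(self) -> float:
        # Dollars spent per unit of work completed.
        return self.price_per_hour / self.jobs_per_hour


def cheapest_offer(offers: Iterable[Offer],
                   min_availability: float = 0.0) -> Optional[Offer]:
    """Return the lowest cost-per-job offer that meets the availability floor."""
    eligible = [o for o in offers if o.availability >= min_availability]
    if not eligible:
        return None
    return min(eligible, key=lambda o: o.cost_per_job)


OFFERS = [
    Offer("vendor-a", price_per_hour=0.10, jobs_per_hour=50.0, availability=0.995),
    Offer("vendor-b", price_per_hour=0.40, jobs_per_hour=400.0, availability=0.990),
    Offer("vendor-c", price_per_hour=0.80, jobs_per_hour=500.0, availability=0.999),
]

if __name__ == "__main__":
    for floor in (0.0, 0.995):
        best = cheapest_offer(OFFERS, min_availability=floor)
        print(floor, best.vendor if best else None)
```

The point of the sketch is only that the lowest hourly price is not necessarily the lowest cost per unit of work, and that the answer can change once an availability requirement is added.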

In an attempt to address the more cloud-specific distributed systems architecture issues, I've come up with the following list of elements to be addressed in a typical Cloud Oriented Architecture:

Service Fluidity - How does the system best allow for static redeployment and/or "live motion" of component pieces within and across hardware, facility and network boundaries? Specific issues to consider here include:
- Distributed application architecture, or how is the system designed to manage component dependencies while allowing the system to dynamically find each component as required? (Hint: this problem has been studied thoroughly by such practices as SOA, EDA, etc.)
- Network resiliency, or how does the system respond to changes in network location, including changes in IP addressing, routing and security?
Monitoring - How is the behavior and effectiveness of the system measured and tracked both to meet existing SLAs, as well as to allow developers to improve the overall system in the future? Issues to be considered here include:
- Load monitoring, or how do you measure system load when system components are managed by multiple vendors with little or no formal agreement on how to share such data with the customer or each other?
- Cost monitoring, or how does the customer get an accurate accounting of the costs associated with running the system from their point of view?
Management - How does the customer configure and maintain the overall system based on current and ongoing technical and business requirements? Examples of what needs to be considered here include:
- Cost, or what adjustments can be made to the system capacity or deployment to provide the required amount of service capacity at the lowest cost possible? This includes ways to manage the efficiency of computation, networking and storage.
- Scalability, or how does the system itself allow changes to capacity to meet required workloads? These changes can happen:
  - vertically (e.g. get a bigger box for existing components--physically or virtually)
  - horizontally (e.g. add or remove additional instances of one or more components as required)
  - from a network latency perspective (adjust the ways in which the system accesses the network in order to increase overall system performance)
- Availability, or how does the system react to failure of any one component, or any group of components (e.g. when an entire vendor cloud goes offline)?
Compliance - How does the overall system meet organizational, industry and legislative regulatory requirements--again, despite being made up of components from a variety of vendors who may themselves provide computing in a variety of legal jurisdictions?
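
To make the availability element a little more concrete, here is a minimal sketch, in Python, of one way a consumer might react when an entire vendor cloud goes offline: try each provider in order and fall back to the next on failure. This is an illustration under stated assumptions, not a recommended implementation. `call_with_failover`, `primary_cloud` and `secondary_cloud` are hypothetical names, and the two "clouds" are stand-in functions rather than real vendor calls.

```python
from typing import Callable, Iterable, List, Tuple


class AllProvidersFailed(Exception):
    """Raised when every provider in the list has failed."""


def call_with_failover(
    providers: Iterable[Tuple[str, Callable[[], str]]]
) -> Tuple[str, str]:
    """Try each (name, call) pair in order; return the first successful result.

    Returns a (provider_name, result) tuple so the caller can record which
    vendor actually served the request.
    """
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call()
        except Exception as exc:  # any failure triggers the fallback
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-ins for real vendor calls (hypothetical, for illustration only).
def primary_cloud() -> str:
    raise ConnectionError("vendor cloud offline")


def secondary_cloud() -> str:
    return "ok"


if __name__ == "__main__":
    served_by, result = call_with_failover(
        [("primary", primary_cloud), ("secondary", secondary_cloud)]
    )
    print(served_by, result)
```

A real system would also need timeouts, retry limits and some way to keep data consistent between vendors, all of which this sketch deliberately leaves out.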

Now comes the fun of breaking these down a bit, and talking about specific technologies and practices that can address them. Please, give me your feedback (or write up your criticism on your own blog, but link here so I can find you). Point me towards references to other ways to think about the problem. I look forward to the conversation.

Tuesday, August 26, 2008

Off Topic: Love DCK's New Look!!!

Congratulations to Rich Miller and the team at Data Center Knowledge for their brilliant new look. I hate to say it, but the old design was clunky and behaved in strange ways for me. So far, the new look is easier on the eyes, better organized and has a few more features. Oh, sure, it also has many more ads, but a guy's got to pay for a site redesign somehow! :-)

Well done, DCK, now get back to reporting the data center market!

Saturday, August 23, 2008

Update: The Cloud Computing Bill of Rights

Thanks to all that provided input on the first revision of the Cloud Computing Bill of Rights. The feedback has been incredible, including several eye opening references, and some basic concepts that were missed the first time through. An updated "CCBOR" is below, but I first want to directly outline the changes, and credit those that provided input.

Abhishek Kumar points out that government interference in data privacy and security rights needs to be explicitly acknowledged. I hear him loud and clear, though I think the customer can expect only that laws will remain within the constitutional (or doctrinal) bounds of their particular government, and that government retains the right to create law as it deems necessary within those parameters.
What must also be acknowledged, however, is that customers have the right to know exactly what laws are in force for the cloud systems they choose to use. Does this mean that vendors should hire civil rights lawyers, or that the customer is on their own to figure that out? I honestly don't know.
Peter Laird's "The Good, Bad, and the Ugly of SaaS Terms of Service, Licenses, and Contracts" is a must read when it comes to data rights. It finds for enterprises what was observed by NPR the other night for individuals; that you have very few data privacy rights right now, that your provider probably has explicit provisions protecting them and exposing you or your organization, and the cloud exposes risks that enterprises avoid by owning their own clouds.
This reinforces the notion that we must understand that privacy is not guaranteed in the cloud, no matter what your provider says. As Laird puts it:
"...[A] customer should have an explicit and absolute right to data ownership regardless of how a contract is terminated."
Ian Osbourne asks "should there be a right to know where the data will be stored, and potentially a service level requirement to limit host countries?" I say absolutely! It will be impossible for customers to obey laws globally unless data is maintained in known jurisdictions. This was the catalyst for the "Follow the Law Computing" post. Good catch!
John Marsh of GeekPAC links to his own emerging attempt at a Bill of Rights. In it, he points out a critical concept that I missed:
"[Vendors] may not terminate [customer] account[s] for political statements, inappropriate language, statements of sexual nature, religious commentary, or statements critical of [the vendor's] service, with exceptions for specific laws, eg. hate speech, where they apply."
Bravo, and noted.
Unfortunately, the federal courts have handed down a series of rulings that challenge the ability of global citizens and businesses to do business securely and privately in the cloud. This Bill of Rights is already under grave attack.

Below is the complete text of the second revision of the Cloud Computing Bill of Rights. Let's call the first "CCBOR 0.1" and this one "CCBOR 0.2". I'll update the original post to reflect the versioning.

One last note. My intention in presenting this post was not to become the authority on cloud computing consumer rights. It is, rather, the cornerstone of my Cloud Computing Architecture discussion, in which I need to move on to the next point. I'm working on setting up a WIKI for this "document". Is there anyone out there in particular that would like to host it?

The Cloud Computing Bill of Rights (0.2)

In the course of technical history, there exist few critical innovations that forever change the way technical economies operate; forever changing the expectations that customers and vendors have of each other, and the architectures on which both rely for commerce. We, the parties entering into a new era driven by one such innovation--that of network based services, platforms and applications, known at the writing of this document as "cloud computing"--do hereby avow the following (mostly) inalienable rights:

Article I: Customers Own Their Data

No vendor shall, in the course of its relationship with any customer, claim ownership of any data uploaded, created, generated, modified, hosted or in any other way associated with the customer's intellectual property, engineering effort or media creativity. This also includes account configuration data, customer generated tags and categories, usage and traffic metrics, and any other form of analytics or meta data collection.
Customer data is understood to include all data directly maintained by the customer, as well as that of the customer's own customers. It is also understood to include all source code and data related to configuring and operating software directly developed by the customer, except for data expressly owned by the underlying infrastructure or platform provided by the vendor.
Vendors shall always provide, at a minimum, API level access to all customer data as described above. This API level access will allow the customer to write software which, when executed against the API, allows access to any customer maintained data, either in bulk or record-by-record as needed. As standards and protocols are defined that allow for bulk or real-time movement of data between cloud vendors, each vendor will endeavor to implement such technologies, and will not attempt to stall such implementation in an attempt to lock in its customers.
Customers own their data, which in turn means they own responsibility for the data's security and adherence to privacy laws and agreements. As with monitoring and data access APIs, vendors will endeavor to provide customers with the tools and services they need to meet their own customers' expectations. However, customers are responsible for determining a vendor's relevancy to specific requirements, and to provide backstops, auditing and even indemnification as required by agreements with their own customers.
Ultimately, however, governments are responsible for the regulatory environments that define the limits of security and privacy laws. As governments can choose any legal requirement that works within the constraints of their own constitutions or doctrines, customers must be aware of what may or may not happen to their data in the jurisdictions in which data resides, is processed or is referenced. As constitutions vary from country to country, it may not even be required for governments to inform customers what specific actions are taken with or against their data. That laws exist that could put their data in jeopardy, however, is the minimum that governments convey to the market.
Customers (and their customers) must leverage the legislative mechanisms of any jurisdiction of concern to change those parameters.
In order for enough trust to be built into the online cloud economy, however, governments should endeavor to build a legal framework that respects corporate and individual privacy, and overall data security. While national security is important, governments must be careful not to create an atmosphere in which the customers and vendors of the cloud distrust their ability to securely conduct business within the jurisdiction, either directly or indirectly.
Because regulatory effects weigh so heavily on data usage, security and privacy, vendors shall, at a minimum, inform customers specifically where their data is housed. A better option would be to provide mechanisms by which users can choose where their data will be stored. Either way, vendors should also endeavor to work with customers to assure that their systems designs do not conflict with known legal or regulatory obstacles. This is assumed to apply to primary, backup and archived data instances.

Article II: Vendors and Customers Jointly Own System Service Levels

Vendors own, and shall do everything in their power to meet, service level targets committed to with any given customer. All required effort and expense necessary to meet those explicit service levels will be spent freely and without additional expense to the customer. While the specific legally binding contracts or business agreements will spell out these requirements, it is noted here that these service level agreements are entered into expressly to protect both the customer's and vendor's business interests, and all decisions by the vendor will take both parties equally into account.
Where no explicit service level agreement exists with a customer, the vendor will endeavor to meet any expressed service level targets provided in marketing literature or the like. At no time will it be acceptable for a vendor to declare a high level of service at a base price, only to later indicate that that level of service is only available at a higher premium price.
It is perfectly acceptable, however, for a vendor to expressly sell a higher level of service at a higher price, as long as they make that clear at all points where a customer may evaluate or purchase the service.
Ultimately, though, customers own their service level commitments to their own internal or external customers, and the customer understands that it is their responsibility to take into account possible failures by each vendor that they do business with.
Customers relying on a single vendor to meet their own service level commitments enter into an implicit agreement to tie their own service level commitments to the vendor's, and to live and die by the vendor's own infrastructure reliability. Those customers who take their own commitments seriously will seek to build or obtain independent monitoring, failure recovery and disaster recovery systems.
Where customer/vendor system integration is necessary, the vendor must offer options for monitoring the viability of that integration at as many architectural levels as required to allow the customer to meet their own service level commitments. Where standards exist for such monitoring, the vendor will implement those standards in a timely and complete fashion. The vendor should not underestimate the importance of this monitoring to the customer's own business commitments.
Under no circumstances will vendors terminate customer accounts for political statements, inappropriate language, statements of sexual nature, religious commentary, or statements critical of the vendor's service, with exceptions for specific laws, e.g. hate speech, where they apply.

Article III: Vendors Own Their Interfaces

Vendors are under no obligation to provide "open" or "standard" interfaces, other than as described above for data access and monitoring. APIs for modifying user experience, frameworks for building extensions or even complete applications for the vendor platform, or such technologies can be developed however the vendor sees fit. If a vendor chooses to require developers to write applications in a custom programming language with esoteric data storage algorithms and heavily piracy protected execution systems, so be it.
If it seems that this completely abdicates the customer's power in the business relationship, this is not so. As the "cloud" is a marketplace of technology infrastructures, platforms and applications, the customer exercises their power by choosing where to spend their hard earned money. A decision to select a platform vendor that locks you into proprietary Python libraries, for instance, is a choice to support such programming lock-in. On the other hand, insistence on portable virtual machine formats will drive the market towards a true commodity compute capacity model.
The key reason for giving vendors such power is to maximize innovation. By restricting how technology gets developed or released, the market risks restricting the ways in which technologists can innovate. History shows that eventually the "open" market catches up to most innovations (or bypasses them altogether), and the pace at which this happens is greatly accelerated by open source. Nonetheless, forcing innovation through open source or any other single method runs the risk of weakening capitalist entrepreneurial risk taking.
The customer, however, has the right to use any method legally possible to extend, replicate, leverage or better any given vendor technology. If a vendor provides a proprietary API for virtual machine management in their cloud, customers (aka "the community", in this case) have every right to experiment with "home grown" implementations of alternative technologies using that same API. This is also true for replicating cloud platform functionality, or even complete applications--though, again, the right only extends to legal means.
Possibly the best thing a cloud vendor can do to extend their community, and encourage innovation on their platform from community members, is to open their platform as much as possible. By making themselves the "reference platform" for their respective market space, an open vendor creates a petri dish of sorts for cultivating differentiating features and successes on their platform. Protective proprietary vendors are on their own.

These three articles serve as the baseline for customer, vendor and, as necessary, government relationships in the new network-based computing marketplace. No claim is made that this document is complete, or final. These articles may be changed or extended at any time, and additional articles can be declared, whether in response to new technologies or business models, or simply to reflect the business reality of the marketplace. It is also a community document, and others are encouraged to bend and shape it in their own venues.

Friday, August 22, 2008

Why Every Linux Application Known To Man Will Be SaaS Soon

With its big (and well covered) announcement yesterday that it will now provide fixed storage for EC2 instances, Amazon has opened the door for just about anybody with Linux-friendly applications to build a cloud-oriented business. Amazon's Elastic Block Store (EBS) isn't revolutionary in itself--lots of cloud vendors have had block storage for some time (see GoGrid, for instance). However, because it's Amazon, and because they understand online commerce better than most, it just became incredibly simple for an application to be offered in the cloud.

So sit back and enjoy the flood of press releases that are sure to appear over the next 3-6 months.

The core reasons for this go beyond just EBS. A combination of one or more Amazon Machine Images (AMIs), DevPay, EBS, S3 and Premium Support creates a "stack" every bit as important to the cloud as LAMP is to web applications. Create a standard image, deploy it in one or more standard EC2 instances, back it up to S3 and charge for it using DevPay. Support your customer SLAs in partnership through your Premium Services account, and you are every bit as solid a SaaS player as anyone else. Oh, and you likely did it without multitenancy.

Unfortunately, it's a wholly proprietary stack (built largely on open source), but it's a stack nonetheless.

Why would any small or medium sized business with software they want to provide by subscription over the web choose any other option (especially given the success rate of start ups on EC2, as reported by Amazon in a recent post)?

There is, of course, another audience that should care about EBS and its effect on the market place: data center operators.

More resources on EBS:

Wednesday, August 20, 2008

Was "Cloud Computing" Common Before Dell Made A Grab For It?

Cameron Christoffers sympathizes with Dell regarding their argument that the term "cloud computing" was not in wide use when they made their trademark application in March, 2007. That may be so; my own first use of the word "cloud" was in August of that year, but I had referenced an article in ZDNet from August, 2006, that directly quoted Eric Schmidt of Google:

"It starts with the premise that the data services and architecture should be on servers. We call it cloud computing – they should be in a ‘cloud’ somewhere. And that if you have the right kind of browser or the right kind of access, it doesn’t matter whether you have a PC or a Mac or a mobile phone or a BlackBerry or what have you – or new devices still to be developed – you can get access to the cloud…"

Clearly this is evidence that Google used the term widely within their ranks and with their customers, and that a solid argument could be made that the term was in "wide use" among the core techies that cared about the technology elsewhere. The rest of the market didn't care, but it was certainly not a term or slogan that was uniquely invented by Dell.

Still. No wonder the USPTO missed the "commonality" of the term initially.

Saturday, August 16, 2008

The Cloud Computing Bill of Rights

Update: Title and version number added before Cloud Computing Bill of Rights text below.

Before you architect your application systems for the cloud, you have to set some ground rules on what to expect from the cloud vendors you either directly or indirectly leverage. It is important that you walk into these relationships with certain expectations, in both the short and long term, and both those that protect you and those that protect the vendor.

This post is an attempt to capture many of the core rights that both customers and vendors of the cloud should come to expect, with the goal of setting that baseline for future Cloud Oriented Architecture discussions.

This is but a first pass, presented to the community for feedback, discussion, argument and--if deserved--derision. Your comments below will be greatly appreciated in any case.

The Cloud Computing Bill of Rights (0.1)

In the course of technical history, there exist few critical innovations that forever change the way technical economies operate; forever changing the expectations that customers and vendors have of each other, and the architectures on which both rely for commerce. We, the parties entering into a new era driven by one such innovation--that of network based services, platforms and applications, known at the writing of this document as "cloud computing"--do hereby avow the following (mostly) inalienable rights:

Article I: Customers Own Their Data

No vendor shall, in the course of its relationship with any customer, claim ownership of any data uploaded, created, generated, modified, hosted or in any other way associated with the customer's intellectual property, engineering effort or media creativity. This also includes account configuration data, customer generated tags and categories, usage and traffic metrics, and any other form of analytics or meta data collection.
Customer data is understood to include all data directly maintained by the customer, as well as that of the customer's own customers. It is also understood to include all source code and data related to configuring and operating software directly developed by the customer, except for data expressly owned by the underlying infrastructure or platform provided by the vendor.
Vendors shall always provide, at a minimum, API level access to all customer data as described above. This API level access will allow the customer to write software which, when executed against the API, allows access to any customer maintained data, either in bulk or record-by-record as needed. As standards and protocols are defined that allow for bulk or real-time movement of data between cloud vendors, each vendor will endeavor to implement such technologies, and will not attempt to stall such implementation in an attempt to lock in its customers.

Article II: Vendors and Customers Jointly Own System Service Levels

Vendors own, and shall do everything in their power to meet, service level targets committed to with any given customer. All required effort and expense necessary to meet those explicit service levels will be spent freely and without additional expense to the customer. While the specific legally binding contracts or business agreements will spell out these requirements, it is noted here that these service level agreements are entered into expressly to protect the customer's business interests, and all decisions by the vendor will take this into account.
Where no explicit service level agreement exists with a customer, the vendor will endeavor to meet any expressed service level targets provided in marketing literature or the like. At no time will it be acceptable for a vendor to declare a high level of service at a base price, only to later indicate that that level of service is only available at a higher premium price.
It is perfectly acceptable, however, for a vendor to expressly sell a higher level of service at a higher price, as long as they make that clear at all points where a customer may evaluate or purchase the service.
Ultimately, though, customers own their service level commitments to their own internal or external customers, and the customer understands that it is their responsibility to take into account possible failures by each vendor that they do business with.
Customers relying on a single vendor to meet their own service level commitments enter into an implicit agreement to tie their own service level commitments to the vendor's, and to live and die by the vendor's own infrastructure reliability. Those customers who take their own commitments seriously will seek to build or obtain independent monitoring, failure recovery and disaster recovery systems.
Where customer/vendor system integration is necessary, the vendor must offer options for monitoring the viability of that integration at as many architectural levels as required to allow the customer to meet their own service level commitments. Where standards exist for such monitoring, the vendor will implement those standards in a timely and complete fashion. The vendor should not underestimate the importance of this monitoring to the customer's own business commitments.

Article III: Vendors Own Their Interfaces

Vendors are under no obligation to provide "open" or "standard" interfaces, other than as described above for data access and monitoring. APIs for modifying user experience, frameworks for building extensions or even complete applications for the vendor platform, or such technologies can be developed however the vendor sees fit. If a vendor chooses to require developers to write applications in a custom programming language with esoteric data storage algorithms and heavily piracy protected execution systems, so be it.
If it seems that this completely abdicates the customer's power in the business relationship, this is not so. As the "cloud" is a marketplace of technology infrastructures, platforms and applications, the customer exercises their power by choosing where to spend their hard earned money. A decision to select a platform vendor that locks you into proprietary Python libraries, for instance, is a choice to support such programming lock-in. On the other hand, insistence on portable virtual machine formats will drive the market towards a true commodity compute capacity model.
The key reason for giving vendors such power is to maximize innovation. By restricting how technology gets developed or released, the market risks restricting the ways in which technologists can innovate. History shows that eventually the "open" market catches up to most innovations (or bypasses them altogether), and the pace at which this happens is greatly accelerated by open source. Nonetheless, forcing innovation through open source or any other single method runs the risk of weakening capitalist entrepreneurial risk taking.
The customer, however, has the right to use any method legally possible to extend, replicate, leverage or better any given vendor technology. If a vendor provides a proprietary API for virtual machine management in their cloud, customers (aka "the community", in this case) have every right to experiment with "home grown" implementations of alternative technologies using that same API. This is also true for replicating cloud platform functionality, or even complete applications--though, again, the right only extends to legal means.
Possibly the best thing a cloud vendor can do to extend their community, and encourage innovation on their platform from community members, is to open their platform as much as possible. By making themselves the "reference platform" for their respective market space, an open vendor creates a petri dish of sorts for cultivating differentiating features and successes on their platform. Protective proprietary vendors are on their own.

These three articles serve as the baseline for customer/vendor relationships in the new network-based computing marketplace. No claim is made that this document is complete, or final. These articles may be changed or extended at any time, and additional articles can be declared, whether in response to new technologies or business models, or simply to reflect the business reality of the marketplace. It is also a community document, and others are encouraged to bend and shape it in their own venues.

Comments, complaints or questions can be directed to the author through the comments section below.

Friday, August 15, 2008

"Cloud Computing" Set Free: Dell Application Denied

Sam Johnston--who gets all the credit for breaking the Dell "Cloud Computing" trademark application story, in my opinion--is reporting that the USPTO has sent a "non-final action" to Dell, according to their web site. This is great news. Perhaps the safeguards built into these application processes aren't as bad as some suspected as this story unfolded.

By the way, I may not agree with Sam regarding the pure definition of cloud computing, but he continues to impress as a blogger. Between Sam, Markus Klems, Kevin Jackson and the Google Groups: Cloud Computing list--not to mention my Google Alerts on "cloud computing"--I find myself constantly reading "dang, I wish I wrote that" posts from new voices these days. These are interesting times, indeed.

Thursday, August 14, 2008

Are We Overselling the Cloud to Ourselves?

I was doing some casual reading tonight (which is all I have time to do lately, it seems), when I came across this post from Thomas Wailgum of CIO.com on InfoWorld. (Ain't syndication grand?) The majority of the post is commentary from Gartner about the relative infancy of SaaS ERP solutions compared to their on-premises brethren. Interesting in and of itself, but not normally worthy of a post here.

However, on the second page, I came across the following quote:

Other inhibitors to more widespread SaaS ERP adoption, Ganly contends, relate to total cost of ownership (TCO). TCO of "SaaS ERP suites likely will be significant and may not compare favorably with on-premises solutions," she adds. This problem applies to vendors as well. SaaS vendors "often have unrealistic expectations of their operating costs," she writes. "The multitenant architecture needed for SaaS ERP suites results in high internal efforts and costs for the initial setup and the ongoing maintenance and upgrade of the system."
Security has also been an issue with SaaS ERP offerings, "especially with regard to financial data and privacy concerns," Ganly writes. "Vendors must prove to organizations that are considering SaaS ERP adoption that their security and privacy concerns are unfounded through super low-cost or no-cost, proof-of-concept trials, encouraging early adoption through value pricing and getting early adopters to share their success stories."
[Emphasis mine.]

It occurs to me that this is a really good point to consider when looking at the economics of the cloud computing market. For SaaS vendors, cost-of-sales is still high, as the sale is (and always will be) a hybrid of the traditional enterprise sales model: high investment in building customer relationships, proving technical and business feasibility, and navigating corporate politics, though likely with fewer of the "big meeting" costs found in traditional relationship sales.

Thus, the "economies of scale" from data center operations may be vastly overshadowed by cost of acquiring customers.
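To make that arithmetic concrete, here is a back-of-the-envelope sketch. Every number in it is a made-up placeholder, not data from Gartner, Amazon or anyone else, and the function name is my own invention. It simply spreads a one-time cost of sale over an assumed customer lifetime and compares what is left.

```python
# All figures below are hypothetical placeholders, for illustration only.
def annual_margin_per_customer(revenue, hosting_cost, acquisition_cost, lifetime_years):
    """Yearly margin per customer after spreading the one-time
    acquisition (cost-of-sale) expense over the customer lifetime."""
    amortized_acquisition = acquisition_cost / lifetime_years
    return revenue - hosting_cost - amortized_acquisition


# Hypothetical SaaS ERP vendor: multitenancy halves hosting cost
# (20,000 -> 10,000 per year), but each sale still costs 90,000 to close.
single_tenant = annual_margin_per_customer(50_000, 20_000, 90_000, 3)
multi_tenant = annual_margin_per_customer(50_000, 10_000, 90_000, 3)

# Hypothetical self-service infrastructure vendor: small accounts,
# but almost nothing spent per sale.
self_service = annual_margin_per_customer(5_000, 3_000, 300, 3)

print(single_tenant, multi_tenant, self_service)
```

With these invented inputs, the 10,000 per year saved through multitenancy is a third of the 30,000 per year that the sale itself costs once amortized. Change the assumptions and the picture changes, which is rather the point: the acquisition term deserves as much scrutiny as the data center term.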

However, a "pure infrastructure" play (such as poster child Amazon) eliminates most of the cost-of-sales if it can prove a low barrier to entry and significant flexibility of use. Most customers discover Amazon, get set up for free, then pay nominal charges to figure out for themselves how to use the platform. There is no real data lock-in, as the storage services are essentially device storage (as opposed to specific data schemas), so the cost of choosing not to move forward with a pure IaaS vendor is relatively low.

There are few CxO level relationships between AWS and their customers (though I don't doubt there are several with, say, financial services megaliths with deep pockets and an interest in influencing AWS).

The point is, when most technologists think of the cloud, they think of something like Amazon, not something like DemandERP. But much of the value of the cloud comes from getting the resources you need in (usually) an on demand model. If price and experience can't be both superior to on-premises ERP for the customer, and profitable for the vendor...well, as the kids say these days, "fail".

I worry that many of the boutique IaaS vendors are also going to fall into the same trap--not understanding how the cost of acquiring customers to a specialized platform or service will wipe out the economies of scale savings of multitenancy. There will be a lot of churn out there in the coming years, and a lot of wispy corpses floating in the clouds. Caveat emptor.

Oh, and to the point of the efficiency of Amazon's model: Jeff Barr notes that he can't find a failed startup that used EC2/S3 as its core infrastructure:

"One of the major value propositions of Amazon Web Services is the utility pricing plan. That is, you only pay for what you use, and the cost is very low. Sometimes it feels like I am just saying that: not because there is any doubt that it’s true; rather because it’s difficult to produce metrics to back up assertions that “low cost utility pricing” is truly a game changer.
Then it hit me… Looking at the list of Start-Up Project presentations on Slideshare’s site, I realized that not a single one of these companies is “off the air”; that is, they all are still in business. In the Startup world that is nothing short of amazing—especially in this economy. (Some of the decks on Slideshare's site are not from last year’s startup events; however even those other companies appear to be alive and well.) Amazon can’t take all the credit for this track record; however it does seem to be a solid data point that validates the value proposition."

That is amazing, if it holds true.

Monday, August 04, 2008

Is a Grid a Cloud? Probably not, but...

Sam Johnston has recently been writing some very provocative posts (provocative as in "thought producing" as well as, at times, "controversial"). One of his latest is his missive on cloud computing, and the confusion created by vendors pushing their grid platforms as defining cloud computing.

He has some good points, and I recommend reading the post. However, very early on he makes a statement that I think clearly demonstrates his own flawed logic when it comes to the term "private cloud". In the first paragraph, he says:

"Some of this confusion is understandable given issues get complex quickly when you start peeling off the layers, however much of it comes from the very same opportunistic hardware & software vendors who somehow convinced us years ago that clusters had become grids. These same people are now trying to convince us that grids have become clouds in order to sell us their version of a 'private cloud' (which is apparently any large, intelligent and/or reliable cluster)."

[Emphasis mine.]

There's the root problem, right there. By equating a "private cloud" with "any large, intelligent and/or reliable cluster", he misses much of what the private cloud is--and biases his definition from the point of view of traditional job based grid computing (which does act very much like a cluster).

Let's use my alma mater as an example of a private cloud infrastructure vendor that does not sell a clustering platform--at least not in the traditional sense of the word, as it relates to software. Cassatt does not tie a bunch of servers into a single, interconnected unit for a workload run on top of it. In fact, that remains the job of the software platform deployed into Cassatt, if it is indeed desired. There is no software coordination intelligence in Cassatt today (other than some dependency management to control startup and shutdown).

Cassatt works purely at the server and OS level. No, it doesn't create an OS cluster, because the OS isn't aware that it is being managed. All that Cassatt does is pool server resources into a general pool that can be assigned as needed to meet capacity (and reliability) demands as defined by the service levels applied to the software payloads. If Cassatt sees that application A needs more capacity, it grabs another server. If an instance of server B goes down, Cassatt creates a new instance with the same IP address and hostname (if safe) as the original.

Cassatt is not job based. Any running server payload, including web applications, enterprise applications or "always on" monitoring and feed reading processes can be hosted in exactly the same manner as batch jobs. Cassatt doesn't do queueing of jobs, it just provisions servers as needed to meet the service levels defined for business workloads.
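For those who think better in code, here is a toy sketch of the resource pool idea described above. To be clear, this is not Cassatt's API or implementation. The `Server` and `ResourcePool` classes and their methods are hypothetical names I made up, and the "if safe" check before reusing an IP address is left out entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Server:
    """A machine in the pool; its identity is assigned when it is put to use."""
    name: str
    ip: Optional[str] = None
    hostname: Optional[str] = None
    healthy: bool = True


@dataclass
class ResourcePool:
    """Hypothetical pool: free servers plus servers assigned per application."""
    free: List[Server]
    assigned: Dict[str, List[Server]] = field(default_factory=dict)

    def add_capacity(self, app: str, ip: str, hostname: str) -> Server:
        """Take any free server and hand it to the application."""
        if not self.free:
            raise RuntimeError("pool exhausted")
        server = self.free.pop()
        server.ip, server.hostname = ip, hostname
        self.assigned.setdefault(app, []).append(server)
        return server

    def replace_failed(self, app: str) -> List[Server]:
        """Swap each unhealthy server for a free one that takes over
        the same IP address and hostname."""
        replacements = []
        for failed in [s for s in self.assigned.get(app, []) if not s.healthy]:
            self.assigned[app].remove(failed)
            replacements.append(self.add_capacity(app, failed.ip, failed.hostname))
        return replacements


pool = ResourcePool(free=[Server("s1"), Server("s2"), Server("s3")])
web = pool.add_capacity("app-a", "10.0.0.5", "web-1")  # app A needs capacity
web.healthy = False                                      # the instance goes down
new = pool.replace_failed("app-a")                       # a fresh server takes its identity
```

Notice what is missing: no job queue, no scheduler, and no coordination between the servers themselves. The pool only decides which machine carries which payload, and the payload neither knows nor cares.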

Read Cassatt's web site for more. They say it much better than I am expressing it now.

The point is, though, that Cassatt is not a cluster, it is a resource pool, and as such acts much more like a cloud than a grid. Sam may say "well, that's just autonomic computing" and he's right, but the cloud is autonomic. So calling an autonomic system running behind an enterprise firewall a "private cloud" is not much of a stretch at all.

By the way, ksankar of http://doubleclix.wordpress.com notes nine great differences between a grid and a cloud. I think he captured more of my own thinking about this subject in that one post than I've been able to express in the last three years. Worth a read as well.

Finally, subscribe to Sam's blog. He's asking some important questions, and deserves your attention.

Sunday, August 03, 2008

Is Dell desperate, or just defensive?

Ugh...

How else do you react to the news that Dell is most of the way down the road towards trademarking the term "cloud computing"?

The only question I have is "why?". What do they gain from this assault on one of the most explosive marketplaces to find its way into technology since the Internet itself? I see two options (though there are probably more--let me know what you think):

Dell thought at the time of the application they could create technology and a brand around "cloud computing", and they would own the mindshare around the term. They even applied for (and got) the cloudcomputing.com domain. Of course, this aspiration is naive at best, and if this is the case, at this point Dell should kill the application, build a kick-ass site for Dell's vision of cloud computing and call it a day.
They were simply trying to protect the cloudcomputing.com domain by blocking others from getting cloudcomputing.net, cloudcomputing.info, etc. If this is the case, the trademark application is too harsh, and they should use other legal means to protect the domain.

Whatever the reason, kill the application, Dell. Spare yourself becoming the SCO of network computing.

Update: I note that Dell has even displayed the trademark on the term "Cloud Computing Solutions" on their web site, as can be seen in the image below:

Update: Dell is apparently suggesting that the second reason I stated above (or a variation) is why they filed for the trademark. Kill the application, Dell, or make a public pledge that is stronger than "It is not our intention...".

Saturday, August 02, 2008

The Principles of a Cloud Oriented Architecture

The market is hot. The technologies are appearing fast and furious. The tools you need are out there, but they are young, often untested, and always deliver unpredictable reliability. You've researched the economics, and you know now that cloud computing is a) here to stay, and b) offers economic advantages that--if realized--could stretch your IT budget and quite possibly catapult your career.

Now what?

What is often overlooked in the gleeful rush to cloud computing is the difficulty in molding the early technologies in the space into a truly bulletproof (or even bullet-resistant) business infrastructure. You see it all over the Internet: the push and pull between innovation and reliability, the concerns about security, monitoring and control, even the constant confusion over what cloud computing entails, what technologies to select for a given problem, and how to create an enterprise-class business system out of those technologies.

The truth is, cloud computing doesn't launch our technical architectures into the future. It is, at its heart, an economic model that drives the parameters around how you acquire, pay for and scale the infrastructure architectures you already know. It's not a question of changing the problems you must solve when utilizing data centers, just a change to the division of responsibilities amongst yourself, your organization, your cloud providers and the Internet itself.

To this end, I offer you a series of posts (perhaps moving to a WIKI in the near future) describing in depth my research into what it takes to deliver a systems architecture with the following traits:

It partially or entirely incorporates the clouds for at least one layer of the Infrastructure/Platform/Application stack.
Is focused on consumers of cloud technologies, not the requirements of those delivering cloud infrastructures, either public or private (or even dark).
Takes into account a variety of technical, economic and even political factors that systems running in the "cloud" must take into account.
Is focused at least as much on the operational aspects of the system as the design and development aspects

The idea here is not to introduce an entirely new paradigm--that's the last thing we need given the complexity of the task ahead of us. Nor is it to replace the basic principles of SOA or any other software architecture. Rather, the focus of this series is on how to best prepare for the new set of requirements before us.

Think about it. We already deal (or try to deal) with a world in which we don't entirely have control over every aspect of the world our applications live in. If we are software developers, we rely on others to build our servers, configure our networks, provide us storage and weld them all together into a cohesive unit. System administrators are, in large enterprises anyway, specializing in OS/application stacks, networking, storage or system management. (Increasingly you can add facilities and traditional utilities to this list.)

Even when we outsource to others--shifting responsibility for management of parts or all of our IT infrastructure to a vendor--the vendor doesn't have control over significant elements of the end-to-end operations of our applications; namely, the Internet itself. But with outsourcing, we typically turn over entire, intact architecture stacks, with a few, very well bounded integration points to manage (if any) between outsourced systems and locally maintained systems.

The cloud is going to mess this up. I say this not just because the business relationship is different from outsourcing, but also because what you are "turning over" can be a *part* of a system stack. Smugmug outsources storage and job processing, but not the web experience that relies on both. Applications that run entirely on EC2/S3 outsource the entire infrastructure, but not the application development, or even the application system management. (This is why RightScale, Hyperic and others are finding some traction with AWS customers.)

To prepare for a cloud oriented architecture, one must understand what responsibilities lie where. So, I'll give you a teaser of what is to come with the short-short version of where I see these responsibilities lying (subject to change as I talk to others, including yourselves if you choose to comment on this post):

The enterprise has responsibility for the following:

Defining the business problem to be solved, the use cases that define the solution, and the functional requirements to deliver those use cases
Evaluating the selection of technical and economic approaches for delivering those functional requirements, and selecting the best combination of the two. (In other words, the best combination may not contain either the best technical or best economic selection, but will outweigh any other combination of the two.)
Owning the service level agreements with the business for the delivery of those use cases. This is critically important. More on this below.

The cloud provider has responsibility for the following:

Delivering what they promised you (or the market) that they would deliver. No more, no less.
Providing you with transparent and honest billing and support services.

The Internet itself is only responsible for providing you with an open, survivable infrastructure for interconnecting the networks you need to run your applications and/or services. There are no promises here about reliability or scalability or even availability. It should be considered a technical wilderness, and treated accordingly.

Now, about SLAs. Your cloud provider does not own your SLAs, you do. They may provide some SLAs that support your own, but they are not to be blamed if you fail to achieve the SLAs demanded of you. If your applications or services fail because the cloud failed, you failed. Given that, don't "outsource" your SLAs, at least not logically. Own them.

In fact, I would argue that the single most important function of a cloud-centric IT shop, after getting required business functionality up and running in the first place, is monitoring and actively managing that functionality; switching vendors, if necessary, to continue service at required levels. The one big piece of IT-specific software that should always run in IT data centers, in my opinion, is the NOC infrastructure. (Although, perhaps in this context it's more of a Cloud Operations Center, but I hate the resulting acronym for obvious reasons.)
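As a rough illustration of what "owning" the SLA might look like in that operations center, here is a minimal sketch. It assumes you are already collecting pass/fail health checks per provider. The provider names, the `availability` and `choose_provider` functions and the 99% target are all hypothetical, and real monitoring would of course weigh far more than a single availability number.

```python
from typing import Dict, List


def availability(checks: List[bool]) -> float:
    """Fraction of health checks that succeeded (0.0 if there are none)."""
    return sum(checks) / len(checks) if checks else 0.0


def choose_provider(current: str, observed: Dict[str, List[bool]], sla_target: float) -> str:
    """Stay with the current provider while it meets the SLA you owe the
    business; otherwise move to the best alternative that does meet it.
    If no alternative qualifies, stay put."""
    if availability(observed.get(current, [])) >= sla_target:
        return current
    candidates = {
        name: availability(checks)
        for name, checks in observed.items()
        if name != current and availability(checks) >= sla_target
    }
    if not candidates:
        return current
    return max(candidates, key=candidates.get)


# Hypothetical health-check history for three made-up providers.
observed = {
    "provider-a": [True] * 97 + [False] * 3,
    "provider-b": [True] * 100,
    "provider-c": [True] * 99 + [False],
}
decision = choose_provider("provider-a", observed, sla_target=0.99)
```

The design choice worth noting is where the target lives: `sla_target` is your number, set by what the business demands of you, not whatever figure the provider happens to publish.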

I'll focus more on these responsibilities in future posts. All posts in this series will be tagged "coa principles". Please feel free to provide me feedback in the comments, contact me to review your thoughts on this topic, or simply send me links that you think I should be aware of. I am also working to find other bloggers who wish to take ownership of parts of this primer (cloud security, for example) so let me know if you are interested there as well.

I am excited about this. This body of knowledge (or at least the faint traces of knowledge) has been rattling inside my head for some time, and it feels good to finally be sharing it with you.