Thursday, October 23, 2008

Amazon Enhances "The Proto-Cloud"

Big news today, as you've probably already seen. Amazon has announced a series of steps to greatly enhance the "production" nature of its already leading-edge cloud computing services, including (quoted directly from Jeff Barr's post on the AWS blog):
  • Linux on Amazon EC2 is now in full production. The beta label is gone.
  • There's now an SLA (Service Level Agreement) for EC2.
  • Microsoft Windows is now available in beta form on EC2.
  • Microsoft SQL Server is now available in beta form on EC2.
  • We plan to release an interactive AWS management console.
  • We plan to release new load balancing, automatic scaling, and cloud monitoring services.
There is some great coverage of the announcement already in the blog-o-sphere, so I won't repeat the basics here. Suffice it to say:
  • Removing the beta label removes a barrier to S3/EC2 adoption for the most conservative of organizations.
  • The SLA is interestingly organized to allow for pockets of outages while still promoting global up-time. Make no mistake, though: some automation is required to make sure your systems find the working Amazon infrastructure when specific Availability Zones fail.
  • Oh, wait, they took care of that as well...along with automatic scaling and load balancing.
  • Microsoft is quickly becoming a first-class player in AWS, which removes yet another barrier for M$FT-happy organizations.
Instead, let me focus in this post on how all of this enhances Amazon's status as the "reference platform" for infrastructure as a service (IaaS). In another post, I want to express my concern that Amazon is in danger of becoming the "Wal-Mart" of cloud computing.

First, why is it that Amazon is leading the way so aggressively in terms of feature sets and service offerings for cloud computing? Why does every other cloud provider seem to be catching up to the services offered by Amazon at any given time? For example:
The answer in all cases is that Amazon has become the default standard for IaaS feature definition--this despite having no user interface of its own (besides command line and REST), and using "special" Linux images (the core Amazon Machine Images) that don't provide root access, etc. The reason for its success in setting the standard is simple: from the beginning, Amazon has prioritized feature delivery based on barriers to adoption of AWS, rather than on building the very best of any given feature.

Here's how I see it:
  • In the beginning, there was storage and network access. Enter S3.
  • Then there were virtual servers to do computational tasks. Enter EC2, but with only one server size.
  • Then there were significant complaints that the server size wasn't big enough to handle real-world tasks. Enter additional server types (e.g. "Large") and associated pricing.
  • Then there was the need for "queryable" data storage. Enter SimpleDB.
  • Somewhere in the preceding time frame, the need for messaging services was identified as a barrier. Enter Amazon Simple Queue Service.
  • Now people were beginning to do serious tasks with EC2/S3/etc., so the geographic placement of data and workloads became more of a concern. (This placement was both for geographic failover and to address regulatory concerns.) Enter Availability Zones.
  • Soon after that, delivering content and data between the zones became a serious concern (especially with all of the web startups leveraging EC2/S3/etc.). Enter the announced AWS Content Delivery Service.
  • Throw in there various partnership announcements, including support for MySQL and Oracle.
By this point, hundreds of companies had "production" applications or jobs running on Amazon infrastructure, and it became time to decide how serious this was. In my not-so-humble opinion, the floundering economy, its effects on the Amazon retail business, and the predictions that cloud computing could benefit from a weakened economy fed into the decision that it's time to remove the training wheels and leave "beta" status for good. Add an official SLA, remove the "beta" label, and "BAM!", you suddenly have a new "production" business to offset the retail side of the house.

Given that everyone else was playing catch-up to these features as they came out (mostly because competitors didn't realize what they needed to do next, as they didn't have the customer base to draw from), it is not surprising that Amazon now looks like it is miles ahead of any competitor when it comes to number of customers and (for cloud computing services) probably revenue.

How do you keep the competitors playing catch-up? Add more features. How do you select which features to address next? Check with the customer base to see what their biggest concerns are. This time, the low-hanging fruit was the management interface, monitoring, and automation. Oh, and that little Windows platform-thingy.

Now, I find it curious that they've pre-announced the monitoring and management stuff today. Amazon isn't really in the habit of announcing a feature before it goes to private beta. However, I think there is some concern that they were becoming the "command-line lover's cloud", and had to show some interest in competing with the likes of VirtualCenter in the mind's eye of system administrators. So, to undercut some perceived competitive advantages from folks like GoGrid and Slicehost, they tell their prospects and customers "just give us a second here and we will do you right".

I think the AWS team has been brilliant, both in terms of marketing and in terms of technology planning and development. They remain the dominant team, in my opinion, though there are certainly plenty of viable alternatives out there that you should not be shy of using, both in conjunction with and in place of Amazon. Jeff Barr, Werner Vogels and others have proven that a business model that so many other IT organizations failed at miserably could be done extremely well. I just hope they don't get too far ahead of themselves...as I'll discuss separately.