Showing posts with label cloud lock-in. Show all posts

Wednesday, July 16, 2008

Watch out for Cisco, kids!

What is the most important enabler of distributed computing architectures, such as cloud oriented architectures? What is the one thing that has to be in ample supply before the other elements of the data center come into play? Is it the number of servers or CPU power available for computing? Is it the size and speed of the disks and network storage devices? Is it the distributed software architectures themselves?

My answer? None of the above. It's network bandwidth, baby, all the way.

Why? Well, let's break down where the costs of distributed systems lie. We all know that CPU capabilities double roughly every couple of years, and we also know that disk I/O slows those CPUs down, but not as much as network I/O typically does. When designing distributed systems, you must first be aware of network latency and control traffic between components to have any chance in heck of meeting rigorous transaction rate demands. The old rule at Forte Software, for what it's worth, was:

  • First reduce the number of messages as much as possible
  • Then reduce the size of those messages as much as possible
Increase adherence to those rules, and your software would outperform less optimized applications every time. It was easy to look like a performance tuning genius in those days.
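
To see why the rule puts message count ahead of message size, here is a minimal back-of-the-envelope sketch. The cost model (a fixed latency per message plus size divided by bandwidth), the 1 ms latency and the 1 Gbit/s link are all my assumptions for illustration, not measurements from any Forte system.

```python
def transfer_time(num_messages, bytes_per_message,
                  latency_s=0.001, bandwidth_bytes_per_s=125_000_000):
    """Rough time to move a payload: every message pays a fixed latency
    (assumed 1 ms) plus its size divided by the link bandwidth
    (assumed 1 Gbit/s, i.e. 125,000,000 bytes per second)."""
    per_message = latency_s + bytes_per_message / bandwidth_bytes_per_s
    return num_messages * per_message


# The same 100,000 bytes sent three ways.
chatty = transfer_time(1000, 100)      # 1000 small messages
batched = transfer_time(1, 100_000)    # rule 1: one big message
trimmed = transfer_time(1, 50_000)     # rule 2: then halve its size

print(f"chatty:  {chatty:.4f} s")
print(f"batched: {batched:.4f} s")
print(f"trimmed: {trimmed:.4f} s")
```

Under these assumed numbers the chatty version takes about a second, while the batched one takes about two milliseconds; halving the payload afterwards only shaves off a fraction of a millisecond more. That is the ordering of the two rules in a nutshell.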

What is exciting about today's environment, however, is that network technology is changing rapidly. Bandwidth is increasing quickly (though not as fast as CPU speeds), and this high speed bandwidth is becoming more ubiquitous worldwide. Inter-data-center speeds are increasingly mind boggling, and WAN optimization apparently has removed much of the fear of moving real-time traffic between geographically disparate environments.

All of this is a huge positive to cloud oriented architectures. When you design for the cloud, you want to focus on a few key things:
  • Software fluidity - The ability of the software to run cleanly in a dynamic infrastructure, where the server, switch port, storage and possibly even the IP address changes day by day or minute by minute.

  • Software optimization - Because using a cloud service costs money, whether billed by the CPU hour, the transaction or the number of servers used, you want to be sure you are getting your money's worth when leveraging the cloud. That means both optimizing the execution profile of your software, and the use of external cloud services by the same software.

  • Scalability - This is well established, but clearly your software must be able to scale to your needs. Ideally, it should scale infinitely, especially in environments with highly unpredictable usage volume (such as the Internet).

Achieving any of these in an environment where your network bandwidth is constricting your options is nearly impossible.

Oh, and one more thing. The network is the first element of your data center that sees load, failure and service level compliance. Think about it--without the eyes of the network, all of your other data center elements become black boxes (though often physical ones, with those annoying beeps and little blinking orange lights). What are the nerves in the data center nervous system? Network cables, I would say.

Today I saw two really good posts about possible network trends driven by the cloud, and how Cisco's new workhorse leverages "virtualized" bandwidth and opens the door to commodity cloud capacity. The first is a post by Douglas Gourlay of Cisco, which simply looks at the trends that got us to where we are today, and further trends that will grease the skids for commodity clouds. I am especially interested in the following observations:
"8) IP Addressing will move to IPv6 or have IPv4 RFCs standardized that allow for a global address device/VM ID within the addressing space and a location/provider sensitive ID that will allow for workload to be moved from one provider to another without changing the client’s host stack or known IP address ‘in flight’. Here’s an example from my friend Dino.

9) This will allow workload portability between Enterprise Clouds and Service Provider Clouds.

10) The SP community will embrace this and start aggressively trying to capture as much footprint as possible so they can fill their data centers to near capacity allowing for them to have the maximum efficiency within their operation. This holds to my rule that ‘The Value of Virtualization is compounded by the number of devices virtualized’.

11) Someone will write a DNS or a DNS Coupled Workload exchange. This will allow the enterprise to effectively automate the bidding of workload allocation against some number or pool of Service Providers who are offering them the compute, storage, and network capacity at a given price. The faster and more seamless the above technologies make the shift of workload from one provider to another the simpler it is in the end for an exchange or market-based system to be the controlling authority for the distribution of workload and thus $$$’s to the provider who is most capable of processing the workload."

The possibility that IP addresses could successfully travel with their software payloads is incredibly powerful to me, and I think would change everything for both "traditional" VM users, as well as the virtual appliance world. The possibility that my host name could travel with my workload, even as it is moved in real time from one vendor to another is, of course, cloud computing nirvana. To see someone who obviously knows something about networking and networking trends spell out this possibility got my attention.
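
To make the idea in Doug's item 8 concrete, here is a minimal sketch of a stable workload ID mapped to a provider-specific location. Everything in it (the `MappingService` class, its methods, the example addresses from the documentation-only IP ranges) is hypothetical and invented for illustration; it is not how any particular protocol or Cisco product does it.

```python
class MappingService:
    """Hypothetical lookup table: stable workload ID -> current locator."""

    def __init__(self):
        self._locators = {}

    def register(self, workload_id, locator):
        # First placement of a workload at some provider.
        self._locators[workload_id] = locator

    def move(self, workload_id, new_locator):
        # Only the locator changes; the ID that clients know stays the same.
        if workload_id not in self._locators:
            raise KeyError(f"unknown workload: {workload_id}")
        self._locators[workload_id] = new_locator

    def resolve(self, workload_id):
        return self._locators[workload_id]


mapping = MappingService()
mapping.register("192.0.2.10", "enterprise-dc:198.51.100.1")
before = mapping.resolve("192.0.2.10")

# The workload is moved to a service provider "in flight".
mapping.move("192.0.2.10", "provider-a:203.0.113.1")
after = mapping.resolve("192.0.2.10")

print(before, "->", after)
```

The client keeps talking to 192.0.2.10 the whole time; only the mapping behind it changes. The hard parts (who runs the mapping, how fast updates propagate, who is trusted to change it) are exactly what this sketch leaves out.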

(Those who see a fatal flaw in Doug's vision are welcome to point it out in the comments section below, or on Doug's blog.)

The second post is from Hurwitz analyst, Robin Bloor, who describes in brilliant detail why Cisco's Nexus 7000 series is different, and why it could very well take over the private cloud game. As an architecture, it essentially makes the network OS the policy engine for controlling provisioning and load balancing, though with bandwidth speeds that blow away today's standards (10G today, but room for 40G and 100G standards in the future). Get to those speeds, and all of a sudden something other than network bandwidth is your restricting function in scaling a distributed application.

I have been cautiously excited about the Nexus announcement from the start. Excited because the vision of what Nexus will be is so compelling to me, for all of the reasons I describe above. (John Chambers, CEO of Cisco, communicates that vision in a video that accompanied the Nexus 5000 series launch.) Cautious, because it reeks of old-school enterprise sales mentality, with Cisco hoping to "own" whole corporate IT departments by controlling both how software runs, and what hardware and virtualization can be bought to run it on. Lock-in galore, and something the modern, open source aware corporate world may be a little uneasy about.

That being said, as Robin put it, "In summary: The network is a computer. And if you think that’s just a smart-ass bit of word play: it’s not."

Robin further explains Cisco's vision as follows:

"Cisco’s vision, which can become reality with the Nexus, is of a data center that is no longer defined by computer architecture, but by network architecture. This makes sense on many levels. Let’s list them in the hope of making it easier to understand.

  1. Networks have become so fast that in many instances it is practical to send the the data to the program, or to send the program to the data, or to send both the program and the data somewhere else to execute. Software architecture has been about keeping data and process together to satisfy performance constraints. Well Moore’s Law reduced the performance issue and Metcalfe’s Law opened up the network. All the constraints of software architecture reduced and they continue to reduce. Distributing both software and data becomes easier by the year.
  2. Software is increasingly being delivered as a service that you connect to. And if it cannot deliver the right performance characteristics in the place where it lives, you move it to a place where it can.
  3. Increasingly there is more and more intelligence being placed on the switch or on the wire. Of course Cisco has been adding intelligence to the switch for years. Those Cisco firewalls and VPNs were exactly that. But also, in the last 5 years, agentless sotware (for example some Intrusion Detection products) has become prominent. Such applications simply listen to the network and initiate action if they “don’t like what they hear”. The point is that applications don’t have to live in server blade cabinets. You can put them on switches or you could put them onto server boards that sit in a big switch cabinet. They’re very portable.
  4. The network needs an OS (or NOS). Whether Cisco has the right OS is a point for debate, but the network definitely needs an OS and the OS needs to perform the functions that Cisco’s NX-OS carries out. It also needs to do other things to like optimize and load balance all the resources in a way that corresponds to the service level needs of the important business transactions and processes it supports. Personally, I do not see how that OS can do anything but span the whole network - including the switches."
Would all applications run this way? Probably not. But those mission critical, highly distributed, performance-is-everything apps you provide for your customers, or partners, or employees, or even large data sets, are extremely good candidates for this way of thinking.

Oh, and I wouldn't be surprised if Google, Microsoft, et al. agreed (though not necessarily as Cisco customers).

Does Nexus work? I have no idea. But I am betting that, as private clouds are built, the idea that servers are the center of the universe will be tested greatly, and the incredibly important role of the network will become more and more apparent. And when it does, Cisco may have positioned themselves to take advantage of the fun that follows.

It's just too bad that it is another closed source, single-vendor offering that will probably take 5-7 years (minimum) to replicate in the open source world. At the very least, I hope Cisco is paying attention to Doug's observation that:
"[T]here will be a standardization of the hypervisor ‘interface’ between the VM and the hypervisor. This will allow a VM created on Xen to move to VMWare or Hyper-V and so on."
I hope they are openly seeking to partner with OVF or another virtualization/cloud standard to ensure portability to and from Nexus.

However, I would rather have this technology in a proprietary form than not at all, so way to go Cisco, and I will be watching you closely--via the network, of course.

Tuesday, June 10, 2008

Eucalyptus and You

Last Friday night I came across a post by Sam Dean of OStatic, titled "Eucalyptus: Unsung Open Source Infrastructure for Cloud Computing", and my jaw fell to the floor. Here it was, the project I wondered why no one was building; a project focused on replicating Amazon APIs in an open source cluster environment. The more I read Sam's post, the more I thought "Man, is this project in the right place at the right time."

I immediately Twittered the link, and was retweeted by no less than Don MacAskill and Dion Hinchcliffe in a matter of minutes. A few hours later, Simon posted his excitement, and then this morning I came across an analysis by Todd Hoff of highscalability.com that I think sums up what we know today quite nicely. Todd heard about this through the Cloud Computing group on Google Groups, and that thread was kicked off by Khazret Sapenov, himself a very prolific cloud thinker.

This is big stuff, despite the skepticism of some cloud fanatics who can't grok why "private clouds" (I am beginning to like that term) are legitimate. I most certainly don't fall into that particular camp, having real experience working with customers who realize that they have to start with an in-house cloud to satisfy corporate and legal mandates. Ideally, though, this infrastructure would allow them to migrate all or portions of their applications out of house when the time and technology are right. If Eucalyptus can pull this off and really provide a killer Amazon clone for private deployments, they may become the core technology for an awful lot of enterprise SLAuto platforms in years to come.

Of course, they are a hell of a long way from achieving that. Todd's post gives a fairly good overview of what Eucalyptus is, but there is still much to do from the technical, functional and marketing standpoints. For example:

  • As the Eucalyptus team notes themselves, it's still missing key command line tools.
  • It doesn't appear to be an infrastructure optimization approach, but rather a straightforward clustering approach. Thus, all of your capacity likely must remain running continuously when using the out-of-the-box functionality. I'd like to see them tackle SLAuto when they have the Amazon tools completed.
  • It is thoroughly dependent on the Rocks cluster project. Knowing my enterprise IT friends, this won't "go down easy" for any of them.
Interestingly enough, while I was writing this, the Eucalyptus home page was temporarily unavailable. I hope this means that it is overwhelmed with interest. I'd really like to see this community grow substantially, and for the project to evolve very quickly from where it is now.

Simon's observations about portability are really at the heart of my excitement. Realistically, the Eucalyptus team has simply started a journey of 1000 miles with this single step. Congratulations, guys, on setting the pace.

Tuesday, May 06, 2008

Important new blog: cloudsecurity.org

I was hunting around Data Center Knowledge today trying to find the link to my favorite news story of the day (the theft of Peter Gabriel's servers from his hosting company--more on that later), when I came across a small item on today's Roundup about Craig Balding's new blog, Cloud Security. I don't know Craig from Adam, but I will say that the few posts he has put up to date are timely, thoughtful, and cover a topic near and dear to many of our hearts. Not to mention the fact that he got a gig on NPR about 10 posts into the blog's existence. Lucky bastard.

My only beef to date with Craig is his definition of cloud computing (definitively grid centric), but given the fact that there is no agreed upon definition to date, I'll let my comment on his post speak for itself. In the meantime, he has been added to the reading list.

Saturday, May 03, 2008

Thinking about SLAuto in a frenzied cloud

I've been quite silent for a week or two, mostly because of my responsibilities as a sales engineer; doing my part in closing key deals for my employer. I've spent this time sitting in meetings, installing and configuring software, and measuring power savings in large dev/test lab installations. (By large I mean hundreds approaching thousands of servers.) All in all, it's been a successful couple of weeks, but it's kept me from keeping too close an eye on the big news coming out of the cloud and utility computing markets.

However, as I thought about this more, I realized that I have drifted significantly from my core subject, Service Level Automation (or SLAuto), in the last six months or so--mostly due to the incredible burst of cloud computing innovation to be announced and/or delivered in that time frame. I still believe that there are two key components to an open cloud market that scales:

  • Portable platforms that allow customers to change vendors on a whim
  • Automation that takes action to acquire, release or replace services based on pre-determined service targets

The latter, simply said, is SLAuto.
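
For those who want to see the shape of that second bullet, here is a minimal sketch of the decision at the heart of an SLAuto loop. The `ServiceTarget` fields, the thresholds and the `decide` function are hypothetical names and numbers I made up for illustration; no product I know of exposes exactly this interface.

```python
from dataclasses import dataclass


@dataclass
class ServiceTarget:
    """Pre-determined service targets (illustrative fields only)."""
    max_response_ms: float   # slower than this means we need more capacity
    min_utilization: float   # idler than this means we are paying for waste


def decide(target, response_ms, utilization, healthy=True):
    """Return the action an automation loop would take for one service."""
    if not healthy:
        return "replace"   # failed capacity gets swapped out
    if response_ms > target.max_response_ms:
        return "acquire"   # service level is at risk: add capacity
    if utilization < target.min_utilization:
        return "release"   # targets are met with room to spare: give some back
    return "hold"


target = ServiceTarget(max_response_ms=200.0, min_utilization=0.30)
print(decide(target, response_ms=350.0, utilization=0.90))
print(decide(target, response_ms=80.0, utilization=0.10))
print(decide(target, response_ms=80.0, utilization=0.60, healthy=False))
```

The interesting part in a real cloud market is what "acquire" and "release" call into; with portable platforms, that could be a different vendor from one day to the next.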

Of course, what is happening is sort of the nascent birth of cloud computing technologies, where the DNA hasn't had a chance to recombine to build long term survivability into any given "species" yet. We all knew that AWS was doing cool things, but who knew that they would cross the chasm in terms of customer demand as completely as they did? Yet, there is no portability story for Amazon (at least not off of Amazon); and the market forming for SLAuto (see RightScale and others) is tightly tied to the Amazon platform.

The rest of the "big" announcements are worse: Microsoft has no concept of management in Live Mesh (other than synchronization) that I can see, and Google and Yahoo are both building platforms with developers in mind, where service levels are a business agreement, not a platform differentiator. I understand we are taking baby steps here, but I wonder how long it is before corporate IT realizes that they are both a) locked in (at least in an economic sense), and b) paying too much to operate software that doesn't even run in their data center.

Now, I say all of this, but truth be told, most corporate IT shops don't do SLAuto today. So, why should this change in the cloud? I hinted at it earlier: scale. Not scale of functional execution or data access, as we usually think of the term, but scale of market--the speed at which companies will need to respond to the ever evolving marketplace for cloud services and platforms. As the self-professed "open" nature of Google and Yahoo's platforms becomes more of a reality, combined with true innovation in "industry" standard APIs (for capacity management, code platforms and feature integration), there is little doubt that pressure will be on the IT shop to optimize the cost of delivering business services to the rest of the company. Again, I argue that this cannot be done without SLAuto. Prove me wrong.

I am really concerned that SLAuto is still considered "bleeding edge" in most IT shops. It's not rocket science, and the future of IT cost management almost certainly has to be built around it. On the other hand, perhaps as some of the customers I worked with over the last couple of weeks serve as references to the value of SLAuto--at least in terms of energy costs--more IT shops will understand its urgency.

Friday, April 18, 2008

Intuit Makes Play for Finances In The Cloud

It's end of quarter time here at Camp Cassatt, and my Sales Engineering role will probably overwhelm my Technology Evangelist/Field Technologist/Trendy-Title-of-the-Month role. This, unfortunately, means fewer blogs at an incredibly fast moving, "heady" (as they say) time in autonomic/utility/cloud computing.

The biggest news of a very news filled several days is that Intuit will announce today that they will launch a PaaS (Platform as a Service) offering for developers wishing to target the 3 million strong QuickBase market. I came across this news via Bob Warfield at SmoothSpan, and his analysis of the coming announcement is thorough and very intriguing. You may or may not know this, but Intuit is completely changing their business model, moving away from shrink-wrap and going completely towards a SaaS/PaaS model.

The PaaS offering (officially the "QuickBase Developer Program") is in private beta right now (request an invite), and perhaps most closely aligned with Salesforce.com's force.com offering; namely it provides a platform for developers to add value to Intuit's existing and coming online products. The platform uses Adobe's Flex for client development and QuickBase for the server, a decision that clearly meets with Bob's approval. (Hmmm. Which to learn first, Python or Flex?) See Bob's post for the nitty gritty details, however.

I am most excited about the philosophical difference between Intuit's approach and Salesforce.com's approach. First and foremost, Intuit seems to be stating up front that they are committed to supporting data portability, and keeping it relatively simple for developers to move data off of their platform. Bill Lucchini, VP and GM of Intuit QuickBase, was directly quoted as saying:

"We never want to lock anyone in. But we want the customer to choose us because we offer more value. That’s why we didn’t create our own language like Apex, we chose Flex. We won’t stop anybody from leaving. Vendors have to double down and work harder to keep customers loyal."
Now, truth is that if no one else uses Flex (except maybe Adobe) and you are using Intuit's libraries, the code is locked in...at least until Intuit either open sources or licenses its platform components to other capacity providers. However, data is another issue, and having a PaaS vendor commit to data portability is incredibly refreshing.

Update: Bob and I have had a little comment exchange [1] [2] [3] on whether or not there really is lock-in here. Bob makes a good point about the portability of a Flex client, but I counter it is the end-to-end functionality that is locked in to Intuit, not just the client, and that this is probably OK with a large class of potential users. Just not large enterprises.

Wednesday, April 09, 2008

What Google App Engine is NOT

Simon Wardley wrote a post discussing the Google App Engine announcement as a "first step" for them in the "the web as an operating system space". Simon is right, but as I commented on the post:

"As I just noted on my blog, perhaps it is critical to look at this from the perspective of web businesses, rather than from enterprise IT's perspective. From the former angle, this is disruptive and revolutionary; from the latter, its a no-op at this point, except perhaps for externally facing web apps."
Simon then wrote an interesting post in response, describing the opportunity that Google has created by open sourcing the App Engine SDK. His core premises can be summed up in the following quote:
"Now, whilst Google hasn't provided their environment as open sourced, it has provided an open sourced SDK that 'emulates all of the App Engine services on your local computer'. This appears, though I'm not a python expert, to contain all the primitives and information needed to build a compatible environment to GoogleAppEngine. This allows for companies, vendors and ISPs to create competing but compatible systems. It's almost as if Google has offered a blueprint for a web operating environment and asked the rest of the community to come compete with them."
And here I have to say, "Well, true, as far as web application hosting goes. But we all know the enterprise is WAY more than that." I think if a commercial product came out that allowed anyone to build a high-scale web environment, with data storage, development tools and operations interfaces within their own infrastructure, that would be very cool. But, as someone who really understands the utility computing space, I want everyone to be clear that this wouldn't help scalability or optimizing resource usage in the following key IT areas:
  1. Portal Services - Yes, an archaic concept to some, but still a critical strategy for delivering work functionality and key information to most knowledge workers. Note that Google does not provide portal support, nor support ANY standard portal interfaces, though you may be able to hack that in Python.

  2. SOA architectures - While it is theoretically possible to build a REST service in App Engine, there is no mechanism to host any other form of services. Yes, you could theoretically leverage services external to the Python app, but this would probably require services and GUI to be located in the same network, to avoid latency issues. Not to mention the fact that there is nothing resembling a messaging infrastructure, or Enterprise Service Bus.

  3. Business Process Automation - This is one of the key tactics for gaining business agility, in my opinion, and while I wouldn't doubt someone will write an app to do BPA/I in App Engine, it will be expensive from a resource usage perspective (lots of in/out traffic, storage for quiesced processes and so on).

  4. EAI - Enterprise integration is still the most customized element of IT today, and, as noted in the last two points, there is nothing provided by Google at this point to help with data or application level integration; no data transformation (à la Informatica), no messaging engine, no business process automation, etc., etc., etc.

  5. HPC - Yes, Google is amazingly scalable, but they went out of their way to insist that App Engine is not a grid. It is not designed to--nor do you have the quota to allow you to--send arbitrary compute intensive jobs to the engine for processing.

  6. Server and desktop virtualization - No one does desktop in the cloud today, as far as I know, but Google doesn't even provide virtual servers--useful for hosting and maintenance of legacy applications, if nothing else. I suppose you could run out and convert your productivity apps to Google Apps, your email to GMail, etc., but what about print services?
Not to mention the fact that Google provides no service level guarantees (though I think they will probably do something here when they go GA), no premium support, no integration services, no live customer support (that I know of); in other words, there is a distinct lack of a "throat to choke" here.

Thus, I think most enterprises need to look at Amazon and Google services as just that--services that can be leveraged within their own architectures when it makes sense, rather than wonder-tools that can replace their entire IT infrastructure expenditure. Again, there is probably more bang for the buck today in converting that existing infrastructure into a utility, unless your data center hosts only web-facing applications...but then there is the expense of rewriting them entirely in Python, which may cancel out a tremendous amount of the cost benefits of using App Engine.

So, Simon, I share your excitement about the future of scalable web applications, but my point remains--this is largely a no-op for most enterprise IT organizations.

Tuesday, April 08, 2008

Google App Engine: Forte Software for the Cloud?

I was rather harsh on Google App Engine last night, and I think with good reason. However, as I read more about it today, I am realizing that there is more to this product for web businesses than there is for your typical enterprise. Looking at it from that angle, let me talk about the compelling aspects of App Engine for those developing the types of applications that environment is intended to support.

Let me start with some history. In the mid to late nineties, I was a consultant for Forte Software, the Paul Butterworth led distributed application development and deployment tools company. Forte was an amazing company to work for, but it had an even more compelling product to work with.

The basic concept was derived from a simple development scenario. Paul envisioned allowing a developer to:

  1. Write applications as if they were monolithic, locally executable applications

  2. Name specific objects in the application as "service objects" to act as key interface points (important later)

  3. Test those applications in a local-only configuration

  4. Use a GUI tool to partition the application by dragging and dropping the service objects around the environment as necessary. Developers could also configure service objects to be replicated for load balancing, failover or both.

  5. Test execute the application in its distributed configuration

  6. Deploy and operate the finished application in its final partitioned configuration

  7. Monitor the distributed application and its components for both availability and performance characteristics
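
For readers who never saw Forte, here is a minimal sketch of the core idea in steps 1 through 5: application code calls a named service object, and a separate partition plan decides where that object lives. This is not Forte's 4GL or its API; `ServiceRegistry`, the node names and the pricing function are all hypothetical, and the "nodes" are simulated in-process.

```python
class ServiceRegistry:
    """Hypothetical stand-in for a partitioning tool: maps service object
    names to the nodes they are placed (or replicated) on."""

    def __init__(self, partition_plan=None):
        self._plan = partition_plan or {}   # name -> list of node names
        self._impls = {}
        self._calls = {}

    def register(self, name, impl):
        self._impls[name] = impl

    def call(self, name, *args):
        # Unplaced services run locally; replicated ones are round-robined.
        nodes = self._plan.get(name, ["local"])
        count = self._calls.get(name, 0)
        self._calls[name] = count + 1
        node = nodes[count % len(nodes)]
        return node, self._impls[name](*args)


def unit_price_total(quantity):
    # The "application logic", written with no knowledge of where it runs.
    return quantity * 2.5


# Step 3: test in a local-only configuration.
local = ServiceRegistry()
local.register("pricing", unit_price_total)
print(local.call("pricing", 4))

# Steps 4-5: same code, now "partitioned" and replicated across two nodes.
distributed = ServiceRegistry({"pricing": ["node-a", "node-b"]})
distributed.register("pricing", unit_price_total)
print(distributed.call("pricing", 4))
print(distributed.call("pricing", 4))
```

The application function never changes between the two configurations; only the plan handed to the registry does. In Forte that plan came from dragging service objects around a GUI rather than from a dictionary.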
Though based on a 4GL at a time when Java was pushing for "open languages", Forte proved to be a very popular tool in a variety of extremely high scalability settings: OnStar, EZPass, Marriott online reservations, and the New York state sex offender web site, to name but a few.

It wasn't the 4GL that made the product compelling (though it was very good), and certainly not the developer GUI (that was well below average), but this end-to-end developer experience that made the product a winner.

Now flash forward to today, and the TechCrunch article covering their developer's experience in developing and deploying a decent little app in about 4 hours, including deciding on requirements, writing code, debugging, deploying and "launching" on the crunchbase.com domain. In reading through their step by step activities, I was struck hard by the similarities with the Forte experience, with a few positive differences:
  • The tools are now open source themselves, and based on an open source language

  • The need for application partitioning is largely eliminated. Note I said largely, as if you are using a service-based architecture, you will have to hand-code the outbound calls to any services via Google's URL API.

  • Deployment and monitoring is automatic. You never have to worry about what was deployed where when. The capacity is just there (up to your quota).
Now, all of this comes with a cost (which was true of Forte as well): you must agree to live in a proprietary world. In a later post, I am going to talk about another cost (which is common with other platforms): start-up lock-in; suffice to say, your lock-in isn't just the available languages or the libraries you *must* use, but it's also the dependency on all of that infrastructure automation that is Google's and Google's alone.

There are also many key application components which seem logically locked into Google: identity, domain management, monitoring and data storage/retrieval. Not necessarily a bad thing, but developers should go in with their eyes wide open.

However, if time to market is your biggest concern, and all you care about is cool web application capabilities, then you now have two choices: Amazon (via Heroku and Zend, for instance) and Google (via App Engine). Each has its language and its limitations, but the experience is largely the same. (I haven't checked to see if the "launch"--e.g. domain assignment--capabilities of Heroku or Zend match Google's, though, and it doesn't appear that identity services are covered at all.)

None of these really give you service level guarantees, so SLAuto doesn't really apply. However, service levels will be assumed, so if you care, start looking at SLAuto tools that may help in the future.

Again, all of this probably does not apply to enterprise IT, but it's a hell of a compelling story for web developers.

Monday, April 07, 2008

Google announces ultimate cloud lock-in platform

I was about to write a long post about how all the big guys are starting with storage as a cloud service (based on the rumor that Google was going to announce BigTable as their first cloud service, and HP's new offering), when I took the time to watch Scoble's (unintentionally) multi-part coverage [1] [2] [3] of the mysterious Google announcement (on Qik). And--just to screw with me--do they announce a data-only offering? Of course not, they announce Google App Engine.

Update: Here is a link to the official Google coverage of the announcement on YouTube.

What is Google App Engine? Well, detailed coverage is all over the web; see:

Mike Arrington (TechCrunch)
"What this all means: Google App Engine is designed for developers who want to run their entire application stack, soup to nuts, on Google resources. Amazon, by contrast, offers more of an a la carte offering with which developers can pick and choose what resources they want to use."
Bob Warfield (SmoothSpan) [1] [2] [3]
However, the short-short version is it is a complete scalable and manageable runtime environment to build, test and run scalable web applications. (I don't say "highly scalable" for reasons that will be clear later.) This environment is made up of the following five core components (today):
  1. Scalable Serving Infrastructure - Basically the Google infrastructure, including everything but the Python code and web templates themselves

  2. Python Runtime - All of the infrastructure to deliver and execute your application in a distributed environment

  3. Software Development Kit - Allows you to code your application on your local system before deploying to Google.

  4. Web-based Admin Console - A web application including at least simplistic version management (including rollback), running system statistics and errors, access to the datastore (see below) and access to log files

  5. Datastore - BigTable storage (I don't know enough about BigTable yet to say more)
All of this delivered in a free (as of the beta) limited-scale package:

500MB storage
200 Megacycles CPU
10GB Bandwidth In/Out

Should be around 5 million page views a month for the average web application. This is a reasonable scale, but would not qualify as "highly scalable" in most large web properties' books.

What does this add up to, in my opinion? The ultimate cloud lock-in story. (As background, watch Scoble's first video from about 3:17-5:25.) Not a single thing in your web application will not be dependent on Google if you use this technology--not even your Python code. (For proof, check out the "includes" in the coding demo--at around 8:44 of the first video.) Everything you do will depend on a piece of Google intellectual property. Your datastore is BigTable, your operations environment is Web Operations Center, etc., etc., etc.

This isn't cloud computing, it's just a cool web app hosting tool. OK, I exaggerate. It is cloud, but it's exactly the kind of cloud most enterprises should avoid. If you are building a web business, and this tickles your fancy, go for it. You can't beat the price, and you've got to love the feature set. If you are a Fortune 500 looking for where to launch your next CRM interface, forget it. There are safer ships to sail than this--e.g. Amazon EC2 (et al.), Mosso, etc.; better yet, convert what you have.

If it sounds like I am being reactionary to this announcement, I suppose I am in a way. Unfortunately, I have spent a lot of time thinking about how today's high-scale business systems will move to the cloud, and I think the market needs more maturity before this can be done safely. You need flexibility of the type and architecture of your application, and which components you choose to leverage. There is no such choice with Google.

The best part of Scoble's coverage was when he talked to two developers at the end (~18:15). One (Michael Malone) notes the biggest problem is "lock-in". The woman standing next to him (Mia Culver) calls it a "proprietary platform".

I love it. There is no fooling this savvy, open source focused market. If you want to win hearts and minds, be open. When the hell are we going to get that application portability standard we've been demanding, eh?

(On a side note, the required demo for cloud application development is now to build a web app from scratch and deploy it so the audience can access it from their laptops in 5-8 minutes. Google did it tonight, and Heroku did it at the Cloud Demo Night earlier this month.)

Some more of my notes from the announcement:
Can't do:
  1. No write to file system. (Reads OK, so you can use props files, etc.)

  2. No direct web calls (instead utilizes "URL fetch" API)

  3. No threads (single thread only, but distributed across multiple systems)

  4. Python only first language, looking for input on next language to attack (must have runtime that can be "hardened")
Administration Console gives the ability to see and manipulate running app code (by version) and data

Is the identity environment for all hosted apps Google login? Is everyone comfortable with this?

The initial 10,000 beta accounts may already be gone.

Quota based, no ability to grow past above for now.

Also, no "offline processing" today, but looking into it for future. (Sounds like batch stuff, etc.)
I have an interesting experiment I wish I could get to. I want to marry Scalr, the open source Amazon EC2 automation environment, with a policy-based SLAuto environment to get the ultimate in flexible, open and coding agnostic autonomic operations, both in the cloud and "at home". Anyone want to beat me to it? (Come to think of it, why is Google still hosting Scalr now that App Engine is live? Hmmmm....)

Saturday, March 22, 2008

Eric Schmidt: Please believe me...

ZDNet Asia covered comments from Eric Schmidt of Google regarding the trust issues that enterprises must address before adopting cloud computing. He made these comments during a recent visit to Sydney, Australia. I find the comments interesting, because they signal for me the first public acknowledgment of the challenges that Google faces in selling the enterprise on the cloud vs. in-house applications.

Of course, he couches it in terms of how to choose Google Apps over Microsoft Office, but the heart of the issue--trust--applies to just about any choice between traditional "I own it all" IT, and "renting" from the cloud--including compute capacity. (By the way, is anyone still claiming that Google Apps does not compete with Microsoft Office?)

As Eric notes for the Apps/Office debate:

"At some point in your firm, someone is going to say: 'Well maybe there is an alternative in the enterprise', and they're going to do an evaluation. And they're going to say the cloud computing model has its strengths and weaknesses."
This seems consistent for all cloud computing choices: in each case, the IT organization (or even the business) will need to evaluate the costs/benefits of moving data and functionality to the cloud versus maintaining traditional desktop/server systems. Up to now, I agree with Eric, but then he goes on to say:
"What assurances [do you have] that the information you have in your computer is safe--that it is properly stored and so forth? So it's important to understand that you really are making trade offs of one versus the other."
Assuming I am understanding this right, Eric seems to be saying, "Hey, your data isn't really all that secure on your PC, so why don't you just trust us that we will do better?" Ah, there is the rub.

I believe most enterprises would answer,
"Well, if data is misappropriated on my in-house systems, I can hunt down and fire those responsible, and the original copy of the data is still in my control. If Google (or someone who compromises Google) misappropriates my data in the cloud, I can go after the guilty parties, but if I no longer trust Google, I now have a legal battle on my hands to get my data back and get Google to completely delete it from their systems."
This partially gets to data portability, which some are trying to address, but it is not a solved problem yet. However, even with portability, it's the "completely delete it from their systems" part that I may never trust without clear and explicit legal consequences and vendor auditing. Until I have full control over where my data resides (at least in terms of vendors) and when and where I can move it and how it gets removed from storage that I no longer wish to utilize, I am putting a lot at risk by moving data outside of my firewalls.

At its heart, I think Eric's statement gets at the core of what Google has ahead of them in terms of delivering Apps to large, established enterprises. I don't doubt that Google will both develop and acquire technology that overcomes many of the security concerns that large enterprises have, but I continue to believe that we will see a major legal case in the next 5 years where a large corporation has to fight in court to get their data from a SaaS/cloud computing provider.

If it were me, I'd look to get cloud-like economics from my existing infrastructure. This is done by utilizing software architectures that are multitenant capable (SOA is a good place to start), and by implementing utility computing type infrastructure in your own data center. No matter how nicely Eric asks, be careful of what you are getting into if you put your sensitive data in the cloud.

Wednesday, March 19, 2008

The Social Enterprise Opportunity

I want to begin today with a quick shout-out to my fellow bloggers at Data Center Knowledge. In a recent post, they identified me as one of the bloggers they follow for cloud and utility computing, and I'm honored to be included among such a strong list of bloggers. (Rich Miller, who posted the list, is no slouch himself.) Update: I violated the cardinal rule of Internet social networking: assuming a given name applies to one person. Rich Miller from Data Center Knowledge is not the same Rich Miller that writes Telematique. My apologies to both.

One of those bloggers is Phil Wainewright, whose Software as Services blog is one of my regular reads. He is the most aggressive, forward thinker in the SaaS space, and he very often sees opportunity that most of us miss. (Phil's blog is also a great way to stay on top of the companies and technologies that specifically support the SaaS market.)

Phil recently wrote an interesting post about SaaS and Web 2.0 concepts, titled "Enter the socialprise", in which he points out that the very nature of an "enterprise" is changing thanks to the Internet and cloud computing concepts. He notes that loyalty between individuals is replacing corporate loyalty, and that social networking on the Internet is creating a new work economy for individual knowledge workers.

He then goes on to challenge enterprise computing models:

But enterprise computing is still designed for the old, stovepipe model in which every transaction took place within the same firm. There’s no connection with the social automation that’s happening between individuals. Many enterprises even resist talking about social networking. And even when an application vendor adds some kind of social networking features, there’s always the suspicion that they’re just painting social lipstick on a stovepipe pig.

This yawning chasm is an opportunity for a new class of applications to emerge that can harness the social networks between individuals and make them relevant to the enterprise. Or perhaps reinvent a new kind of enterprise, better suited to the low-friction reality of the connected Web. Enter the socialprise.

The example he gives of a company leveraging this is InsideView, which is creating a very cool sales intelligence application that integrates with major SaaS CRM vendor products to aggregate information from a variety of online sources into a single prospect activity dashboard. This is an incredibly cool example of how rich data about individuals within and across firms can be used at an enterprise level.

Another similar company that struck me was JobScience, which is one of the companies whose blog is in the Data Center Knowledge list referenced above. JobScience is using force.com to create a rich social intelligence engine for Salesforce.com customers. Their product, aptly called Genius, is an excellent example of what they are able to do. Read the post for all the features, but my favorite is:
The Genius Tracker. Not only does the tracker pop up to tell me an email recipient has just opened my email, or is visiting my web site, but the more important intelligence this gives me is that this prospect is online and engaged with our solution. If a sales rep can call 40 people in a day, and a blast to 5000 prospects shows me that 40 of those prospects are online and engaged, it doesn’t take a genius to figure out who to call. That rep’s going to have a much more productive day calling people who they know are in the office. Less voicemails, less brushoffs, less calls to people who don’t work there anymore.
Bordering on privacy issues, I know, but an amazing level of detail, and invaluable if used wisely. More importantly, it goes to show what is possible in a stable, shared application environment.

By the way, this direct integration with a given CRM platform by a "value added extender" is an interesting twist to the dependency issues that Bob Warfield writes about on the SmoothSpan blog. JobScience's products are services that become a feature of the destination both visually as well as functionally. Bob's point about being a component provider to the actual product is well taken, and I wonder if the only exit strategy for these guys is acquisition by Salesforce. What else can they hope for as a company dependent on force.com? Talk about cloud lock-in.

Sunday, March 09, 2008

Update on Dataportability.org activities from the source

Interesting interview of Chris Saad and Frank Arrigo (Chris is organizing dataportability.org, and Frank is a Microsoft employee that is somehow related) by Robert Scoble.



Interesting in here is the update on what dataportability.org is focusing on right now--standard "best practices" for open data, and a "logo" to indicate standards are followed--plus the discussion of Silverlight, etc.

Monday, February 25, 2008

Comments on Paul Wallis: Cloud Computing

Paul Wallis has an excellent post tying the history of prior utility/cloud/grid computing attempts to the current hype. I've been trying to comment for a while, but haven't been able to get comment submission to work until today. This is a reworking of that response, in case it doesn't get through moderation for some reason.

Let me just say that, contrary to how Paul's description of my position may sound to others, I am not blindly "pro-cloud". In fact, I firmly recommend that existing enterprise data centers and applications think hard before going "outside" to a commercial capacity-on-demand provider. In most cases, it would actually be better for such enterprises to convert their own infrastructure to a utility computing model first, while the necessary technologies and businesses mature.

I also define the cloud broadly, to include SaaS, PaaS (e.g. force.com) and HaaS (e.g. Amazon, Mosso, etc.). SaaS is clearly in play today, HaaS is being experimented with, but PaaS may be the most interesting facet of the cloud in the long term.

That being said, Paul provides very valuable information in this post, and I for one very much appreciate the work put into it. It is very true that bandwidth is something to be nervous about (especially when Amazon charges as much as it does for bandwidth), and I have had some interesting discussions (such as the one Paul references) about how data integration will happen over the cloud. Finally, cloud lock-in is indeed something to be concerned about; as in, what happens if my first choice provider sucks? Can I move my applications, data, etc. to someone else cheaply enough that it doesn't put me out of business? Simon Wardley has a good post on that today.

Update: Er, two seconds and I could have confirmed the spelling of Paul's last name. Sorry, Paul!

Friday, February 15, 2008

A Day of Storage and the Cloud

My reading began this morning with Nick's coverage of the Amazon S3/EC2/AWS outage. Perhaps most interesting to me, though, were the comments. A variety of people responded to note that we perhaps are holding the cloud to impossibly high standards, while others noted that this was supposed to be a distributed service, and an extended downtime like this indicates a certain lack of redundancy. I find this fascinating, in light of the recent discussion of cloud lock-in. Not surprising, just fascinating.

Let me explain.

While I remain extremely concerned about the proprietary operational approaches taken by most "capacity-on-demand" providers--many based on open source platforms, ironically--I think it is important to acknowledge that:

  1. 100% uptime is unreasonable for any platform in its infancy, including S3
  2. Not everyone will be negatively impacted as much by a three hour outage as some
  3. SLAs should be set if service is extremely critical to a business. Ironically, Amazon has limits on who and what they will provide SLAs for.
  4. Even with a three hour outage, Amazon S3 is probably the best service of its kind...for now.

That last point is critical, as Nick put up another post later in the day highlighting EMC's plans to enter the cloud storage market--in a big way. The competitors to Amazon are coming, and that fact may change the equation for how much leeway Amazon has in the future.

Assuming it is not super onerous to copy data from one provider to another--storage may in fact be the earliest of the commodity cloud components if this is true--an alternative provider will make it that much simpler for an unsatisfied customer to make a move. This, in turn, will make some who will tolerate an outage now, well, less tolerant.

I anxiously await Amazon's explanation for the glitch.

By the way, Robert Scoble certainly believes Amazon has won the cloud market in its entirety already. He is way off, of course. Do you know how much datacenter capacity there is in corporate America alone? There is no way that one company, spending a fraction of what Microsoft, Google and Yahoo are spending on new data centers, will create a barrier to entry that high. Amazon is a typical first entrant, a la Netscape. Hopefully the market is different enough, though, that they can build a survivor.

Wednesday, February 06, 2008

Cloud computing heats up

Today's reading has been especially interesting, as it has become clear that a) "cloud computing" is a concept that more and more IT people are beginning to understand and dissect, and b) there is the corresponding denial that comes with any disruptive change. Let me walk you through my reading to demonstrate.

I always start with Nick Carr, and today he did not disappoint. It seems that IBM has posited that a single (distributed) computer could be built that could run the entire Internet, and expand as needed to meet demand. Of course, this would require the use of Blue Gene, an IBM technology, but man does it feed right into Nick's vision of the World Wide Computer. To Nick's credit, he seems skeptical--I know I am. However, it is a worthy thought experiment to think how one would design distributed computing to be more efficient if one had control over the entire architecture from chip to system software. (Er, come to think of it, I could imagine Apple designing a compute cloud...)

I then came across an interesting breakdown of cloud computing by John M Willis, who appears to contribute to redmonk. He breaks down the cloud according to "capacity-on-demand" options, and is one of the few to include a "turn your own capacity into a utility" component. Unfortunately, he needs a little education on these particular options, but I did my best to set him straight. (I appreciate his kind response to my comment.) If you are trying to understand how to break down the "capacity-on-demand" market, this post (along with the comments) is an excellent starting place.

Next on the list was a GigaOm post by Nitin Borwankar stating his concept of "Data Property Rights" and expressing some skepticism about the "data portability" movement. At first I was concerned that he was going to make an argument that reinforced certain cloud lock-in principles, but he actually makes a lot of sense. I still want to see Data Portability as an element of his basic rights list, but he is correct when he says if the other elements are handled correctly, data portability will be a largely moot issue (though I would argue it remains a "last resort" property right).

Dana Blankenhorn at ZDNet/open-source covers a concept being put forth by Etelos, a company I find difficult to describe, but that seems to be an "application-on-demand" company (interesting concept). "Opportunity computing", as described by Etelos CEO Danny Kolke, is the complete set of software and infrastructure required to meet a market opportunity on a moment's notice. “Opportunity computing is really a superset of utility computing,” Kolke notes. Blankenhorn adds,


"It’s when you look at the tools Kolke is talking about that you begin to get the picture. He’s combining advertising, applications, the cash register, and all the relationships which go into those elements in his model."

In other words, it seems like prebuilt ecommerce, CRM and other applications that can quickly be customized and deployed as needed, to the hosting solution of your choice. My experience with this kind of thing is that it is impossible to satisfy all of the people, all of the time, but I'm fascinated by the concept. Sort of Platform as a Service with a twist.

Finally, the denial. The blog "pupwhines" remains true to its name as its author whimpers about how Nick "has figured out that companies can write their own code and then run it in an outsourced data center." Those of you that have been following utility/cloud computing know that this misses the point entirely. It's not outsourcing capacity that is new; it's the way it is outsourced--no contracts for labor, no work-order charges for capacity changes, etc. In other words, just pay for the compute time.

With SLAuto, it gets even more interesting as you would just tell the cloud "run this software at these service levels", and the who, what, where and how would be completely hidden from you. To equate that with the old IBM/Accenture/{Insert Indian company here} mode of outsourcing is like comparing your electric utility to renting generators from your neighbors. (OK, not a great analogy, but you get the picture.)

Another interesting data point for measuring the booming interest in utility and cloud computing is the fact that my Google Alerts emails for both terms have grown from one or two links a day, to five or more links each and every day. People are talking about this stuff because the economics are so compelling it's impossible not to. Just remember to think before you jump on in.

Sunday, January 20, 2008

Evidence of pending doom and imminent salvation...

Two news articles that occurred as soon as I went offline for the birth of my daughter provide increasing evidence of the importance of service level automation and image portability between vendors:

  • Joyent, one of the most ambitious new "capacity on demand" managed hosting services, has experienced a multi-day outage that has affected two of their prime storage services. No failover path was available to users of the services, and there is no mention of functionality or services to assist customers with moving--temporarily or permanently--to another vendor's service. Odds are high that some of these customers have lost access to key data, or are flying without substantial backups to key systems. Any decision to move to different servers (as Twitter will, according to the post) is on the customer's own dime.

    A prime example of the dangers of vendor lock-in that Simon and I have been warning you about...


  • Oracle has announced its intention to build and sell "Grid 2.0" technology that will target--yes, you heard right--service level automation. Welcome to the SLAuto game, boys. I hope you're ready to talk standards for image and policy portability, as well as policy platform interoperability. Otherwise, you're just creating a new DB grid "silo", and not helping anyone in the long run. Please, feel free to educate me if you think otherwise...

These events show the caution that users of cloud services must employ. Be ready to take on increased integration responsibilities as you deploy more and more elements of your datacenter to the cloud, automate more of the management of those elements, and find the product landscape one in which there (still) is no silver bullet. You may not be writing apps, but you sure as heck will be writing the orchestration that will tie the apps you employ into a cohesive business process ecosystem. You may also find yourself writing backup integration again, just in case you experience "Joyent 2.0"...

Friday, January 11, 2008

The Compute Grid is Like Nothing Before It

In a continuation of the discussion regarding Nick Carr's "The Big Switch: Rewiring the World from Edison to Google" and Yochai Benkler's "The Wealth of Networks: How Social Production Transforms Markets and Freedom", I want to focus today on the shortcomings of the electric utility analogy--or any other analogy I have heard of for that matter--in describing the compute capacity utility story. It is important to note that, while the electricity-as-utility story has dominated the utility computing discussion to date, other interesting analogies have been put forth lately that enlighten some aspects of the compute story while clouding (no pun intended) others.

Let's start with the electric utility analogy that Carr focuses on in his work. Nick does an excellent job of laying out both the history of electric production and distribution in the United States, as well as mapping those to similar aspects of compute utilities. As Nick puts it:

"The commercial and social ramifications of the democratization of electricity would be hard to overstate...Cheap and plentiful electricity shaped the world we live in today. It's a world that didn't exist a mere hundred years ago, and yet the transformation that has played out over just a few generations has been so great, so complete, that it has become almost impossible for us to imagine what life was before electricity began to flow through the sockets in our walls.

Today we're in the midst of another epochal transformation, and it's following a similar course. What happened to the generation of power a century ago is now happening to the processing of information. Private computer systems, built and operated by individual companies are being supplanted by services provided over a common grid--the Internet--by centralized data-processing plants. Computing is turning into a utility, and once again the economic equations that determine the way we work and live are being rewritten."
OK, so it's hard to argue with the basic premise that we are undergoing a change that is similar to the introduction of cheap, readily available electricity in the early twentieth century. Nick is a master at pointing out how the evolution of electric technology fed changes in societal norms, and vice versa. "It's a messy process--when you combine technology, economics and human nature, you get a lot of variables", he writes, "but it has an inexorable logic, even if we can trace it only in retrospect."

Unfortunately, the same can be said about a variety of other technical advances that didn't end up looking like the electric marketplace; take manufacturing, food production, and music and film production, for example. All of these have elements that can be seen as paralleling utility computing, social production or both. Yet none of them really map completely, and the flaws in the analogy have a "chaos"-like ability to magnify as history bears out.

Now, to Nick's credit, he does start Part 2 of the book--his in depth comparison of the social implications of utility computing--with the following comments:
"Before we can understand the implications for users...we first need to understand how computing is not like electricity, for the differences between the two technologies are as revealing as their similarities."
He goes on to highlight the following differences, using them to make key points about how the effects of compute utilities on society may not be nearly as beneficial as the effect of electric utilities:
  1. With electricity, the applications of the commodity lie outside of the utility--i.e. the appliances, electronics, lighting, etc. that consume the power. With computing, the applications themselves are deliverable over the network, and can be shared by anyone that wants to (and is allowed to) use them.


  2. Computing is much more modular than the electric grid, meaning that the components that make up the commodity service (storage, processing, networking) can be split up and offered by a variety of different parties.


  3. The compute utility is programmable; it can be made to perform a variety of custom tasks as required by its customers. Electricity from your basic power outlet is a fixed state commodity--there are exacting standards to what it is and how it is delivered, as well as laws of physics that limit how it can be used.


  4. Choosing an electric utility was generally an all-or-nothing choice; you either got power from the grid, or you had your own power generation. The modularity of computing, however, allows for a slow transitional change from private to public consumption. (I think there is a serious flaw in this analogy, for what it's worth. Look at the increasing installation of solar power systems in residential applications--all while remaining a part of the grid. This seems to indicate a gradual transition to a hybrid public/private power grid in the electricity space.)


  5. The compute utility allows others to participate directly in creating value for the utility, and do so cheaply and simply. Providing power to the electric grid has always been expensive and very technical (as, I have to admit, is true in my objection in point 4).
These are excellent examples, and are all important to note (even point 4). However, I think Nick fails to note the most important difference between electricity and data processing; namely

data != electricity

There are huge implications to what is being moved over the network versus what is being moved over the power grid, beyond just the programmable elements. These differences are critical when analyzing the compute as utility story, and it's a shame he doesn't address them.

For example, checking his index for the terms "security", "data security" or "software security" shows exactly zero entries. When talking about the transition of data vs electricity, it seems critical that one consider the sensitivities that people and organizations have about how it is transmitted. "Privacy" is the subject of a 7 page essay highlighting what we have been willing to give up so easily, but he basically uses the subject to highlight a specific trait of the network without investigating how related issues will cause compute capacity to differ from electricity. My own opinion is that these two subjects--security and privacy--are exactly what will slow down the "total conversion" to centralized computing utilities for customers like banks, classified federal bureaucracies and health care. I spoke of this in detail before.

As noted earlier, others have commented on some of these issues, and have used other analogies like manufacturing to counter the electricity analogy. One excellent example of this is an article by Michael Feldman of HPCwire in which he argues that a better analogy is food production. As he puts it:

"When food became a commodity, agribusiness conglomerates took over and replaced lots of family farms with much larger, more efficient "factory farms." Today, crops like wheat and soybeans are typically grown on multi-hundred acre land parcels. But not all food products are easily commoditized. Specialty fruits, vegetables, and organic products don't usually lend themselves very well to large-scale production. According to the U.S. Department of Agriculture about a quarter of farm revenue is still generated on family farms. Many of these farms are focusing on these specialty items and have formed cooperative arrangements in order to remain economically viable."

This analogy works from the standpoint that it describes a system in which people care about the varying qualities of the service output by the "utility". For example, we all know the amount of effort spent by the FDA and others to make sure our meats aren't tainted with deadly bacteria. In fact, some specialty food producers have built their marketing message around food safety and health, and many of those are small, boutique producers. Other small players have provided specialty food items to very specific markets with great success. I have believed all along that the compute market will evolve into a few major players and hundreds (thousands?) of small boutique specialty players, especially in the SaaS space. ("Special SaaS with that?" Please forgive me...)

Unfortunately, the food analogy also breaks down in one critical way:

data != food

In this case, it's the real, physical nature of food, and the accompanying issues with logistics, cost of production (including fixed real estate costs), and brick-and-mortar sales, that don't compare well to the zero-marginal-production-cost nature of data. Replicating food and shipping it to a new customer destination are expensive acts; doing the same with data costs nearly zero. Furthermore, geographic location means nearly nothing for computing. Food, on the other hand, is subject to cultural, climatological and logistical limitations on where it can be produced and sold.

For this reason, computing will tend toward a much higher level of centralization than food production has seen. Intuitively, one must believe that this will lead to a larger displacement of private data centers than would have happened if it were more expensive to share infrastructure.

I'm still trying to digest all of this, but I have a growing feeling that Carr's dependence on the "Edison analogy" (to coin a phrase for no good reason) actually limits the likelihood of some of his arguments. He also seems to assume that the economics of the web won't evolve much from where they are today--largely advertising based, with millions of people willing to do stuff for free and few existing cultural industries willing to produce for online audiences. I want to bring Benkler back into the conversation when I cover this in a later post.

(One sidebar on the commercial production of online content: did anyone see the news from NBC today?)

Monday, January 07, 2008

7 Businesses to Start in 2008

Rather than offer a list of predictions for 2008, I thought I'd have some fun suggesting some businesses that could make you money in 2008 or the few years following.

  1. SaaS<-->Enterprise data conversion practice: All those existing enterprise apps will need to have their data migrated to that trendy new SaaS tool; and should anyone actually decide they hate their first vendor, they'll be spending that money again to convert to the next choice. Perhaps they'll even get fed up and return to traditional enterprise software. Easy money.
  2. Enterprise Integration as a Service: No matter how much functionality one SaaS vendor will provide, it will never be enough. Integration will always be necessary, but where/how will it be delivered? Go for the gold with a browser based integration option. Just figure out how to do it better/cheaper/faster than force.com, Microsoft, Google, Amazon, etc...
  3. SaaS meter consolidation service: Given the problem stated in 2 above, who wants 5 or 6 bills where it's impossible to trace the cost of a transaction across vendors? Provide a single billing service that consolidates the charges of the vendor stable and provides additional analytic capabilities to break down where costs and revenues come from. Then get ready to defend yourself against the data ownership walls put up by those same vendors (see 4 below).
  4. SaaS/HaaS Customer litigation practice: Given the example of Scoble's experience with Facebook, there are clearly a lot of sticky legal issues to be worked out about "who owns what". Ride that gravy train with litigation expertise in data ownership, vendor contractual obligations and the role of code as law.
  5. SaaS industry (or SaaS customer) data ownership rights lobbyist: Given 4 above, each industry player is going to want its voice in Congress to protect/promote its interests. Drive the next set of legislation that screws up online equality and individual rights.