Thursday, January 24, 2008

Data propagation and software fluidity

Jon Udell has an interesting post commenting on Jeff Jonas' explanation of Out-bound Record-level Accountability in Information Sharing Systems. The central thesis of Jeff's post is that tracking who specifically received a given datum is very expensive yet highly necessary in many applications. The example given is that of a user who no longer wishes to receive email from a site they have an account with, or from any of the other sites that the original site shared that preference with. How does the original site know whom to contact? The high cost is a result of the difficulty in tracking who the data has been forwarded to.

Jon replies very simply that a "publish" model, much like blogging, might be the answer. "Data blogging", coined by fellow blogger Galvin Carr, refers essentially to the problem of syndication, but Udell extends it to a much wider arena of data types. As he notes, there is much evidence that "push" models are generally only applicable to edge systems calling "inward". "Publish and Subscribe"-type pull models are far easier to implement when running "outward" from the cloud to edge systems (as well as, generally, within the cloud, aka event-driven architectures).

There are two valuable results of this approach:
  1. The originating system can require users of data to subscribe with a unique identity, and each "pull" of published data could be tracked (if necessary) to identify who is up to date and who isn't.
  2. For software fluidity purposes, it further decouples the originating system from its subscribers, meaning both the subscribers and the originating system can be "moved" from physical environment to physical environment with no loss of communication. The worst that could happen is that the originating publisher's DNS name changes in the course of the move, but redirects and other techniques could mitigate even that issue.
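
To make the first point concrete, here is a minimal sketch of a publisher that tracks pulls per subscriber. It is my own illustration, not anything from Jon's or Jeff's posts: the Publisher class, its method names, and the subscriber ids are all hypothetical, and the feed lives in memory where a real one would presumably sit behind HTTP with authenticated subscribers.

```python
from dataclasses import dataclass, field


@dataclass
class Publisher:
    """Hypothetical in-memory feed that records how far each subscriber has read."""
    entries: list = field(default_factory=list)      # published updates, oldest first
    last_pulled: dict = field(default_factory=dict)  # subscriber id -> number of entries seen

    def subscribe(self, subscriber_id: str) -> None:
        # Each consumer of the data registers a unique identity before pulling.
        if subscriber_id in self.last_pulled:
            raise ValueError(f"duplicate subscriber id: {subscriber_id}")
        self.last_pulled[subscriber_id] = 0

    def publish(self, entry: str) -> None:
        # The publisher never contacts subscribers; it only appends to the feed.
        self.entries.append(entry)

    def pull(self, subscriber_id: str) -> list:
        # Return what this subscriber has not yet seen and record the pull.
        # An unregistered id raises KeyError.
        start = self.last_pulled[subscriber_id]
        new_entries = self.entries[start:]
        self.last_pulled[subscriber_id] = len(self.entries)
        return new_entries

    def stale_subscribers(self) -> list:
        # Subscribers that have not yet pulled the latest entry.
        total = len(self.entries)
        return sorted(s for s, seen in self.last_pulled.items() if seen < total)


feed = Publisher()
feed.subscribe("partner-a")
feed.subscribe("partner-b")
feed.publish("user 42: opt out of email")

print(feed.pull("partner-a"))    # ['user 42: opt out of email']
print(feed.stale_subscribers())  # ['partner-b']
```

The publisher holds no addresses for its subscribers, only their identities and read positions, which is what keeps the two sides loosely coupled in the sense of the second point.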

I am commenting on this, of course, largely for the second item. Access to data, services and even edge devices must be very loosely coupled to work in a cloud computing world. This is one great example of how you could architect for that eventuality, in my opinion.
