Today at CommunityOne in New York, we’re announcing a bunch of Cloud-related stuff. Some of it has my fingerprints on it. This is my personal take on the interesting parts.
[Disclaimer]: Like it says on the front page, I work for Sun and sometimes even speak for it officially, but not in this blog. These are my own views as a project insider, and the perceptions of what it is and why it matters are mine; the company’s may differ.
Back Story · Just before Christmas, the group I’m in morphed into the Cloud Business Unit. My boss called up and said “That’s not for you, right? Want to move over to GlassFish/Web-tier land?” I said “Hell no, I don’t really grok Cloud but then neither does anyone else. Put me in, coach.”
So, starting right after New Year’s, I’ve been cloudworking with a bunch of people in various Sun shops and the folks from our recent Q-layer acquisition. After a few years in loosely-defined strategy and intelligence and evangelism work, it’s been a real thrill to buckle down and grind away on stuff with a view to shipping it.
The Announcement · We’re going to be rolling out a Sun Cloud offering later this year. It’s going to have a storage service that’ll offer WebDAV and something S3-flavored. Also, there’ll be a compute service, based partly on the Q-layer technology.
And it’s got an API.
The API · This is a unified view of the storage and compute and networking services. It’s built around the notion of a “Virtual Data Center” (VDC), which contains networks and clusters of servers and public IP addresses and storage services. The idea is to give you the administrative and operational handles to build something really big and ambitious. The VDC notion is really slick and I think something like it is going to be required in any serious cloud API going forward.
At the bottom level, the interface is based on HTTP and tries hard to be RESTful. All the resources—servers, networks, virtual data centers—are represented in JSON. [Don’t you mean XML? -Ed.] [Nope, JSON is just right for this one. -Tim]
We even tried to do the “Hypertext as engine of application state” thing. To use the API, you need one URI to get started; it identifies the whole cloud. Dereference that, you get some JSON that has URIs for your VDCs; dereference those, and you get more JSON that represents your clusters and servers and networks and so on. This has the nice side-effect that the API doesn’t constrain the design of the URI space at all. [Who cares? -Ed.] [Stay tuned. -Tim]
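Here is a minimal sketch of that walk, not anything from the spec. It assumes a canned dictionary standing in for real HTTP GETs, and every URI and field name in it (`vdcs`, `clusters`, `vms`, `uri`) is invented for illustration. The point is that the client hard-codes only the entry URI and discovers everything else by following links, so the deliberately ugly URIs cost it nothing.

```python
import json

# Hypothetical canned responses standing in for HTTP GETs against a cloud.
# Every URI and field name below is invented for illustration.
FAKE_CLOUD = {
    "/": {"vdcs": [{"name": "prod", "uri": "/x/17"}]},
    "/x/17": {
        "name": "prod",
        "clusters": [{"name": "web", "uri": "/q/abc"}],
    },
    "/q/abc": {
        "name": "web",
        "vms": [
            {"name": "web01", "uri": "/zz/9"},
            {"name": "web02", "uri": "/zz/10"},
        ],
    },
}


def get(uri):
    """Stand-in for an HTTP GET that returns parsed JSON."""
    # Round-trip through json to mimic decoding a response body.
    return json.loads(json.dumps(FAKE_CLOUD[uri]))


def find_servers(entry_uri):
    """Walk from the single entry URI down to server URIs, following links only."""
    servers = {}
    cloud = get(entry_uri)
    for vdc_link in cloud["vdcs"]:
        vdc = get(vdc_link["uri"])
        for cluster_link in vdc["clusters"]:
            cluster = get(cluster_link["uri"])
            for vm in cluster["vms"]:
                servers[vm["name"]] = vm["uri"]
    return servers


servers = find_servers("/")
print(servers)
```

Swap `get` for a real HTTP call and the walking code stays the same, whatever the server decides its URIs look like.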
This interface does more than just Resource CRUD; there are operations like “Start server” and “Snapshot storage”. The kind of thing that classic REST design patterns don’t really give you a cookbook for. Here’s an example of how it works: the representation of a server includes a bunch of “controller” URIs; a POST to one of these constitutes a request to do something, say reboot the server.
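A rough sketch of the controller idea, under stated assumptions: the `controllers` key, the operation names, the `example.com` URIs, and the `note` body are all hypothetical, not taken from the spec. The code only builds the POST request and never sends it.

```python
import json
import urllib.request

# Hypothetical server representation; the "controllers" key and its
# URIs are invented for this sketch.
server = {
    "name": "web01",
    "run_status": "RUNNING",
    "controllers": {
        "reboot": "https://cloud.example.com/zz/9/ops/reboot",
        "stop": "https://cloud.example.com/zz/9/ops/stop",
    },
}


def controller_request(representation, operation, note=""):
    """Build (but do not send) the POST that asks for an operation."""
    # The client never constructs this URI; it reads it from the representation.
    uri = representation["controllers"][operation]
    body = json.dumps({"note": note}).encode("utf-8")
    return urllib.request.Request(
        uri,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


req = controller_request(server, "reboot", note="kernel patch")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually issue the request; that step is left out so the sketch runs without a network.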
On top of the low-level REST there’s a set of libraries for those who’d rather not deal with HTTP messaging; available off the top in Java, Ruby, and Python. (Hmm, the other day I saw somebody check something into a directory called php, but that’s not a commitment).
On top of that there’s a command-line interface suitable for shell-scripting, except that it emits JSON instead of classic-Unix lines-of-text. I wonder how that will work out?
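One way a script might cope, sketched under assumptions: the JSON string below is a made-up stand-in for the output of some listing command (the field names `name` and `status` are hypothetical), and a few lines of glue flatten it back into tab-separated text that cut, sort, and awk can chew on.

```python
import json

# Pretend this string is the stdout of a hypothetical CLI command
# that lists servers; the field names are assumptions.
cli_output = (
    '[{"name": "web01", "status": "RUNNING"},'
    ' {"name": "db01", "status": "STOPPED"}]'
)


def to_lines(json_text, fields):
    """Flatten a JSON array of objects into tab-separated lines."""
    rows = json.loads(json_text)
    # Missing fields become empty columns so every line has the same shape.
    return ["\t".join(str(row.get(f, "")) for f in fields) for row in rows]


lines = to_lines(cli_output, ["name", "status"])
print("\n".join(lines))
```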
Finally, there’s a Web GUI so you can build your VDC by dragging and dropping things around. It’s nice demo-ware and I can see people using that for getting a quick ad-hoc server deployment on the air on short notice. But my bet is that for heavy lifting, you’re going to want to script your deployments, not drag-and-drop them.
Zero Barrier to Exit · Maybe the single most interesting thing about this API is that the spec is published under a Creative Commons “Attribution” license, which means that pretty well anybody can do pretty well anything with it. I’m pretty convinced that if Cloud technology is going to take off, there’ll have to be a competitive ecosystem; so that when you bet on a service provider, if the relationship doesn’t work out there’s a way to take your business to another provider with relatively little operational pain. Put another way: no lock-in.
I got all excited about this back in January at that Cloud Interop session. Anant Jhingran, an IBM VIP, spoke up and said “Customers don’t want interoperability, they want integration.”
“Bzzzzzzzzzt! Wrong!” I thought. But then I realized he was only half wrong; anyone going down this road needs integration and interoperability.
So that’s what we’re trying to do here. We’ve done a lot of work to keep the interfaces generic rather than Sun-specific, and I think we won’t be the only provider of cloud-computing services through this API.
A Work In Progress · Not only is the API CC-licensed and free for use by anybody, it’s not finished yet. We’ve got a lot of back-end infrastructure already built, but there’s still time to refine and improve the API before we’re in ship/lockdown mode. The work’s being done in public over at a Kenai.com project called The Sun Cloud APIs. The spec-ware is on a set of wiki pages starting here. If you want an introduction, the place to start is “Hello Cloud” — An illustrative walk-through of the Sun Cloud API.
If you want to be part of the design process, get yourself a Kenai login and join the project. That gets you a ticket to the forums (which have an Atom feed, thank goodness). There’s no rule saying committers have to be Sun people, down the road; this should be a meritocracy.
How about taking this to a standards organization? I suppose I’d be OK with that once there are a few implementors, and proof that it works. We’re confident that we can build infrastructure behind every interface that’s in there now, which is good; if someone else could do so independently, that’d be better. If we were going to do that, my feeling is that the right level to standardize would be the REST/HTTP interface; let implementors compete to offer slick high-level programming-language APIs.
Why REST? · It’s a sensible question. The chief virtue of RESTful interfaces is massive scaling. But gimme a break, these are data-center management operations; a typical transaction frequency would be a single-digit number per week, with the single digit often being “0”, and it wouldn’t be surprising if a big multi-cluster staged-boot operation had a latency of minutes. The data-center controls are unlikely to be a bottleneck.
Why, then? Simply because we wanted a bits-on-the-wire interface. APIs, in the general case, suck; and are really hard to make portable. Bits-on-the-wire are ultimately flexible and interoperable. If you’re going to do bits-on-the-wire, why not use HTTP? And if you’re going to use HTTP, use it right. That’s all.
However I think we will be forgiven, in this case, for not really sweating the ETags and caching part of the spec yet.
My Fingerprints · I’ve been working on the specification at the REST level. Most of the heavy lifting was done by Craig McClanahan with guidance from Lew Tucker. I played my accustomed role as designated minimalist: the API has become noticeably smaller since I got involved. I suspect Craig is still feeling a bit traumatized by my enthusiastic wielding of the spec machete.
Also I’ve been implementing a glue-code bridge between the REST API and the Q-layer back-end. It’s in Ruby and so far I’m talking straight to Rack; the “router” is just a big case statement over URI-matching regexps.
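To show the shape of that kind of router, here is a sketch that is emphatically not the bridge code: it is a Python/WSGI analogue of the Ruby/Rack approach, with an if/elif chain standing in for the case statement. The URI patterns, status choices, and response bodies are all invented, and nothing here touches a real back-end.

```python
import json
import re

# Hypothetical URI patterns; the real URI space is up to the server.
VDC = re.compile(r"^/vdc/(\w+)$")
VM_OP = re.compile(r"^/vm/(\w+)/ops/(\w+)$")


def app(environ, start_response):
    """WSGI app whose router is one chain of regexp tests."""
    method = environ["REQUEST_METHOD"]
    path = environ["PATH_INFO"]

    if method == "GET" and path == "/":
        status, body = "200 OK", {"vdcs": []}
    elif method == "GET" and (m := VDC.match(path)):
        status, body = "200 OK", {"vdc": m.group(1)}
    elif method == "POST" and (m := VM_OP.match(path)):
        # A real bridge would call the back end here.
        status, body = "202 Accepted", {"vm": m.group(1), "op": m.group(2)}
    else:
        status, body = "404 Not Found", {"error": "no such resource"}

    payload = json.dumps(body).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(payload)))])
    return [payload]


def call(method, path):
    """Invoke the app directly, without a web server."""
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    chunks = app({"REQUEST_METHOD": method, "PATH_INFO": path}, start_response)
    return seen["status"], json.loads(b"".join(chunks))


print(call("POST", "/vm/web01/ops/reboot"))
```

The appeal of doing it this way is that the whole URI space is visible in one screenful; the cost is that the chain grows with every new resource type.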
I’m not sure, at this point, whether it’s a proof-of-concept or ends up shipping. The Q-layer interface is a moving target; we just completed the acquisition around January 1 and they’re making a bunch of changes to morph the product into what we need for the Sun Cloud.
Open source? Maybe, if it turns out to work. The subject hasn’t even come up.
The Business End · How do you make money in clouds? I’m not convinced that there are big fat margins in operating global-scale infrastructure, competing with Amazon AWS and its peers. I am 100% convinced that if there were a general-purpose platform for running behind the firewall to automate scaling and deployment and take IT out of many loops, there are a whole lot of enterprises who’d love that kind of elasticity in their infrastructure.
Machine virtualization is a big deal, obviously. Lightweight lock-in-free data-center virtualization might be bigger, I think.
Comment feed for ongoing:
From: Simon Brocklehurst (Mar 18 2009, at 05:05)
It sounds really great. I've been waiting for something like this for quite some time.
Not sure I believe IBM will see it like that though. Pity. Be interesting to see if the project gets off the ground now...
From: Christopher Mahan (Mar 18 2009, at 05:37)
You guys did look at YAML, right?
In any case, I think this makes great sense, and I am looking forward to hearing more about it!
From: James Corvin (Mar 18 2009, at 05:42)
No SOAP-based API? Doesn't really seem ready for the enterprise then. I presume there is at least a WADL for it??
From: Robert Young (Mar 18 2009, at 06:20)
>> Anant Jhingran, an IBM VIP, spoke up and said “Customers don’t want interoperability, they want integration.”
>> “Bzzzzzzzzzt! Wrong!” I thought. But then I realized he was only half wrong; anyone going down this road needs integration and interoperability.
If today's report is true, that Sun will now be a Planet in IBM, then all of this may be for naught; what IBM calls interoperability is lock-in to the rest of us.
From: Bob Aman (Mar 18 2009, at 06:22)
Christopher,
YAML is nice, but JSON was absolutely the right choice here.
From: Robert Young (Mar 18 2009, at 06:56)
Fumble fingers, swap interoperability for integration. Sorry about that, Chief.
From: len (Mar 18 2009, at 07:40)
From my seat, the problem of cloud computing is the business problem. Service offerings coarsen (bigger bundles of services including stuff you don't need but have to pay for to get the stuff you do need). That is a major problem for application players. You are right about the competitive ecosystem because the only way application vendors will have any leverage is to negotiate new services whenever a supplier gets acquisitive.
As the cloud vendors lose ad revenue, they are quickly jacking up service prices. That's capitalism but unpredictable spikes in razor margins tend to push businesses out of the cloud.
The cloud turns the web from a TV antenna into cable TV. It only gets more expensive without improving what there is to watch. And that is one key of cloud computing: keeping the content on the same server system, eg, selling the geocoding and hits by allowing one to use the maps but never selling the maps.
And that is how Microsoft absorbed smaller fish and the pattern is repeating everywhere. We dress up the technology, tell them it will be 'better faster cheaper', and then the business decisions determine the reality based on access and possession.
Not exactly evolution.
I realize that contravenes current thinking but there it is.
From: Erik Engbrecht (Mar 18 2009, at 07:55)
I'm glad someone is having the sense to pursue this line of business, but don't underestimate the difficulty in keeping IT's hands out of it.
From: Mike Linksvayer (Mar 18 2009, at 08:00)
<blockquote>Maybe the single most interesting thing about this API is that the spec is published under a Creative Commons “Attribution” license, which means that pretty well anybody can do pretty well anything with it.</blockquote>
That's great, though ought to be qualified -- anyone can do pretty well anything with the spec text. There's every reason to permit that, but it's just one of potentially several things to do in order to ensure that a spec is really open to other implementations and unforeseen uses. Some thoughts at What good is a CC licensed specification?
Putting a spec under a liberal CC license is a no-brainer and Sun did the right thing in doing that. But that act's impact without a bunch of other stuff in place shouldn't be overhyped.
From: Paul Hoffman (Mar 18 2009, at 08:16)
I'm not as cloud-happy as Tim, but I really like the open API aspect. http://lookit.typepad.com/lookit/2009/03/suns-cloud-has-one-interesting-part.html
From: Tim (but not THE Tim) (Mar 18 2009, at 13:40)
"don't underestimate the difficulty in keeping IT's hands out of it"
I am an 'IT guy' - from the comment I infer that you mean 'those guys who administer our data centers prevent us from trying any new technology'.
We're not all that draconian; often it's an organizational cultural problem with some people in IT having attitudes which aren't warranted.
That said, it will be interesting to see what happens in organizations if business offices _do_ each set up their own virtual data center. At our shop, and by that I mean don't generalize it to everyone, we have enough trouble with business offices creating their own applications and web servers - when it works, they're happy folk and loving their empowerment. When it breaks - then it's time to call 'the IT guys'. I am not against new things, I am against having to fix things that I didn't create, and would have created in a more robust manner had I been asked to create them. My emotional response is to say 'your code, your server, your problem' but the business units almost _always_ have the clout to say 'this is impactful to the business and must be fixed' which translates through our boss to 'learn it, figure out what's wrong with it, and fix it so you get this guy off my phone!'
From: Mark Nottingham (Mar 18 2009, at 17:35)
WebDAV? Hmm... Any details there about interop, what features are supported, etc.? In some ways, WebDAV is as complicated as SOAP...
From: Mike Seminaro (Mar 18 2009, at 20:29)
The API looks very clean, and the abstractions seem like a good choice to me. Just a couple comments/questions:
1) I understand how the approach of not committing to a particular URI pattern leaves a lot of flexibility in terms of future implementation. The trade off, IMO, is that it makes programming against the interface a little more difficult. Before you can do any operation you must first walk the high level resources to "seed" your API implementation with the function to URI mapping.
2) Was surprised to see the identity/auth method is simply basic auth. Not sure that's a bad thing, but I don't see many people doing that with their public APIs these days. The much more prevalent style is to issue API keys and have some portion of the requests signed so that the "password" never goes over the wire and that there is less exposure to man in the middle/replay attacks (usually a timestamp is part of the signature also). Would be interested in hearing what thought went into the choice of basic auth and whether or not this may potentially change before the API goes final.
From: Erik Engbrecht (Mar 19 2009, at 05:25)
@tim but not the tim
I'm not referring to the technologists in IT, they suffer right along with the users, but as far as I can tell they are in a dwindling minority.
From: John Cowan (Mar 19 2009, at 06:37)
"Controller URI" is a very nice name. Sometimes a proper REST interface needs a little bag (of SOAP, as it were) on the side that you can poke to get things done. The Unix file API is pretty RESTlike (and the Plan 9 file API even more so; when Plan 9 says everything is a file, they really mean it, including the current sets of windows on your screen and buffers in your editor even), and then there's ioctl().
So, are you looking forward to painting yourself blue when you go into battle, in the manner of your Pictish ancestors?
From: Avi Flax (Mar 19 2009, at 13:23)
Tim: congrats on this, it's very impressive and exciting!
I'm a proponent of REST and I have a question about how you designed the API. I appreciate that there are pros and cons to different levels of RESTfulness, so this isn't a flame; I really am just curious: why did you use POST to URIs with verbs in them, instead of fully relying on clients sending representations of desired state to resources? I hope you don't mind sharing a few thoughts on how you arrived at this particular level of RESTfulness.
Thanks!
From: Erik Mogensen (Mar 19 2009, at 13:45)
First of all, kudos for choosing JSON based representations. All you need to do is document the object structure and what the structure means.
Two comments.
Sorry for blowing the "High REST" horn here... You use the term REST, yet you say that you have to POST to the "/ops/attach" to invoke the "attach" method. The net result is that some resource changes state. Why not just PUT the resource with the desired state?
To start a VM, I need to POST a "note" to ".../ops/start" to start the VM. Why not GET the state of the VM and modify { ... "state" : "stopped" ... } to "starting" and PUT it back?
It's wonderful that you're required to navigate the resources to find links to these switches, but it would be even better if you asked to modify those resources to manipulate state.
My real question is. Are these the types of things you would consider "fixing", in light of the spec not being finished?
It would be oh so wonderful for the community to help create an API that was truly in the spirit of REST, so that at least ONE API could wear the REST badge with pride.
Secondly, Hmm.. I forgot what the other thing was. Ho hum. I'd love to help, but it'd be on my (limited) spare time...
From: Bill de hÓra (Mar 19 2009, at 15:22)
@mnot: "WebDAV?"
Agree re complexity. Presumably it'll be whatever the Sun Storage 7000 series supports. It kind of makes sense to not require an intermediary mapping layer right now, but ultimately WebDAV doesn't quite fit for representing a data center.
If there was an extension for the controller uri idiom to drop into the service document AtomPub becomes an option.
Alternatively AtomPub could be ported to JSONPub - that would be a fun thing to do; if you sprinkled on forms for partial updates, that strikes me as having limitless adoption potential.
From: Keith (Mar 19 2009, at 22:14)
James Corvin:
"No SOAP-based API? Doesn't really seem ready for the enterprise then."
I disagree. Enterprise scale organizations can digest and use JSON, REST, and current standard technologies as easily as they can SOAP. There's no need for a hugely complicated implementation just to adhere to big-business standards.
Sun has a hugely forward-looking approach here, with true openness that will help enterprises adopt cloud technologies more easily, with less lock-in and more consumability.
From: Roshan Sequeira (Mar 20 2009, at 07:42)
<quote>
...
I am 100% convinced that if there were a general-purpose platform for running behind the firewall to automate scaling and deployment and take IT out of many loops, there are a whole lot of enterprises who’d love that kind of elasticity in their infrastructure
...
</quote>
There is a platform, Appistry CloudIQ (http://www.appistry.com), that runs behind the firewall. And companies such as FedEx, Geo-Eye, etc. who use it, love it :)
In fact one of our customers has one guy who is a part-time system admin/ part-time developer managing 450 machines in their on-premise cloud running their mission critical code powered by Appistry's CloudIQ :)
I love the REST APIs you guys came up with.
The cloud area is getting better and better every day.
From: Seth W. Klein (Apr 07 2009, at 19:11)
"On top of that there’s a command-line interface suitable for shell-scripting, except for it emits JSON instead of classic-Unix lines-of-text. I wonder how that will work out?"
Right now....
The JSON query language seems to be Jaql. It requires Java (and possibly Hadoop) and has more in common with SQL than anything you'd traditionally see in a one-liner. One-liners, of course, are baby multi-liners and babies are important ;)
JsonPath expects to be included in JavaScript or PHP source, so it's not even trying. The same thing applies to JsonT except without the PHP.
In contrast, we have well understood methods for transforming classic Unix line and field formats. They, and the tools implementing them, have been around for literally decades and are very widely deployed (every Unix box, all OS X machines, and at least a couple options for Windows).
So right now JSON out of a shell tool is not so good.
More things like this will create pressure for development of tools to change that, but years of widespread XML/HTML deployment have only produced a few oddly maintained tools. Perhaps that's because you can scrape quite a bit of the web with a couple sed passes, and if I were to have to deal with the mentioned tools, that's probably the route I'd take.
From: http://cloudberrylab.com (Apr 09 2009, at 02:42)
CloudBerry Explorer is a freeware Windows Client for Amazon S3. We would like to support SUN storage API based on Amazon S3 API in our tool and we are wondering what the community thinks about it. Thanks!