Monday, January 5, 2009

Method calls should be immediate

There's a famous paper, A Note on Distributed Computing, by Jim Waldo et al., arguing that the procedure calls of a language should not be made transparently remote. It's a story that has repeated itself over the years: code starts appearing on multiple networked machines, people want to abstract away from the network, and so they add transparent remote procedure calls of some kind. To my knowledge it has always gone badly, with the possible exception of cluster computing. Waldo's Note lists four killer differences between local and remote calls: latency, memory access, partial failure, and concurrency. I'm convinced.

When it comes to code splitting, Waldo's Note is being challenged again. Doloto and other AJAX tools take the approach where every method call potentially calls into non-loaded code. When that happens, the method call blocks until more code is loaded.

This scenario is not exactly the same as a transparent remote call. Memory access is no longer an issue, because all data of the computation is stored on a single computer. Partial failure could arguably be avoided by deciding all failures will cause the app to shut down. What about latency, though?

Latency looks like a very hard problem for the approach. People developing web applications try to make their sites start up with a minimum of round trips. They aim for numbers like one round trip, or two round trips. To actually achieve such low latencies, programmers must be thoroughly in control of where delays happen, not have those delays happen at any old method call. Further, programmers would like to give some feedback to the user while a download is happening. How can they implement this if any method call might block for more downloading? If execution is blocked, the program can't possibly execute more code to put up a feedback message.

The challenges are severe, so I look forward to seeing how these systems address them.

For the Google Web Toolkit, we are trying a different approach. Regular method calls stay as normal and run immediately. However, wherever the programmer explicitly specifies a split point, the compiler is allowed to arrange for code to download later. A split point looks like this:

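public void onComposeMailButtonClicked() {
  // Split point: the code reachable only from onSuccess() may be downloaded later.
  GWT.runAsync(new RunAsyncCallback() {
    public void onSuccess() {
      // Runs once the code behind the split point has been loaded.
      activateComposeMailView();
    }

    // RunAsyncCallback declares the failure argument as a Throwable.
    public void onFailure(Throwable reason) {
      Window.alert("Server cannot be reached.");
    }
  });
}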

There is no mistaking this for a regular method call! Notice that it looks just like passing an event handler into a GUI framework such as Swing. The event handler is specified as an anonymous inner class. In this case there are two methods on the event handler, one called once the code is downloaded, and one called in case there is any network failure. Note that the latter means partial failure is still supported. You can design an application to keep running but with reduced functionality.

With this arrangement, programmers know exactly where a network download can occur and thus can design a loading pattern that will make their application start quickly. Just as importantly, though, regular method calls remain regular method calls, and programmers don't have to worry about extra network activity or failure conditions.
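
To make the feedback point concrete, here is a minimal sketch in plain Java. It does not use GWT: LoadCallback, FakeLoader, and clickCompose are hypothetical names made up for illustration, and the "download" is simulated by a queue that gets drained later, roughly the way a browser event loop would deliver the callback. The only thing it tries to show is that the click handler returns immediately, so it can put up a loading message before the code arrives and take it down on either outcome.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class SplitPointSketch {
    /** Hypothetical stand-in for a split-point callback; not GWT's actual interface. */
    interface LoadCallback {
        void onSuccess();
        void onFailure(Throwable reason);
    }

    /** Hypothetical loader that simulates deferred code downloads with a queue. */
    static class FakeLoader {
        private final Queue<Runnable> pending = new ArrayDeque<>();
        private final boolean networkUp;

        FakeLoader(boolean networkUp) {
            this.networkUp = networkUp;
        }

        /** Returns immediately; the callback runs only when the "download" is delivered. */
        void runAsync(LoadCallback callback) {
            pending.add(() -> {
                if (networkUp) {
                    callback.onSuccess();
                } else {
                    callback.onFailure(new RuntimeException("download failed"));
                }
            });
        }

        /** Delivers all finished "downloads", as an event loop would on a later turn. */
        void deliverPending() {
            Runnable next;
            while ((next = pending.poll()) != null) {
                next.run();
            }
        }
    }

    /** Simulates a click on "compose mail" and records what the user would see, in order. */
    static List<String> clickCompose(boolean networkUp) {
        List<String> log = new ArrayList<>();
        FakeLoader loader = new FakeLoader(networkUp);

        log.add("show spinner");            // feedback goes up before any waiting starts
        loader.runAsync(new LoadCallback() {
            public void onSuccess() {
                log.add("hide spinner");
                log.add("compose view active");
            }

            public void onFailure(Throwable reason) {
                log.add("hide spinner");
                log.add("compose unavailable"); // the app keeps running with reduced functionality
            }
        });
        log.add("handler returned");        // the method call itself did not block

        loader.deliverPending();            // later: the simulated download completes or fails
        return log;
    }

    public static void main(String[] args) {
        System.out.println(clickCompose(true));
        System.out.println(clickCompose(false));
    }
}
```

Running main prints [show spinner, handler returned, hide spinner, compose view active] and then [show spinner, handler returned, hide spinner, compose unavailable]. In both runs the handler returns before the callback fires, which is exactly what a transparently blocking method call cannot offer.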
