Structuring your app with Units

A major component of our design for a message-passing framework is the idea of a Unit. A unit in our language is an aggregation of data and code that can receive and send messages. It's analogous to an object in object-oriented languages.

Note that I said analogous. They're defined similarly to objects in that you specify fields and functionality, but you have to treat units differently from objects because of their runtime characteristics. Specifically, at most one message that uses the state of a particular unit can execute at once. This has extremely interesting implications.

For one thing, a unit's state is consistent between messages. What do I mean by that? Well, contrast it with most shared-memory systems. If two threads are trying to increment a shared counter, one thread's read might land in the middle of the other thread's read, add-one, and write sequence. If this interleaving occurs, both threads write back the same value, one increment is lost, and the system is left in an "inconsistent" state. We obviate synchronization problems like this one. The analogous situation for us is that an incrementer unit receives two simultaneous messages to "increment yourself", which is defined as read, add-one, and write. However, the runtime ensures that the two "simultaneous" messages are actually executed in sequence, so the problem above never occurs.
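To make the incrementer example concrete, here is a minimal sketch in Python. It is not our runtime: it assumes a unit can be modelled as private state plus a mailbox drained by a single worker thread, and the names IncrementerUnit, send, and run_demo are made up for illustration. Several threads send "increment yourself" at once, but the read, add-one, and write steps of one message never interleave with those of another.

```python
import queue
import threading


class IncrementerUnit:
    """Hypothetical unit: private state plus a mailbox drained by one worker."""

    def __init__(self):
        self._count = 0                  # only the worker thread touches this
        self._mailbox = queue.Queue()    # thread-safe FIFO of pending messages
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def send(self, message, reply=None):
        """Enqueue a message; the caller never touches the state directly."""
        self._mailbox.put((message, reply))

    def _run(self):
        # Messages are handled strictly one at a time, in arrival order.
        while True:
            message, reply = self._mailbox.get()
            if message == "increment":
                value = self._count      # read
                value += 1               # add-one
                self._count = value      # write
            elif message == "get":
                reply.put(self._count)
            elif message == "stop":
                return


def run_demo(senders=4, per_sender=1000):
    """Send increments from several threads and return the final count."""
    unit = IncrementerUnit()

    def sender():
        for _ in range(per_sender):
            unit.send("increment")

    threads = [threading.Thread(target=sender) for _ in range(senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    reply = queue.Queue()
    unit.send("get", reply)              # queued after every increment
    total = reply.get()
    unit.send("stop")
    return total
```

Calling run_demo(4, 1000) should return exactly 4000, because no increment can be lost. A real runtime is free to do something smarter than one thread per unit, as long as the result looks the same to the messages.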

When I say that only one message on a unit can execute at once, I actually mean something a bit different: only one message appears to execute at a time. In fact, the runtime might be executing many messages at once, but the messages themselves will never notice. Another way to put it is that the runtime enforces that messages are transactional upon the state of the unit: messages appear (to observers -- other messages) to happen all at once and in isolation from other messages.

These transactional semantics have great implications for the design of systems. Traditional object-oriented programming requires the programmer to break up their code by functionality. While designing a Web server, for example, a reasonable object-oriented design would include a module that listens on the socket and provides events, one that parses the HTTP protocol, one that looks an object up in the VFS based on the GET string, and one that spews the object back to the user in the HTTP protocol across the socket. Note how this is segmented by functionality -- take a moment to think about what features of the code are required at each level.

Now think about how these modules communicate. The event comes in. That code (the event handler) probably calls a parseHTTP method to obtain an HTTPMessage structure. Then the event handler passes that to the parseGET method, which calls vfsLookup at each level of the GET string until it reaches a terminal object, and returns that object. Lastly, the event handler passes the GET string, any query options and post data, and the socket descriptor to the object and tells it to handle the request and send the user the data. For file objects, the object just sends the MIME type and the data. For other objects, some special code might be run.
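The call chain in that object-oriented design can be sketched roughly as follows. The names parseHTTP, parseGET, vfsLookup, and HTTPMessage come from the description above (hence the non-Pythonic spelling); everything else, including the toy VFS of nested dicts, the FileObject class, and handle_event, is an assumption made up for illustration. A file-like object stands in for the socket descriptor.

```python
import io
from dataclasses import dataclass


@dataclass
class HTTPMessage:
    method: str
    path: str


class FileObject:
    """Hypothetical terminal object: sends its MIME type and its data."""

    def __init__(self, mime_type, data):
        self.mime_type = mime_type
        self.data = data

    def handle(self, message, out):
        header = f"HTTP/1.0 200 OK\r\nContent-Type: {self.mime_type}\r\n\r\n"
        out.write(header.encode("ascii"))
        out.write(self.data)


# Toy VFS for the sketch: nested dicts are directories, FileObjects are leaves.
VFS = {"docs": {"hello.txt": FileObject("text/plain", b"hello")}}


def parseHTTP(raw):
    """Parse only the request line; headers and body are ignored here."""
    request_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    method, path, _version = request_line.split(" ")
    return HTTPMessage(method, path)


def vfsLookup(directory, name):
    """Resolve one path component inside one directory."""
    return directory[name]


def parseGET(message):
    """Walk the GET string one level at a time until a terminal object."""
    node = VFS
    for component in message.path.strip("/").split("/"):
        node = vfsLookup(node, component)
    return node


def handle_event(raw, out):
    """The event handler drives every step, one function call after another."""
    message = parseHTTP(raw)
    target = parseGET(message)
    target.handle(message, out)
```

For example, calling handle_event(b"GET /docs/hello.txt HTTP/1.0\r\n\r\n", io.BytesIO()) writes a text/plain response whose body is hello. Notice that the event handler sits in the middle of everything: each stage returns to it before the next one starts.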

In our system, you can organize your functionality in this way, but it's more important to think about the data. What data is involved in handling a web request? Clearly, each web request is a separate event to the server, and requests share no state. So the incoming socket listener is probably bound to a "free operation" -- a message with no unit attached. This free op accepts data on the socket, parses it (using an HTTP parser library), parses the GET string into components (using another library), and sends these components to a VFS unit which knows how to do the lookups. This unit may just be a front-end to a tree of separate VFSDirectory units. Anyway, once the path components are resolved into a target object (itself another unit), the object is passed a message with the correct arguments as well as the capability of writing to the socket (derived from the event) and is expected to write its output to the socket in the same way.
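Here is a rough Python sketch of the unit-based flow just described. Again, this is not our language or runtime: a unit is approximated by a mailbox with one worker thread, and Unit, FileUnit, SocketWriter, and handle_request are hypothetical names. Only VFSDirectory comes from the text, and here the root directory doubles as the VFS front-end. The free operation holds no state of its own; it parses the request and sends the path components, plus the capability to write to the socket, on to the units.

```python
import queue
import threading


class Unit:
    """Hypothetical base: one mailbox, one worker, so handlers never overlap."""

    def __init__(self):
        self._mailbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def send(self, *message):
        self._mailbox.put(message)

    def _run(self):
        while True:
            self.receive(*self._mailbox.get())


class SocketWriter:
    """Stand-in for the capability to write to the socket."""

    def __init__(self):
        self.data = b""
        self.done = threading.Event()

    def finish(self, payload):
        self.data = payload
        self.done.set()


NOT_FOUND = b"HTTP/1.0 404 Not Found\r\n\r\n"


class VFSDirectory(Unit):
    """Resolves one path component, then forwards the rest to a child unit."""

    def __init__(self, entries):
        self.entries = entries           # name -> child unit
        super().__init__()

    def receive(self, components, writer):
        if not components or components[0] not in self.entries:
            writer.finish(NOT_FOUND)
            return
        self.entries[components[0]].send(components[1:], writer)


class FileUnit(Unit):
    """Target object: writes its MIME type and data through the capability."""

    def __init__(self, mime_type, data):
        self.mime_type = mime_type
        self.data = data
        super().__init__()

    def receive(self, components, writer):
        if components:                   # a file has nothing beneath it
            writer.finish(NOT_FOUND)
            return
        header = f"HTTP/1.0 200 OK\r\nContent-Type: {self.mime_type}\r\n\r\n"
        writer.finish(header.encode("ascii") + self.data)


def handle_request(raw, root, writer):
    """Free operation: no unit attached, so it touches no shared state."""
    request_line = raw.split(b"\r\n", 1)[0].decode("ascii")
    _method, path, _version = request_line.split(" ")
    components = [part for part in path.split("/") if part]
    root.send(components, writer)        # hand off; nothing returns to us
```

To try it, build a tree such as VFSDirectory({"docs": VFSDirectory({"hello.txt": FileUnit("text/plain", b"hello")})}), call handle_request with a raw GET request and a SocketWriter, and wait on the writer's done event. Unlike the object-oriented sketch, nothing returns to a central event handler: each unit passes the request along, and the target writes the reply itself.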

So what's different? Units make the user focus first on the data, not on the code. Only after the system is correctly separated into units do we think about the organization of functionality, which is nearly orthogonal to the question of how data is broken into units.

Code and more case studies to follow.