Calliope Sounds: Streaming content for non-media applications

The vast majority of web apps and mobile apps use JSON or XML on the wire to transfer data. It is relatively easy to design a usable schema, and textual data on the wire is easier to debug. The schema designs, however, tend to emphasize an analytical perspective rather than a performance one. For example, the schema for hierarchical data will typically look like this:

node     := data children
children := node*
data     := field+
field    := name value
name     := string
value    := string

The problem with this design is that you can know the structure of the data only after reading the whole document. You cannot do anything for the user between making the request for data and getting the last byte of the response.

I don't know about you, but I hate waiting for all of the data when I don't want all of it. When I am in browsing mode or manual indexing mode [*] I only want to get the gist and move along quickly. For example, if I am flipping through a list of the movies playing at the local cinema I don't need an image of each movie's poster. What I do need is to know that there are 6 movies, the 6 titles, and the next two showings of each movie. Once I settle on a movie I might be interested in seeing the promotional poster, trailer, and reviews, and so I am willing to wait for this data to fill into the UI. (Don't have me press a "more details" button, please!)

Most schema designers will create something like this:

movies        := (movie)*
movie         := title rating director casting 
                 poster-url trailer-url movie-reviews 
                 showtimes
movie-reviews := star-rating reviews
director      := person
casting       := person part
person        := string
part          := string
showtimes     := showtime+
showtime      := start-time end-time
start-time    := number ':' number
reviews       := review*
review        := person star-rating comment-string
...

You get the idea: a straightforward and clear hierarchical organization of the data. The problem with this design is that a web app or mobile app has to read far more bytes than it needs in order to enable browsing and manual indexing. It has to read everything up to the first byte of the last movie's node just to know that there are 6 movies. That could be several thousand bytes of data just to learn the number 6.
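To make the cost concrete, here is a rough sketch in Python that compares how far into the response a client must read before it can know the movie count. The data is invented for illustration (six movies, each with a 500-character stand-in for its reviews), and JSON is assumed as the hierarchical encoding; real payloads will differ.

```python
import json

# Hypothetical data: six movies, each carrying a 500-character review blob.
movies = [
    {"title": f"Movie {i}", "showtimes": ["6:30", "9:40"], "reviews": "x" * 500}
    for i in range(1, 7)
]

# Hierarchical encoding: every movie is a self-contained node.
nested = json.dumps({"movies": movies})

# Offset of the first byte of the last movie's node. A reader cannot have
# counted six movies before reaching at least this point.
nested_offset = nested.index('{"title": "Movie 6"')

# Count-first encoding: the count is the first token on the wire.
flat = "6 m1 m2 m3 m4 m5 m6 ;"

# The count is known once the first token and its delimiter have been read.
flat_offset = flat.index(" ") + 1

print("hierarchical:", nested_offset, "bytes before the count is known")
print("count-first: ", flat_offset, "bytes before the count is known")
```

With this made-up data the hierarchical document needs well over 2,500 bytes before the sixth movie is even visible, while the count-first form needs two.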

Network performance is not instantaneous, and cell network performance is worse still. Schema designers need to stop designing data structures as though the data will be transferred instantaneously from the host to the client. We don't do this for audio or video and we should not do it with other data either. Data on the wire should be designed as a sequence of explicitly organized packets, ordered so that the application stays usable even under poor network throughput.

In the case of the movie listings, the following is a better data structure:

movies      := count id* ( title | 
                           showtime | 
                           poster-url | 
                           trailer-url | 
                           ... )*
title       := id 'title' string
showtime    := id 'showtime' timestamp
poster-url  := id 'poster-url' url
trailer-url := id 'trailer-url' url
review      := ...

Here the related data is explicitly linked together with ids, tags, and values rather than implicitly by position within a hierarchy. Next, order the transfer of data to the client such that the client can use progressive disclosure. The user will now be able to see some structure and some content and so be able to perform some interaction with the application before the whole structure and all the content is delivered.
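As a sketch of what the client side of this might look like, the Python below keeps a model keyed by id that can be displayed at any moment, however little has arrived. The class name `MovieListing`, its methods, and the "(loading)" placeholder are all hypothetical, chosen only for illustration.

```python
class MovieListing:
    """Client-side model that is filled in one record at a time."""

    def __init__(self, count, ids):
        # The header gives the structure up front: how many movies, and their ids.
        if count != len(ids):
            raise ValueError("header count does not match the number of ids")
        self.ids = list(ids)
        self.records = {movie_id: {} for movie_id in ids}

    def apply(self, movie_id, tag, value):
        # Records are linked by id and tag, so they may arrive in any order.
        self.records[movie_id].setdefault(tag, []).append(value)

    def rows(self):
        # What the UI can show right now; missing titles get a placeholder.
        rows = []
        for movie_id in self.ids:
            record = self.records[movie_id]
            title = record.get("title", ["(loading)"])[0]
            rows.append((title, list(record.get("showtime", []))))
        return rows


listing = MovieListing(2, ["m1", "m2"])
before = listing.rows()   # skeleton only: two rows, no content yet

listing.apply("m1", "title", "Inception")
listing.apply("m1", "showtime", "6:30")
after = listing.rows()    # first row filled in, second still loading

print(before)
print(after)
```

The point of the design is that `rows()` never has to wait: the skeleton is drawable as soon as the header arrives, and each later record only adds detail.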

In the case of the movie listings it might look like this:

6 m1 m2 m3 m4 m5 m6 ;
# now get the titles and next showtimes for the movies
m1 title Inception ;
m1 showtime 6:30 ; 
m1 showtime 3:10 ; 
m2 title The Sorcerer's Apprentice ;
m2 showtime  4:50 PM ;
m2 showtime  7:20 PM ;
m3 title Despicable Me ;
m3 showtime ... 
m3 showtime ...
m4 title ... ;
m4 showtime ...
m4 showtime ...
m5 title ... ;
m5 showtime ...
m5 showtime ...
m6 title The Last Airbender ;
m6 showtime ...
m6 showtime ...
# now get the remaining showtimes
m1 showtime 12:00 ; 
m1 showtime 9:40 ;
m2 showtime 11:45 AM ;
m2 showtime 2:15 PM ;
m2 showtime 10:00 PM ;
...
# now get the star-ratings for the movies
...
# etc

Here I am using a semicolon-delimited encoding to keep the example simple. There is no reason to abandon JSON or XML as the encoding. You must, however, use a streaming JSON or XML parser so that the client application can get at the data as soon as it arrives.
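For the semicolon-delimited example itself, an incremental parser can be very small. The Python sketch below is one possible reading of the format, not a specification: it assumes every record ends with `;`, that lines starting with `#` are comments containing no `;`, and that the first record is the count followed by the ids. The class name `RecordStream` and its `feed` method are hypothetical.

```python
class RecordStream:
    """Incremental parser: feed it chunks as they arrive off the network."""

    def __init__(self):
        self._buffer = ""
        self.ids = None  # set once the header record has arrived

    def feed(self, chunk):
        """Return the (id, tag, value) records completed by this chunk."""
        self._buffer += chunk
        # Everything before the last ';' is complete; the rest stays buffered.
        *complete, self._buffer = self._buffer.split(";")
        records = []
        for piece in complete:
            # Drop comment lines, then split the record into tokens.
            lines = [ln for ln in piece.splitlines()
                     if not ln.lstrip().startswith("#")]
            tokens = " ".join(lines).split()
            if not tokens:
                continue
            if self.ids is None:
                # Header record: count followed by that many ids.
                count, ids = int(tokens[0]), tokens[1:]
                if count != len(ids):
                    raise ValueError("header count does not match ids")
                self.ids = ids
            else:
                movie_id, tag, *value = tokens
                records.append((movie_id, tag, " ".join(value)))
        return records


# Chunks deliberately split in awkward places, as a network might deliver them.
chunks = [
    "2 m1 m",
    "2 ;\n# titles first\nm1 title Incep",
    "tion ;\nm2 title The Sorcerer's Appren",
    "tice ;\nm1 showtime 6:30 ;\n",
]

stream = RecordStream()
results = [stream.feed(chunk) for chunk in chunks]
for chunk_records in results:
    print(chunk_records)
```

Each call to `feed` hands back only the records that chunk completed, so the application can update the display after every network read instead of after the last one.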

This lesson was taught to me by David Durand when we were working on MAPA at Dynamic Diagrams. One of MAPA's visualizations was a Java applet that displayed a map of locations within a web site. The maps needed to display the canonical path to the location, the local structure of the site around the location, and the kinds of pages there. The data structure we used to communicate the map data had the structural data up front so the applet could start rendering the map (in another thread). Meanwhile, the details about each page came trickling in and could be rendered piecemeal onto the existing skeletal frame as they arrived. (David wanted me to use fixed-width data but I could not bring myself to do that.)

It is only now that I use mobile apps more regularly that I really want his lesson spread as far as possible, because mobile app usability in conjunction with cell network activity just plain sucks. I don't think the primary reason for this is the hardware or the network, and it is probably not the algorithms either. That leaves the data.

[*] Manual indexing is when you have a list of 10 items and you flip through them until you find the one item that you want. I am sure there is an interaction-design term for this but I don't know what it is.