Distributed Tile Caching

The Problem

Existing WMS servers often represent a single point of failure for access to valuable public datasets. Even when public WMS servers continue to function, peaks in bandwidth or processing loads can slow access to a crawl. A distributed cache of WMS-delivered imagery might ameliorate some of these issues.

Peer-to-Peer Tile Sharing

One possible way to address the brittleness engendered by traditional service architectures may be found in distributed hash tables, or DHTs, such as Kademlia or Chord, wherein a data value is stored in and retrieved from the network via a corresponding key value. DHTs typically offer redundant, decentralized network data storage and retrieval. Most DHTs are designed to degrade gracefully in the face of unpredictable network outages.

Here we examine the idea of building a distributed hash table for caching some or all of any OGC WMS layer around the Internet, using a system based on the Kademlia protocol. A cursory understanding of Kademlia is recommended as a basis for approaching this proposal. Additionally, it should be assumed that every cache peer *also* implements a WMS proxy front-end, which would serve as the application interface to the cache.

Spherical Key Space

In order for the caching to be effective, we must introduce the constraint that only pre-defined raster tile sizes, extents, and scales may be requested through the distributed cache. If the tile size is fixed, then each tile may be referred to by its center point and scale.

Now, the geographic nature of the data set being cached immediately suggests that, rather than use a cryptographic hashing function to generate data keys, the keys should instead correspond directly to the extents of the image being stored. In one general set of cases, where the raster layer to be cached is referenced to a geographic coordinate system, the keys of our network might then be tuples of the form (longitude, latitude, scale, [layer name]).

Similarly, when joining a layer cache, each peer would assume a notional "location", randomly selected from within the geographic extents of the layer. (In certain cases -- e.g. Landsat-7 composites -- it might make sense to further constrain locations to areas of actual coverage within the layer's extents.) A peer's "address" within the network would be a tuple of the form (longitude, latitude, IP address, port).
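To make these identifiers concrete, here is a minimal sketch in Python (the tuple layouts, field names, and helpers below are illustrative assumptions, not part of any defined protocol) of a tile key derived from a fixed grid, alongside a peer address chosen at random within a layer's extents:

    import random
    from collections import namedtuple

    # Hypothetical layouts for the two kinds of identifiers described above.
    TileKey = namedtuple("TileKey", "lon lat scale layer")
    PeerAddress = namedtuple("PeerAddress", "lon lat ip port")

    def tile_key(col, row, scale, layer, degrees_per_tile):
        """Key for the tile at grid position (col, row) at the given scale level.

        The key records the tile's center point, so degrees_per_tile is the
        tile width in degrees at that scale level."""
        return TileKey(lon=-180.0 + (col + 0.5) * degrees_per_tile,
                       lat=-90.0 + (row + 0.5) * degrees_per_tile,
                       scale=scale, layer=layer)

    def random_peer_location(extents):
        """Pick a notional location for a joining peer, uniformly within the
        layer's (min_lon, min_lat, max_lon, max_lat) extents."""
        min_lon, min_lat, max_lon, max_lat = extents
        return random.uniform(min_lon, max_lon), random.uniform(min_lat, max_lat)

    # Example: one tile key on a 1/16-degree grid, and a peer joining a global layer.
    print(tile_key(2, 1, scale=0, layer="landsat7_pan", degrees_per_tile=0.0625))
    lon, lat = random_peer_location((-180, -90, 180, 90))
    print(PeerAddress(lon, lat, ip="192.0.2.1", port=4171))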

Since we have redefined the notion of "key" within our Kademlia-like DHT, we must also redefine the notion of "distance" between two keys or two nodes. Great circle distance is one possibility; Euclidean distance through the spheroid between two points on its surface is another, and one that might be cheaper to calculate. It might make sense to evaluate distance internally in, say, radians, rather than kilometers or degrees.
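As an illustration of the second option (a sketch only, assuming a spherical rather than spheroidal Earth), the chord distance between two surface points is cheap to compute and is monotonic in the great circle distance, so either metric orders keys and peers identically:

    from math import radians, sin, cos, sqrt, asin

    def chord_distance(lon1, lat1, lon2, lat2):
        """Straight-line (chord) distance between two points on the unit sphere."""
        lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
        x1, y1, z1 = cos(lat1) * cos(lon1), cos(lat1) * sin(lon1), sin(lat1)
        x2, y2, z2 = cos(lat2) * cos(lon2), cos(lat2) * sin(lon2), sin(lat2)
        return sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)

    def great_circle_distance(lon1, lat1, lon2, lat2):
        """Great circle distance in radians, for comparison; the chord length
        subtending a central angle theta on the unit sphere is 2 * sin(theta / 2)."""
        return 2 * asin(chord_distance(lon1, lat1, lon2, lat2) / 2)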

Storing, Replicating, and Expiring Tiles

A new layer cache could be seeded by creating and distributing a metadata description file, along the lines of trackerless BitTorrent torrent files. The metadata description should include URLs for the originating WMS server(s), layer name(s), coordinate reference systems, layer extents, scale levels, tile size in pixels, expiration times, IP addresses for at least one gatekeeper node, and values for k, alpha, and the recommended number of k-buckets.
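By way of illustration, such a seed description might look like the following (the field names, values, and dictionary layout are placeholders for discussion, not a proposed format):

    # Illustrative layer seed description; the fields mirror the list above.
    LAYER_SEED = {
        "wms_urls": ["http://wms.example.org/wms"],    # originating WMS server(s)
        "layers": ["landsat7_pan"],
        "crs": "EPSG:4326",
        "extents": (-180.0, -90.0, 180.0, 90.0),       # lon/lat bounding box
        "scale_levels": 12,
        "tile_size": 512,                              # pixels on a side
        "expiration": 60 * 60 * 24 * 30,               # seconds (roughly one month)
        "gatekeepers": [("192.0.2.1", 4171)],          # at least one bootstrap node
        "k": 20,                                       # replication factor
        "alpha": 3,                                    # lookup concurrency
        "k_buckets": 24,
    }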

The layer cache might then operate as a Kademlia network in most relevant respects, except for the way in which new tiles enter the network. Consider the case where a peer requests a tile that is not already cached in the network. Eventually, the request will reach a target peer that considers itself to be "closest" among its known peers to the tile's center. If that peer does not have a copy of the requested tile in its local cache, then and only then will it directly request the tile from the layer's originating WMS. After returning the fetched tile to the requesting peer, the target peer would then treat this event as if it were a STORE message in the Kademlia protocol for that tile, and proceed to replicate the newly fetched tile to its alpha nearest neighbors.
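A rough sketch of that fetch-on-miss behaviour (the parameter names and dependency-passing style are assumptions for illustration; a real peer would carry this state itself):

    def handle_tile_request(key, local_cache, fetch_from_wms, nearest_peers,
                            send_store, alpha=3):
        """Handle a request at the peer that believes it is closest to the tile.

        local_cache is a dict-like store, fetch_from_wms pulls the tile from
        the originating server, nearest_peers(key, n) returns the n known peers
        closest to the tile's center, and send_store(peer, key, tile) issues a
        Kademlia-style STORE to one of them."""
        tile = local_cache.get(key)
        if tile is None:
            # Only the "closest" peer ever falls back on the originating WMS.
            tile = fetch_from_wms(key)
            local_cache[key] = tile
            # Treat the cache miss as an implicit STORE: replicate the freshly
            # fetched tile to the alpha nearest known neighbours.
            for peer in nearest_peers(key, alpha):
                send_store(peer, key, tile)
        return tile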

Given the assumption that smaller scale overviews are likely to be requested more often than larger scale tiles, we might seek to cache small scale overviews more widely across the network. For example, a peer might recursively forward (rather than redirect) any request for a tile which contains the location of the peer itself, and then store and even replicate the tile once received.
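One way to express that heuristic (reusing the hypothetical TileKey sketched earlier; the function name is likewise illustrative):

    def covers_own_location(peer_lon, peer_lat, key, degrees_per_tile):
        """True if the tile identified by key (a center point plus scale level)
        covers this peer's own notional location -- the suggested trigger for
        forwarding the request recursively and keeping a copy of the overview."""
        half = degrees_per_tile / 2.0
        return abs(peer_lon - key.lon) <= half and abs(peer_lat - key.lat) <= half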

The expiration of tiles from the cache is a matter worth particular consideration. Certainly, for some WMS data sources, such as OpenStreetMap, a fairly short expiration time would be sensible. For others, like Landsat-7, the expiration time could probably safely be measured in weeks or months, if the tiles need to expire at all. In the latter case, it might make sense to adopt a maximum local cache size in megabytes or gigabytes as a per-peer configuration parameter, and use an LRU strategy to keep the peer's local tile cache below that size.
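A minimal sketch of such a byte-budgeted LRU store (the class name and its 2 GB default are illustrative, and expiration-time checks are omitted):

    from collections import OrderedDict

    class LocalTileCache:
        """Per-peer tile store that evicts least-recently-used tiles once a
        configured byte budget is exceeded."""

        def __init__(self, max_bytes=2 * 1024 ** 3):   # e.g. a 2 GB budget
            self.max_bytes = max_bytes
            self.bytes_used = 0
            self._tiles = OrderedDict()                # key -> tile bytes, oldest first

        def get(self, key):
            tile = self._tiles.get(key)
            if tile is not None:
                self._tiles.move_to_end(key)           # mark as recently used
            return tile

        def put(self, key, tile):
            if key in self._tiles:
                self.bytes_used -= len(self._tiles.pop(key))
            self._tiles[key] = tile
            self.bytes_used += len(tile)
            while self.bytes_used > self.max_bytes:    # evict until under budget
                _, evicted = self._tiles.popitem(last=False)
                self.bytes_used -= len(evicted)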

Pros and Cons of Key Clustering

One advantage of using geographic coordinates as the basis for cache keys is that the tiles for a given region would tend to be stored in a fairly constrained cluster of peers within the network. If we suppose that many, if not most, of the use cases for WMS would involve a sequence of requests for spatially contiguous tiles, then this clustering property would in turn tend to lower the cache response latency (i.e. the time between asking the distributed cache for a tile and actually receiving it) for all but the very first request for that region from any given peer. Since many of these use cases would be in real time -- e.g. via WorldWind -- it is imperative that cache response latency be as low as possible.

The downside of this clustering property is that hot spots might form in parts of the network that correspond to transient spikes in the popularity of certain areas, as might happen during an ongoing newsworthy event. Hopefully, the cache's replication regime can be augmented with some heuristics or strategies for handling this pathological case, but this is an issue demanding careful evaluation in the near future.

Other Theoretical Considerations

One feature of some DHTs touted by Kademlia (and Chord) is unidirectionality. We seem to have lost it with this design. What implications does that have for the effectiveness of tile replication?

Are the properties of this network provable? Does it matter? Help.

Yet another consequence of redefining the hash key as a coordinate tuple is that the number of k-buckets is no longer an intrinsic property of the network, and must be set a priori. Very probably an optimal number of k-buckets per peer could be estimated from the total number of tiles in the layer. As an aside, Mikel Maron has suggested the following formula to estimate the maximum number of tiles in a layer, across all scale levels, given the layer's resolution:

N = 2z(180/r)^2

... where r is the number of degrees per tile at max resolution, and z is the asymptote of the geometric series over the ratio between scale levels. In the most common case, each scale level would be half that of the next higher scale level, and hence would have one fourth as many tiles, yielding a value for z = 1 / (1 - 1/4) = 4/3.

Taking 15m pan-sharpened Landsat-7 composites as an example, at a tile size of 512 x 512 pixels, each tile would be about 7,680 meters on a side, or about 0.0625 degrees across. Plugging in the other values, we get a maximum of 22,118,400 tiles in the layer. This demands a much smaller key space than SHA-1, to be sure! The log to base 2 of 22,118,400 is about 24. Does this suggest a possible maximum useful number of k-buckets (in contrast to the value of 160 used by Kademlia)?
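Checking those numbers against the formula above (a quick arithmetic sanity check, nothing more):

    from math import log2

    r = 0.0625            # degrees per tile at maximum resolution
    z = 1 / (1 - 1 / 4)   # asymptote of the geometric series for 2:1 scale steps
    N = 2 * z * (180 / r) ** 2

    print(round(N))       # 22118400 tiles across all scale levels
    print(log2(N))        # ~24.4, suggesting roughly 24 useful k-buckets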

Other Practical Considerations

Is the network latency going to be too high to be practical? This seems to be the experience of the World Wind community in attempting to use Coral Cache to serve World Wind tiles.

A further constraint on our cache might be that the tile size in pixels (as well as the image compression settings) should be chosen to keep the maximum tile size in bytes below about 60 KB, since the maximum packet size of the UDP transport underlying all Kademlia implementations is only 64 KB.
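A back-of-the-envelope check of that constraint (assuming 512 x 512 RGB tiles; the numbers are only illustrative):

    TILE_PIXELS = 512 * 512
    RAW_BYTES = TILE_PIXELS * 3      # 786,432 bytes for an uncompressed RGB tile
    MAX_PAYLOAD = 60 * 1024          # stay comfortably under the 64 KB UDP ceiling

    # Roughly a 13:1 compression ratio is needed (786432 / 61440 ~= 12.8),
    # which is plausible for JPEG at moderate quality but not for lossless
    # formats over photographic imagery.
    print(RAW_BYTES / MAX_PAYLOAD)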

How to implement a first cut? Prototype in Python perhaps? Use the original Khashmir, which has gone unmaintained for two years? Use the new version in BitTorrent 4.1 and up? Write our own?

We should keep in mind that C# would make a fantastic target language, as then it might be easier to piggy-back on the WorldWind platform. IronPython might offer a migration path for a Python prototype.

Finally, this distributed caching WMS proxy network needs a catchy name.

We've designed a Distributed Tile Caching Model that discards Kademlia, retains the circular hash concept, and introduces a directory service.

Contributors

This idea was originally put forward (http://mappinghacks.com/projects/distributed-wms-cache.txt) with input from Chris Holmes, Mikel Maron, Jo Walsh, Norm Vine, and others.