Patterns for using Distributed Hash Tables: Conclusion

Well, it looks like I have finally said all I wanted to say about DHTs. I can now go back to talking about multi tenancy :-)

The previous ones are:

    We have gone over a lot of options for how to use a DHT / Distributed Cache. Some of them are for intellectual curiosity, some of them are of practical use. I urge you to remember that a DHT is not an RDBMS, and that both the access patterns and usage are different. Trying to force one into the other can be painful.

    In the first post in the series, I presented a list of common operations that all DHTs support:

    • PUT key, data, expiration
      Will fail if item is already in cache
    • GET key
      Will return null if item is not in cache or if expired
    • DEL key
      Delete the key from the cache
    • UPDATE key, data, version
      Update the item if the version matches
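To make the semantics of these four operations concrete, here is a minimal single-process sketch in Python. It is not a real DHT client: `InMemoryCache`, `get_versioned` and the injectable `clock` are names I made up for illustration, and I am assuming that versions start at 1 and that a failed operation returns `False` rather than throwing.

```python
import time


class InMemoryCache:
    """Single-node stand-in for a DHT node (illustrative, not a real client)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (data, version, expires_at or None)

    def _live(self, key):
        """Return the stored entry, or None if it is missing or expired."""
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at = entry[2]
        if expires_at is not None and self._clock() >= expires_at:
            del self._items[key]  # lazily evict expired entries
            return None
        return entry

    def put(self, key, data, expiration=None):
        """PUT: fails (returns False) if the item is already in the cache."""
        if self._live(key) is not None:
            return False
        expires_at = None if expiration is None else self._clock() + expiration
        self._items[key] = (data, 1, expires_at)  # assumption: versions start at 1
        return True

    def get(self, key):
        """GET: returns None if the item is not in the cache or has expired."""
        entry = self._live(key)
        return None if entry is None else entry[0]

    def get_versioned(self, key):
        """Hypothetical helper: returns (data, version) so callers can UPDATE."""
        entry = self._live(key)
        return None if entry is None else (entry[0], entry[1])

    def delete(self, key):
        """DEL: removes the key; returns whether a live item was removed."""
        if self._live(key) is None:
            return False
        del self._items[key]
        return True

    def update(self, key, data, version):
        """UPDATE: replaces the data only if the caller's version matches."""
        entry = self._live(key)
        if entry is None or entry[1] != version:
            return False
        self._items[key] = (data, version + 1, entry[2])
        return True
```

The version check in `update` is what gives you optimistic concurrency: two clients that read the same version cannot both succeed in writing.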

    There are a few things that I would consider important as well:

    • Batching support - the ability to store several items on a node in a single call, as well as to get several items from a node in a single call. The latter is usually supported, but the former is generally not.
    • Operations such as add_to_list, remove_from_list can make some operations much simpler if they are implemented by the cache rather than built on top of it.
    • Automatic recognition of common conventions:
      • "{Customer#1}_Orders", which can be translated by the client automatically to the item group name. If we enforce locality on the groups, we can even have the server resolve that for us, without having to pay the cost for the extra call. But this can be implemented on both server and client.
      • Automatic recognition of locality, so for the purpose of node matching, we will consider only the parts before the colon in a key. This way "foo:1" and "foo:2" will end up on the same node.
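As a rough illustration of the locality conventions and of client-side batching, the Python sketch below derives a routing key from an item key and groups keys per node so each node can be hit with one call. The function names are hypothetical, and I am assuming that the group name is the text inside the braces and that nodes are picked with a simple CRC32 modulo the node count.

```python
import re
import zlib
from collections import defaultdict

_GROUP = re.compile(r"^\{([^}]+)\}")


def routing_key(key):
    """Return the part of the key that decides which node it lives on."""
    match = _GROUP.match(key)
    if match:
        return match.group(1)      # "{Customer#1}_Orders" -> "Customer#1"
    return key.split(":", 1)[0]    # "foo:1" -> "foo"


def node_for(key, node_count):
    """Pick a node index from the routing key only, not from the full key."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(routing_key(key).encode("utf-8")) % node_count


def batch_by_node(keys, node_count):
    """Group keys by target node so each node needs a single round trip."""
    batches = defaultdict(list)
    for key in keys:
        batches[node_for(key, node_count)].append(key)
    return dict(batches)
```

Because only the routing key is hashed, "foo:1" and "foo:2" always land in the same batch, which is exactly what makes a multi-get or multi-put against one node possible.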

    Indexed properties and indexed ranges should be integrated at the client level, so your API will look something like:

    Put(item, x => x.Name, x => x.Age)
    Get<Foo>( x => x.Name, "bar")
    GetRange<Foo>( x => x.Name, "foo", "tada" )

    Those are just rough ideas, but they should tell you how to deal with this.
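One way such a client-level API could be built is sketched below in Python rather than C#. Everything here is an assumption for illustration: the `IndexedClient` wrapper, the explicit item key passed to `put`, property names given as strings instead of lambdas, and an index stored as one sorted list per property inside the cache itself. A plain dict stands in for the DHT client.

```python
import bisect
from dataclasses import dataclass


@dataclass
class Foo:
    name: str
    age: int


class IndexedClient:
    """Hypothetical wrapper that keeps property indexes in the cache."""

    def __init__(self, cache):
        self._cache = cache  # dict-like stand-in for the real DHT client

    @staticmethod
    def _index_key(item_type, prop):
        return f"index:{item_type.__name__}:{prop}"

    def put(self, key, item, *props):
        """Store the item and add it to the index of each named property."""
        self._cache[key] = item
        for prop in props:
            index_key = self._index_key(type(item), prop)
            index = list(self._cache.get(index_key, []))
            entry = (getattr(item, prop), key)
            pos = bisect.bisect_left(index, entry)
            if pos == len(index) or index[pos] != entry:
                index.insert(pos, entry)  # keep (value, key) pairs sorted
            self._cache[index_key] = index

    def get_range(self, item_type, prop, low, high):
        """Return items whose property lies in the inclusive range [low, high]."""
        index = self._cache.get(self._index_key(item_type, prop), [])
        # (low,) sorts before every (low, key) pair, so this finds the first match.
        start = bisect.bisect_left(index, (low,))
        items = []
        for value, key in index[start:]:
            if value > high:
                break
            items.append(self._cache[key])
        return items

    def get(self, item_type, prop, value):
        """Return all items whose property equals the given value."""
        return self.get_range(item_type, prop, value, value)
```

Usage would look like `client.put("foo/1", Foo("bar", 30), "name", "age")` followed by `client.get_range(Foo, "name", "foo", "tada")`. The sketch does not remove stale index entries when an indexed property changes, and it is not safe under concurrent writers; against a real cache the index write would need to go through the versioned UPDATE operation.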

    Building a DHT is a really simple task, assuming that you go with the memcached model of double hashing, so adding support for such things isn't particularly hard. But do consider whether this is the right solution in those scenarios.
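To show how little is involved, here is a minimal Python sketch of that model as I read it: the client hashes the key once to choose a node, and the node hashes it again into its own local table. The `Node` and `Client` classes are illustrative, the nodes live in-process instead of behind a network call, and CRC32 modulo the node count is my assumption for the first hash.

```python
import zlib


class Node:
    """One cache server: a plain hash table (the second hash)."""

    def __init__(self, name):
        self.name = name
        self.table = {}


class Client:
    """Chooses a node per key; the nodes know nothing about each other."""

    def __init__(self, nodes):
        self._nodes = list(nodes)

    def _node_for(self, key):
        # First hash: computed on the client to choose a server.
        index = zlib.crc32(key.encode("utf-8")) % len(self._nodes)
        return self._nodes[index]

    def put(self, key, data):
        self._node_for(key).table[key] = data

    def get(self, key):
        return self._node_for(key).table.get(key)
```

Note that with a plain modulo, adding or removing a node remaps most keys to a different node; consistent hashing is the usual way to soften that.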

    Done, at last. Hope you liked the series.

    More posts in "Patterns for using Distributed Hash Tables" series:

    1. (09 Aug 2008) Conclusion
    2. (20 Jul 2008) Locking
    3. (20 Jul 2008) Groups