The Guts n’ Glory of Database Internals: The communication protocol


With all the focus on writing to disk and ensuring the high speed of the database, I almost missed a very important aspect of databases: how do we actually communicate with the outside world?

In many cases, there is no need for that. Many databases either run embedded only (LevelDB, LMDB, etc.) or pretty much assume that this is someone else’s problem. But as it turns out, if you aren’t an embedded database, there is quite a bit that you need to get right in the network code. For the rest of this post, I’m going to assume that we are talking about a database that clients connect to over the network, either locally (shared memory / pipes / UNIX sockets / TCP) or remotely (pretty much only TCP at this point).

The actual protocol format doesn’t really matter. I’m going to assume that there is some sort of streaming connection (think TCP socket) and leave it at that. But the details of how the database communicates are quite important.

In particular, what kind of communication do we have between client and server? Is each request independent of the others, are there conversations, what kind of guarantees do we give for the lifetime of the connection, and so on?

Relational databases use a session protocol, in which the client has a conversation with the server, composed of multiple separate requests / replies. Sometimes you can tell the server “just process the requests as fast as you can, and send the replies as well, but don’t stop to wait between them”, which helps, but this is probably the worst possible way to handle this. The problem is that you need a connection per client, and those conversations can be pretty long, so you hold on to the connection for a fairly long duration, and you have to wait for the next part of the conversation to know what to do next.
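To make that concrete, here is a minimal sketch of what a session-style conversation looks like from the client side. The newline-delimited framing here is made up for illustration, it isn’t any real database’s wire protocol:

```python
import socket

# A minimal sketch of a session-style protocol, assuming a hypothetical
# newline-delimited wire format. Each request blocks until its reply
# arrives, so the connection is held hostage for the whole conversation.
def run_conversation(host, port, commands):
    with socket.create_connection((host, port)) as conn:
        reader = conn.makefile("r")
        replies = []
        for command in commands:
            conn.sendall(command.encode() + b"\n")
            # Nothing else can use this connection while we wait here.
            replies.append(reader.readline().rstrip("\n"))
        return replies
```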

A better approach is to dictate that all operations are strictly one way, so once the server has finished processing a request, it can send the reply to the client and move on to process the next request (from the same client or another one) immediately. In this case, we have several options: we can pipeline multiple requests on the same channel (allowing out-of-order replies), or we can use the serial nature of TCP connections and have a pool of connections that we share among clients.
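Here is a rough sketch of the pipelined variant, again with a hypothetical line-oriented JSON framing. The important bit is that each request carries an id, so replies can come back in any order and still be matched to their requests:

```python
import json
import socket
import threading

# A sketch of pipelining with out-of-order replies over a single connection.
# The framing (one JSON object per line, each carrying an "id") is invented
# for illustration.
class PipelinedClient:
    def __init__(self, host, port):
        self._conn = socket.create_connection((host, port))
        self._reader = self._conn.makefile("r")
        self._pending = {}                 # request id -> reply slot
        self._lock = threading.Lock()
        self._next_id = 0
        threading.Thread(target=self._read_replies, daemon=True).start()

    def send(self, payload):
        # Fire off a request without waiting for earlier replies.
        with self._lock:
            self._next_id += 1
            request_id = self._next_id
            self._pending[request_id] = {"event": threading.Event(), "reply": None}
        message = json.dumps({"id": request_id, "body": payload})
        self._conn.sendall(message.encode() + b"\n")
        return request_id

    def wait_for(self, request_id):
        slot = self._pending[request_id]
        slot["event"].wait()
        return slot["reply"]

    def _read_replies(self):
        # Replies can arrive in any order; match them back by id.
        for line in self._reader:
            reply = json.loads(line)
            slot = self._pending[reply["id"]]
            slot["reply"] = reply["body"]
            slot["event"].set()
```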

An even better method is if you can combine multiple requests from the client into a single reply from the server. This can be useful if you know that the shape of all the commands is going to be the same (for example, if you are asked to insert a million records, you can process them as soon as possible, and only confirm that you accepted them every ten thousand records or so). It may sound silly, but those kinds of things really add up, because the client can just pour the data in and the server will process it as soon as it can.
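A sketch of what that batched acknowledgement might look like on the server side; the one-record-per-line framing and the storage API are made up for illustration:

```python
# Batched acknowledgements: process inserts as they arrive, but only send a
# confirmation every ACK_EVERY records, so the client can keep pouring data
# in without waiting for a reply per record.
ACK_EVERY = 10_000

def handle_bulk_insert(reader, writer, storage):
    accepted = 0
    for line in reader:                      # one record per line in this sketch
        storage.insert(line)
        accepted += 1
        if accepted % ACK_EVERY == 0:
            writer.write(f"ACK {accepted}\n".encode())
            writer.flush()
    # Final confirmation once the stream ends.
    writer.write(f"DONE {accepted}\n".encode())
    writer.flush()
```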

Other considerations that you have to take into account in the communication protocol are security and authentication. If you are allowing remote clients, then they need some way to authenticate. And that can be expensive (typically authentication requires multiple round trips between server & client to verify the credentials, and potentially an out-of-server check against something like an LDAP directory), so you want to authenticate once and then hold on to the same connection for as long as you can.
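In code, that usually means paying the authentication cost when the connection is created, and then pooling the authenticated connections. Here is a toy sketch with an invented AUTH handshake (again, not a real protocol):

```python
import queue
import socket

# A sketch of "authenticate once, reuse the connection": the expensive
# handshake happens when the pool is built, not on every request.
class AuthenticatedPool:
    def __init__(self, host, port, user, password, size=4):
        self._idle = queue.Queue()
        for _ in range(size):
            conn = socket.create_connection((host, port))
            # Pay the (expensive) authentication round trip exactly once here.
            conn.sendall(f"AUTH {user} {password}\n".encode())
            if conn.makefile("r").readline().strip() != "OK":
                raise PermissionError("authentication failed")
            self._idle.put(conn)

    def acquire(self):
        return self._idle.get()

    def release(self, conn):
        self._idle.put(conn)
```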

A more problematic issue comes up when talking about the content of your conversation with the database. You typically don’t want to send everything over plain text, so you need to encrypt the channel, which has its own costs. For fun, this means dealing with things like certificates, SSL and pretty complex stuff that very few people properly understand. If you can get away with letting something else do that for you, see if you can. Setting up a secured connection that cannot be easily bypassed (man in the middle, fake certs, just weak encryption, etc.) is non-trivial, and should be left to experts in the field.

Personally, I like to just go with HTTPS for pretty much all of that, because it takes a lot of the complexity away, and it means that all the right choices have probably already been made for me. HTTPS, of course, is request/reply, but things like connection pooling, keep-alive and pipelining can do quite a lot to mitigate that at the protocol level. And something like Web Sockets allows you to have most of the benefits of raw TCP connections, with a lot of the complexity handled for you.
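For example, using the requests library in Python, a Session object already gives you keep-alive and connection pooling over HTTPS; the URL and endpoint below are made up:

```python
import requests

# Leaning on HTTPS for the transport: a Session reuses the underlying TLS
# connections (keep-alive and pooling), so we don't pay the handshake cost
# on every call. The host and endpoint are hypothetical.
session = requests.Session()
session.verify = True          # validate the server certificate

def put_document(doc_id, body):
    response = session.put(f"https://db.example.com/docs/{doc_id}", json=body)
    response.raise_for_status()
    return response.json()
```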

More posts in "The Guts n’ Glory of Database Internals" series:

  1. (08 Aug 2016) Early lock release
  2. (05 Aug 2016) Merging transactions
  3. (03 Aug 2016) Log shipping and point in time recovery
  4. (02 Aug 2016) What goes inside the transaction journal
  5. (18 Jul 2016) What the disk can do for you
  6. (15 Jul 2016) The curse of old age…
  7. (14 Jul 2016) Backup, restore and the environment…
  8. (11 Jul 2016) The communication protocol
  9. (08 Jul 2016) The enemy of thy database is…
  10. (07 Jul 2016) Writing to a data file
  11. (06 Jul 2016) Getting durable, faster
  12. (01 Jul 2016) Durability in the real world
  13. (30 Jun 2016) Understanding durability with hard disks
  14. (29 Jun 2016) Managing concurrency
  15. (28 Jun 2016) Managing records
  16. (16 Jun 2016) Seeing the forest for the trees
  17. (14 Jun 2016) B+Tree
  18. (09 Jun 2016) The LSM option
  19. (08 Jun 2016) Searching information and file format
  20. (07 Jun 2016) Persisting information