Avoid a standalone DbService process

time to read 3 min | 520 words

imageThe trigger for this post is the following question in the RavenDB mailing list. Basically, given a system that is composed of multiple services (running as separate processes), the question is whatever have each service use its own DocumentStore or have a separate service (DbService) process that will encapsulate all access to RavenDB. The idea, as I understand it, is to avoid the DocumentStore creation because it is expensive.

The quick answer here is simple: <blink*>Don’t ever do that!</blink>

* Yes, I’m old.

That is all, you don’t need to read the rest of this post.

Oh, you are still here, as long as you are here, let me explain my reasoning for such a reaction.

DocumentStore isn’t actually expensive to create. In fact, for most purposes, it is actually quite cheap. It holds no network resources on its own (connection pooling is handled by a global pool, anyway). All it does is manage the http cache on the client, cache things like serialization information, etc.

The reason we recommend that you won’t create document stores all the time is that we saw people creating a document store for the purpose of using a single session and then disposing it. That is quite wasteful, it forces us to allocate more memory and avoid the use of caching entirely. But creating a few document stores for each service that you have? That is cheap to do.

What really triggered this post is the idea of having a separate process just to host the DocumentStore, the DbService process. This is a bad idea. Let me count the ways.

Your service process needs some data, so it will go to the DbService (over HTTP, probably) and ask for it. Your DbService will then call to RavenDB to get the data using the normal session and return the data to the original service. That service will process the data, maybe mutate it and save it back. It will have to do that by sending the data back to the DbService process, which will create a new session and save it to RavenDB.

This is adding another round trip to every database query, it means that you can’t natively express queries inside your service (since you need to send it to the DbService). It creates strong ties between all the services you have the the DbService, as well as a single point of failure. Even if you have multiple copies of DbService, you now need to write the code to do automatic failover between them. Updating a field in a class for one service means that you have to deploy the DbService to recognize the new field, for example.

In terms of client code, aside from having to write awkward queries, you also need to deal with serialization costs, and you have to write your own logic for change tracking, unit of work, etc.

In other words, this has all the disadvantages of a repository pattern with the added benefit of making many remote calls and seriously complicating deployment.