We didn’t plan to do a lot of changes on the client side for RavenDB 4.0. We want to do most changes on the server side, and just adapt the client when we are doing something differently.
However, we run into an interesting problem. A while ago we asked Idan, a new guy at the company, to write a Python client for RavenDB as part of the onboarding process. Unlike with the JVM client, we didn’t want to do an exact duplication, with just the changes to match the new platform. We basically told him, go ahead and do this. And he went, and he did. But along the way he got to choose his own approach for the implementation, and he didn’t copy the same internal architecture. The end result is that the Python client is significantly simpler than the C# one.
That has a few reasons. To start with, the Python client doesn’t need to implement features such as async or Linq. But for the most part, it is because Idan was able to look at the entire system, grasp it, and then just implement the whole thing in one go.
The RavenDB network layer on the client side has gone through many changes over the years. In particular, we have the following concerns handled in this layer:
- Actual sending of requests to the server.
- High availability, Failover, load balancing and SLA matching.
- Authentication of the requests
- Caching server responses.
I’m probably ignoring a few that snuck in there, but those should be the main ones. The primary responsibility of the network layer in RavenDB is to send requests to the server (with all the concerns mentioned above) and give us the JSON object back.
The problem is that right now we have those responsibilities scattered all over the place. Each was added at a different time, and using several different ways, and the handling is done in very different locations in the code. This leads to complexity, and seeing everything in one place in the Python client is a great motivation to simplify things. So we are going to do that.
Having a single location where all of those concerned will make things simpler for us. But we can do better. Instead of just returning a JSON object, we can solve a few additional issues. Performance, stability and ease of deployment.
We can do that by removing the the internal dependency on JSON.Net. A while ago we got tired from conflicting JSON.Net versions, and we decided to just internalize it in our code. That led to much simpler deployment life, but does add some level of complexity if you want to customize your entities over the wire and in RavenDB (because you have to use our own copy of JSON.Net for that). And it complicates getting new JSON.Net updates, which we want.
So we are going to do something quite interesting here. We are going to drop the dependency on JSON.Net entirely. We already have a JSON parser, and one that is extremely efficient, using the blittable format. More importantly, it can writes directly to native code. That give us a very important property, it make it very easy to store the data, already parsed, with a well known size. Let me try to explain my reasoning.
A very important feature of RavenDB is the idea that it can cache requests for you. When you make a request to the server, the server returns the reply as well as an etag. Next time you make this request, you’ll use the same etag, and if the response hasn’t changed, the server can just tell you to use the value in the cache. Right now we are storing the full string of the request in the cache. That lead to a few issues, in particular, while we saved on the I/O of sending the request, we still need to parse the JSON, and we need to keep relatively large (sometimes very large) .NET strings around for a long time.
But if we use blittable data in the client cache, then we have no need to do additional parsing, and we don’t need to involve the GC on the client side to collect things, we can manage that explicitly and directly. So the 2nd time you make a request, we’ll hit the server, learn that the value is still relevant, and then just use it.
That is great for us. Except for one thing. JSON.Net is doing really great job in translating between JSON and .NET objects, and that isn’t really something that we want to do. Instead, we are going to handle blittable objects throughout the RavenDB codebase. We don’t actually need to deal with the user .NET’s types until the very top layers of the system. And we can handle that by having a convention, that will call into JSON.Net (the only place that will happen) and translate .NET objects to JSON and back. The nice thing about it is that since this is just a single location where this happens, we can do that dynamically, without having a hard dependency on JSON.Net version.
That, in turn, also expose the ability to do JSON <—> .Net objects using other parsers, such as Jil, for example, or whatever the end user decides.
More posts in "The design of RavenDB 4.0" series:
- (26 May 2016) The client side
- (24 May 2016) Replication from server side
- (20 May 2016) Getting RavenDB running on Linux
- (18 May 2016) The cost of Load Document in indexing
- (16 May 2016) You can’t see the map/reduce from all the trees
- (12 May 2016) Separation of indexes and documents
- (10 May 2016) Voron has a one track mind
- (05 May 2016) Physically segregating collections
- (03 May 2016) Making Lucene reliable
- (28 Apr 2016) The implications of the blittable format
- (26 Apr 2016) Voron takes flight
- (22 Apr 2016) Over the wire protocol
- (20 Apr 2016) We already got stuff out there
- (18 Apr 2016) The general idea