Ayende @ Rahien



re: Entity Framework Core performance tuning–Part II

time to read 4 min | 701 words

After looking at this post detailing how to optimize data queries in EF Core, I obviously decided that I needed to test how RavenDB handles the same load.

To make things fair, I tested this on my laptop, running in battery mode. The size of the data isn’t that large, only 100,000 books and half a million reviews, so I decided to increase it by an order of magnitude.


The actual queries we make from the application are pretty simple and static. We can sort by votes / publication date / price (ascending / descending) and we can filter by number of votes and the publication year.


This means that we don’t have an explosion of querying options, which simplifies the kind of work we need to do. To keep things simple for myself, I kept the same model of books / authors / reviews as separate collections. This isn’t the best model for a document database, but it lets us compare apples to apples between the work the EF Core based solution and the RavenDB solution need to do.

A major cost in Jon’s solution is the need to aggregate the reviews for a book (so the average rating can be computed). In the end, the only way to get the required performance was to manually calculate the average rating for each book and store the computed value on the book itself. We’ll discuss this a bit more in a few minutes; for now, I want to turn our eyes toward the simplest possible query on this page, getting 100 books sorted by the book id.

Because we aren’t running on the same machine, it is hard to make direct parallels, but on Jon’s machine he got 80 ms for this kind of query on 100,000 books. When increasing the data to half a million books, the query time rose to 150 ms. Running the same query against RavenDB gives us the results instantly (zero ms). Querying and sorting by the title, for example, gives us the results in 19 ms for a page size of 100 books.

Now, let us look at the major complexity for this system, sorting and filtering by the number of votes in the system. This is hard because the reviews are stored separately from the books. With EF Core, there is the need to join between the tables, which is quite expensive and eventually led Jon to take upon himself the task of manually maintaining the values. With RavenDB, we can use a map/reduce index to handle this all for us. More specifically, we are going to use a multi map/reduce index.

Here is what the index definition looks like:

[screenshot: the index definition]

We map the results from both the Books and the BookReviews into the same shape, and then reduce them together into the final output, which contains the relevant aggregation.
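
To give a concrete sense of the shape of such an index, here is a minimal sketch using the RavenDB client API. The Book and BookReview classes, their field names and the index name are my assumptions for illustration, not the exact definition shown in the screenshot.

using System;
using System.Linq;
using Raven.Client.Documents.Indexes;

public class Book
{
    public string Id { get; set; }
    public string Title { get; set; }
    public int PublishedYear { get; set; }
}

public class BookReview
{
    public string Id { get; set; }
    public string BookId { get; set; }
    public int NumStars { get; set; }
}

// Multi map/reduce: both collections are mapped into the same shape,
// then reduced into a single entry per book with the aggregated votes.
public class Books_ByVotes : AbstractMultiMapIndexCreationTask<Books_ByVotes.Result>
{
    public class Result
    {
        public string BookId { get; set; }
        public int PublishedYear { get; set; }
        public int NumVotes { get; set; }
        public double AvgVotes { get; set; }
    }

    public Books_ByVotes()
    {
        // Books contribute their metadata, but no votes.
        AddMap<Book>(books =>
            from b in books
            select new { BookId = b.Id, b.PublishedYear, NumVotes = 0, AvgVotes = 0.0 });

        // Each review contributes a single vote.
        AddMap<BookReview>(reviews =>
            from r in reviews
            select new { r.BookId, PublishedYear = 0, NumVotes = 1, AvgVotes = (double)r.NumStars });

        // Reduce both maps together into the final output.
        Reduce = results =>
            from r in results
            group r by r.BookId into g
            select new
            {
                BookId = g.Key,
                PublishedYear = g.Max(x => x.PublishedYear),
                NumVotes = g.Sum(x => x.NumVotes),
                AvgVotes = g.Sum(x => x.NumVotes) == 0
                    ? 0
                    : g.Sum(x => x.AvgVotes * x.NumVotes) / g.Sum(x => x.NumVotes)
            };
    }
}

Deploying it would be a single call, along the lines of new Books_ByVotes().Execute(store).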

Now, let us do some queries, shall we? Here we are querying over the entire dataset (an order of magnitude larger than the EF Core sample set), filtering by the publication date and ordering by the computed votes average. We get the first 100 items, and you can see that we got over 289,753 total results:

[screenshot: the query and its results]

One very interesting feature of this query is that we are asking to include the book documents for the results. This is handled after the query (so there is no need to join against the entire 289K+ results), and we are able to get everything we want in a very simple fashion.
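
As a sketch of what that query could look like from the client side (again using the hypothetical index and field names from above, and an already initialized document store):

using System;
using System.Linq;
using Raven.Client.Documents;
using Raven.Client.Documents.Linq;

public static class QuerySketch
{
    public static void TopBooks(IDocumentStore store)
    {
        using (var session = store.OpenSession())
        {
            var page = session.Query<Books_ByVotes.Result, Books_ByVotes>()
                .Include(r => r.BookId)                 // fetch the Book documents in the same round trip
                .Where(r => r.PublishedYear >= 2000)    // hypothetical filter on the publication year
                .OrderByDescending(r => r.AvgVotes)     // order by the computed votes average
                .Take(100)
                .ToList();

            foreach (var item in page)
            {
                // Already in the session thanks to the include: no extra server call.
                var book = session.Load<Book>(item.BookId);
                Console.WriteLine($"{book.Title}: {item.AvgVotes:F1} ({item.NumVotes} votes)");
            }
        }
    }
}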

Oh, and the total time? 17 ms, compared to the 80 ms result for EF Core with 1/10 of the data size. That is pretty nice (and yes, different machines, hard to compare, etc.).

I’ll probably have another post on this topic, showing off some of the cool things that you can do with RavenDB and queries.

RavenDB 4.0 Unsung Heroes: The indexing threads

time to read 3 min | 473 words

A major goal in RavenDB 4.0 is to eliminate as much complexity as possible from the codebase. One of the ways we did that is to simplify thread management. In RavenDB 3.0 we used the .NET thread pool and in RavenDB 3.5 we implemented our own thread pool to optimize indexing based on our understanding of how indexes are used. This works, is quite fast and handles things nicely, as long as everything works. When things stop working, we get into a whole different story.

A slow index can impact the entire system, for example, so we had to write code to handle that. Noisy indexing neighbors can impact overall indexing performance, and tracking costs when the indexing work is interleaved is anything but trivial. And all the indexing code must be thread safe, of course.

Because of that, we decided we were going to dramatically simplify our lives. An index is going to use a single dedicated thread, always. That means that each index gets its own thread and can only interfere with its own work. It also means that we can have much better tracking of what is going on in the system. Here are some stats from the live system.

[screenshot: per-index thread statistics]

And here is another:

[screenshot: more per-index thread statistics]

What this means is that we have a fantastically detailed view of what each index is doing, in terms of CPU, memory and even I/O utilization. We can also now define fine grained priorities for each index:

[screenshot: setting the index priority]

The indexing code itself can now assume that it is single threaded, which removes a lot of complications and in general makes things easier to follow.
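
To illustrate the general shape of the approach, here is a toy sketch of a dedicated indexing thread per index. This is not RavenDB’s actual indexing code; the class and names are invented for illustration.

using System;
using System.Collections.Concurrent;
using System.Threading;

public class IndexWorker : IDisposable
{
    private readonly BlockingCollection<string> _docsToIndex = new BlockingCollection<string>();
    private readonly Thread _thread;

    public IndexWorker(string indexName, ThreadPriority priority)
    {
        _thread = new Thread(Run)
        {
            Name = "Indexing of " + indexName, // each index shows up as its own named thread
            IsBackground = true,
            Priority = priority                // fine grained priority per index
        };
        _thread.Start();
    }

    public void Enqueue(string docId) => _docsToIndex.Add(docId);

    private void Run()
    {
        // Single threaded by construction: no locks are needed around this index's own state,
        // and a slow index only ever slows itself down.
        foreach (var docId in _docsToIndex.GetConsumingEnumerable())
        {
            // ... index the document ...
        }
    }

    public void Dispose() => _docsToIndex.CompleteAdding();
}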

There is the worry that a user might want to run 100 indexes per database and 100 databases on the same server, resulting in 10,000 indexing threads. But given that this is not a recommended configuration, and given that we tested it and it works (not ideal and not fun, but it works), I’m fine with this, especially given the alternative that we have today: all these indexes fighting over the same limited number of threads and stalling indexing globally.

The end result is that a thread per index allows us to have fine grained control over the indexing priorities, account for memory and CPU costs, as well as simplify the code and improve the overall performance significantly. A win all around, in my book.

0.1x or 10x, time matters

time to read 3 min | 513 words

There is a lot of chatter in the industry about the notion of 10x programmers. People who can routinely be an order of magnitude faster than mere mortals. Okay, that was a bit snarky, I’ll admit.

I have had the pleasure to interact with a lot of developers, from people whose conversation I could barely follow (way above my level) and whose code I mined for insight and ideas, to the classic outsourcing developer who set up a conference call for assistance in writing “Hello World” to the console. I think that I have enough experience at this point to comment on the nature of developer productivity. More to the point, I know of quite a lot of ways to destroy it.

The whole 10x developer mentality assumes that a single (or very few) developers are actually able to make a major difference, and that is usually not the case. Let me try to explain why, and note that I assume a perfect world in which there is no need to burn this 10x dev with all-nighters and hero mode.

The problem is what we mean when we talk about a major difference. Usually, as developers, we talk about making major technical changes. Let us consider the Windows Kernel Dispatcher Lock removal. That was 8 years ago and it is still something that pops to my mind when I consider big changes in the guts of software. This is something that is clearly beneficial, was quite complex to get right and required a lot of work. No idea if the people working on it were “10x”, but I assume that the kernel team at Microsoft wasn’t pulled from the lowest bidder by Shady Outsourcing R Us.

What real difference did it make for Windows? Well, it became faster, which is great. But I think it is fair to say that most people never heard about it, and of those who did, fewer cared.

The things that really matter for a product are a solid technical base, and then all the rest of the stuff. This can be the user interface, the documentation, the getting started guide and even the “yes dear” installer. It is the whole experience that matters, and you’ll not typically find a single person who can do all of that significantly better than others.

One of the guys in the office is currently spending much more time writing the documentation and walkthrough for a feature than the time it took to actually write it. The problem with developers is that we tend to live in our own world and consider everything else that isn’t technical secondary.


But as good as the software is, the actual release to customers requires a lot more work that isn’t even remotely technical, be it marketing materials, working with partners or just making sure that the purchase workflow actually works.

Writing SSL Proxy: Part II, delegating authentication

time to read 4 min | 681 words

I mentioned that RavenDB 4.0 uses x509 client certificates for authentication, right? As it turns out, this can create some issues for us when we need to do more than just blind routing to the right location.

Imagine that our proxy is set up in front of the RavenDB Docker Swarm not just for handling routing but also to apply some sort of business logic. It can be that you want to do billing on a per client basis based on their usage, or maybe even inspect the incoming data into RavenDB to protect against XSS (don’t ask, please). But those are strange requirements. Let us go with something that you can probably empathize with more easily.

We want to have a RavenDB cluster that uses a Let’s Encrypt certificate, but that certificate has a very short lifetime, typically around 3 months. You probably don’t want to set up these certificates within RavenDB itself, because you’ll be replacing them all the time. So we want to write a proxy that would handle the entire process of fetching, renewing and managing Let’s Encrypt certificates for our database, while the certificates that the RavenDB cluster itself uses are internal ones, with much longer expiration times.

So far, so good. Except…

The problem that we have here is this: previously, we used the SNI extension in our proxy to know where we are going to route the connection, but now we have different certificates for the proxy and for the RavenDB server. This means that if we try to just pass the connection through to the RavenDB node, the client will detect that it isn’t using a trusted certificate and fail the request. On the other hand, if we terminate the SSL connection at the proxy, we have another issue: we use x509 client certificates for ensuring that the user actually has the access they desire, and we can’t just pass the client certificate forward, since we terminated the SSL connection.

Luckily, we don’t have to deal with a true man in the middle simulation here, because we can configure the RavenDB server to trust the proxy. All we are left with is figuring out how the proxy can tell the RavenDB server which client certificate the proxy authenticated. A common way to do that is to send the client certificate details over in a header, and that would work, but…

Sending the certificate details in a header has two issues for us. First, it would mean that we need to actually parse and mutate the incoming data stream. Not that big a deal, but it is something that I would like to avoid if possible. Second, and more crucial for us, we don’t want to have to validate the certificate on each and every request. What we want to do is take advantage of the fact that connections are reused and do all the authentication checks once, when the client connects to the server. Authentication doesn’t cost too much, but when you are aiming at tens of thousands of requests a second, you want to reduce costs as much as possible.

So we have two problems, but we can solve them together. Given that the RavenDB server can be configured to trust the proxy, we are going to do the following. Terminate the SSL connection at the proxy and validate the client certificate (just validate the certificate, not check permissions or anything like that), and then the magic happens. The proxy will generate a new certificate, signed with the proxy’s own key, registering the original client certificate thumbprint in the new client certificate (and caching that certificate, obviously). Then the proxy routes the request to its destination, authenticating with its own client certificate. The RavenDB server will recognize that this is a proxied certificate, pull the original certificate thumbprint from the proxied client certificate and use that to determine the permissions to assign to the user.
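
Here is a rough sketch of how a proxy could mint such a proxied certificate with .NET’s CertificateRequest API. This is my guess at the mechanics, not the actual proxy code: the OID, subject name and validity window are made up, and the proxy’s signing certificate is assumed to be CA-capable and to carry its private key.

using System;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;
using System.Text;

public static class ProxiedCertificates
{
    // Hypothetical private OID used to carry the original client certificate thumbprint.
    private const string OriginalThumbprintOid = "1.3.6.1.4.1.99999.1";

    public static X509Certificate2 CreateProxiedCertificate(
        X509Certificate2 originalClientCert,
        X509Certificate2 proxySigningCert)
    {
        // Key for the new, short lived proxied certificate.
        var key = RSA.Create(2048);

        var request = new CertificateRequest(
            "CN=proxied-client", key, HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);

        // Register the original client certificate thumbprint so the backend
        // can look up the real user's permissions.
        request.CertificateExtensions.Add(new X509Extension(
            new Oid(OriginalThumbprintOid),
            Encoding.ASCII.GetBytes(originalClientCert.Thumbprint),
            critical: false));

        var serial = new byte[8];
        RandomNumberGenerator.Fill(serial);

        // Sign with the proxy's own certificate, which the RavenDB server is configured to trust.
        var signed = request.Create(
            proxySigningCert,
            DateTimeOffset.UtcNow.AddMinutes(-5),
            DateTimeOffset.UtcNow.AddDays(7),
            serial);

        return signed.CopyWithPrivateKey(key);
    }
}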

The proxy can then manage things like refreshing the certificates from Let’s Encrypt, and RavenDB can handle the proxied requests.

Writing SSL Proxy: Part I, routing

time to read 4 min | 661 words

RavenDB 4.0 uses x509 client certificates for authentication. That is good, because it means that we get both encryption and authentication on both ends, but it does make it more complex to handle some deployment scenarios. It turns out that there is quite a big demand for doing things to the data that goes to and from RavenDB.

We’ll start with the simpler case of having a dynamic deployment on Docker, with nodes that may be moved from location to location. Instead of exposing the nastiness of the internal network to the outside world with URLs such as https://129-123-312-1.rvn-srv.local:59421, we want to have nice and clean URLs such as https://orders.rvn.cluster. The problem is that in order to do that, we need to put a proxy in place.

That is pretty easy when you deal with HTTP or plain TCP, but much harder when you deal with HTTPS and TLS, because you also need to handle the encrypted stream. We looked at various options, such as Nginx and Traefik, and took a peek at Squid, but we ruled them out for various reasons, mostly related to the deployment pattern (Nginx doesn’t handle dynamic routing), feature set (Traefik doesn’t handle client certificates properly) and use case (Squid seems to be much more focused on being a cache). None of them supported the networking model we want (1:1 connection matches from client to server, which we would really like to preserve because it simplifies authentication costs significantly).

So I set out to explore what it would take to build an SSL proxy to fit our needs. The first thing I looked at was how to handle routing. Given a user that types https://orders.rvn.cluster in the browser, how does this translate to actually hitting an internal Docker instance with a totally different port and host?

The answer, as it turned out, is that this is not a new problem. One of the ways to do it is to just intercept the traffic. We can do that because in this deployment model we control both the proxy and the server, so we can put the certificate for “orders.rvn.cluster” in the proxy, decrypt the traffic and then forward it to the right location. That works, but it means that we have a man in the middle. Is there another option?

As it turns out, this is such a common problem that there are multiple solutions for it. These are SNI (Server Name Indication) and ALPN (Application Layer Protocol Negotiation), both of which allow the client to specify what they want to get from the server as part of the initial (and unencrypted) negotiation. This is pretty sweet from the point of view of the proxy, because it can make routing decisions without needing to do the TLS negotiation, but not so much for the user if they are currently trying to check “super-shady.site”, since while the contents of their request are masked, the destination is not. I’m not sure how big of a security problem this is (the end IP isn’t encrypted, after all, and even if you host thousands of sites on the same server, it isn’t that big a deal to narrow it down).

Anyway, the key here is that this is possible, so let’s make this happen. The solution is almost literally pulled from the StreamExtended readme page.

We get a TCP stream from a client, and we peek into it to read the TLS header, at which point we can pull the server name out. At this point, you’ll note, we haven’t touched SSL and we can forward the stream toward its destination without needing to inspect any other content, just carrying the raw bytes.
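
A minimal sketch of that peek might look like this, parsing just enough of the ClientHello to find the server_name extension. It follows the TLS record layout but leaves out the bounds checking and partial-read handling that real code needs:

using System;
using System.Text;

public static class TlsPeek
{
    // buffer: the first bytes peeked from the client's TCP stream (the TLS ClientHello).
    // Returns the SNI host name, or null if it can't be found.
    public static string TryGetSniHostName(byte[] buffer)
    {
        // TLS record header: type(1) version(2) length(2); 0x16 = handshake.
        if (buffer.Length < 5 || buffer[0] != 0x16)
            return null;

        var pos = 5;

        // Handshake header: type(1) length(3); 0x01 = ClientHello.
        if (buffer[pos] != 0x01)
            return null;
        pos += 4;

        pos += 2 + 32;                                        // client version + random
        pos += 1 + buffer[pos];                               // session id
        pos += 2 + ((buffer[pos] << 8) | buffer[pos + 1]);    // cipher suites
        pos += 1 + buffer[pos];                               // compression methods

        var extensionsEnd = pos + 2 + ((buffer[pos] << 8) | buffer[pos + 1]);
        pos += 2;

        while (pos + 4 <= extensionsEnd && pos + 4 <= buffer.Length)
        {
            var extType = (buffer[pos] << 8) | buffer[pos + 1];
            var extLen = (buffer[pos + 2] << 8) | buffer[pos + 3];
            pos += 4;

            if (extType == 0x0000) // server_name extension
            {
                // list length(2), entry type(1) = host_name, name length(2), then the name itself.
                var nameLen = (buffer[pos + 3] << 8) | buffer[pos + 4];
                return Encoding.ASCII.GetString(buffer, pos + 5, nameLen);
            }

            pos += extLen;
        }

        return null;
    }
}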

This is great, because it means that things like client authentication can just work and authenticate against the final server without any complexity. But it can be a problem if we actually need to do something with the traffic. I’ll discuss how to handle this properly in the next post.

RavenDB 4.0 Unsung Heroes: Field compression

time to read 3 min | 498 words

I have been talking a lot about major features, making things visible and all sorts of really cool things. What I haven’t been talking about is a lot of the work that has gone into the backend, all the stuff that isn’t sexy and bright. You probably don’t really care how the piping system in your house works, at least until the toilet doesn’t flush. A lot of the work that we did with RavenDB 4.0 was to look at all the pain points that we have run into and try to resolve them. This series of posts is meant to expose some of these hidden features. If we did our job right, you will never even know that these features exist, they are that good.

In RavenDB 3.x we had a feature called Document Compression. This allowed a user to save a significant amount of space by having the documents stored in a compressed form on disk. If you had large documents, you could typically see significant space savings from enabling this feature. With RavenDB 4.0, we removed it completely. The reason is that we need to store documents in a way that allows us to load them and work with them in their raw form without any additional work. This is key for many optimizations that apply to RavenDB 4.0.

However, that doesn’t mean that we gave up on compression entirely. Instead of compressing the whole document, which would require us to decompress it any time that we wanted to do something with it, we selectively compress individual fields. Typically, large documents are large because they have either a few very large fields or a collection that contains many items. The blittable format used by RavenDB handles this in two ways. First, we don’t need to repeat field names every time; we store them once per document. Second, we can compress large field values on the fly.
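
As a toy illustration of the principle (this is not the blittable format itself, just the idea of compressing only individual large values):

using System.IO;
using System.IO.Compression;
using System.Text;

public static class FieldCompression
{
    private const int CompressionThreshold = 128; // hypothetical cutoff, in bytes

    public static byte[] StoreField(string value, out bool compressed)
    {
        var raw = Encoding.UTF8.GetBytes(value);
        if (raw.Length < CompressionThreshold)
        {
            compressed = false;
            return raw;            // small fields (titles, tags) stay directly readable
        }

        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionLevel.Fastest, leaveOpen: true))
                gzip.Write(raw, 0, raw.Length);

            compressed = true;     // only the large text field pays the decompression cost later
            return output.ToArray();
        }
    }
}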

Take this blog, for instance: a lot of the data inside it is actually stored in large text fields (blog posts, comments, etc.). That means that when stored in RavenDB 4.0, we can take advantage of the field compression and reduce the amount of space we use. At the same time, because we are only compressing selected fields, we can still work with the document natively. A trivial example would be to pull the recent blog post titles. We can fetch just these values (and since they are pretty small already, they wouldn’t be compressed) directly, and not have to touch the large text field that is the actual post contents.

Here is what this looks like in RavenDB 4.0 when I’m looking at the internal storage breakdown for all documents.

[screenshot: storage breakdown for all documents]

Even though I have been writing for over a decade, I don’t have enough posts yet to make a statistically meaningful difference; the total database sizes for both are 128MB.

Public Service Announcement: ConcurrentDictionary.Count is locking

time to read 2 min | 233 words

During a performance problem investigation, we discovered that the following innocent looking code was killing our performance.

This is part of the code that allows users to subscribe to changes in the database using a WebSocket. Subscriptions are pretty rare, so we check that there aren’t any and skip all the work.

We had a bunch of code that runs on many threads and ends up calling this method. Since there are no subscribers, this should be very cheap, but it wasn’t. The problem was that the call to Count was locking, and that created a convoy that killed our performance.
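
The code in question was roughly of this shape (the class and field names are my guesses), along with one way to sidestep the locking Count call:

using System.Collections.Concurrent;
using System.Threading;

public class ChangesNotifier
{
    private readonly ConcurrentDictionary<long, object> _subscribers =
        new ConcurrentDictionary<long, object>();
    private int _subscriberCount;

    public void Subscribe(long id, object connection)
    {
        if (_subscribers.TryAdd(id, connection))
            Interlocked.Increment(ref _subscriberCount);
    }

    public void NotifyDocumentChanged(string docId)
    {
        // The innocent looking version: Count takes all of the dictionary's internal
        // locks, so many threads calling this at once end up in a lock convoy.
        // if (_subscribers.Count == 0) return;

        // Tracking the count separately avoids touching the dictionary at all here.
        if (Volatile.Read(ref _subscriberCount) == 0)
            return;

        // ... build and send the notification to each subscriber ...
    }
}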

We did some trawling through our code base and through the framework code and came up with the following conclusions.

ConcurrentQueue:

  • Clear locks
  • CopyTo, GetEnumerator, ToArray, Count create a snapshot (consuming more memory)
  • TryPeek, IsEmpty are cheap

Here is a piece of problematic code; we are trying to keep the last ~25 items that we looked at:
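
The shape of that code was roughly this (a reconstruction with hypothetical names, not the original snippet):

using System.Collections.Concurrent;

public class RecentlySeen
{
    private readonly ConcurrentQueue<string> _recent = new ConcurrentQueue<string>();

    public void Add(string item)
    {
        _recent.Enqueue(item);

        // Count is one of the operations that takes a snapshot of the queue,
        // so calling it on every single add generates a lot of extra memory use under load.
        while (_recent.Count > 25)
            _recent.TryDequeue(out _);
    }
}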

The problem is that this kind of code ensures that there will be a lot of snapshots and increases memory utilization.

ConcurrentDictionary:

  • Count, Clear, IsEmpty, ToArray  - lock the entire thing
  • TryAdd, TryUpdate, TryRemove – lock the bucket for this entry
  • GetEnumerator does not lock
  • Keys, Values both lock the table and force the creation of a temporary collection

If you need to iterate over the keys of a concurrent dictionary, there are two options:
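
Roughly, the two options look like this (a reconstruction, not the original snippet):

using System;
using System.Collections.Concurrent;

public static class KeyIteration
{
    public static void Demo(ConcurrentDictionary<string, int> dict)
    {
        // Option 1: Keys locks the entire table and materializes a temporary collection.
        foreach (var key in dict.Keys)
            Console.WriteLine(key);

        // Option 2: enumerating the dictionary itself takes no locks; just ignore the values.
        foreach (var kvp in dict)
            Console.WriteLine(kvp.Key);
    }
}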

Iterating over the entire dictionary is much better than iterating over just the keys.

All I asked was Hello World

time to read 2 min | 262 words

When running a full cluster on a single machine, you end up with a lot of console windows running instances of RavenDB, and it can be a bit hard to identify which is which.

We had the same issue when we have multiple tabs open; it takes effort to map URLs to the specific node. We solved this particular problem by making sure that this is very explicit in the browser.

[screenshot: the node identification shown in the browser]

And that turned out to be a great feature.

So I created an issue to also print the node id in the console, so we can quickly match a console window to the node id that we want. The task is literally to write something like:

Console.WriteLine(server.Cluster.NodeId);

And the only reason I created an issue for that is that I didn’t want to be sidetracked by figuring out where to put this line.

Instead of a one line change, I got this:

[screenshot: the actual change]

Now, here is the difference between a drive-by line of code and an actual resolution. Here is what this looks like:

[screenshot: the console output]

And this handles nodes that haven’t been assigned an ID yet and colors the nodes differently based on their topology, so they can easily be told apart. It also makes the actual important information quite visible at a glance.
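
To give a flavor of the difference between the one liner and the fuller version, here is a toy sketch (the names, colors and wording are invented, not the actual RavenDB code):

using System;

public static class NodeBanner
{
    public static void Print(string nodeTag)
    {
        // Nodes that haven't joined a cluster yet don't have an ID assigned.
        var hasTag = string.IsNullOrEmpty(nodeTag) == false && nodeTag != "?";

        Console.Write("Node: ");
        Console.ForegroundColor = hasTag ? ConsoleColor.Green : ConsoleColor.DarkYellow;
        Console.Write(hasTag ? nodeTag : "(no id assigned yet)");
        Console.ResetColor();
        Console.WriteLine();
    }
}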

Practice makes perfect

time to read 3 min | 499 words

I ran into this on Twitter:

[screenshot: the tweet]

There were some suggestions there to go to meetups, find a mentor, etc. Those are important, but I consider them secondary to what you need to be a good developer.

My advice:

Write code, you'll likely write crap code, but write code, and a lot of it.

Read code, you'll not understand some, but try to.

The order matters.

The only way to be a good developer is to be a bad developer first. I have a drawer full of old hard disks that contain old code, some of it going back over 20 years. I still remember being incredibly proud of writing a full BBS system in VBScript & ASP (classic!) that didn’t use a database but rather manipulated the HTML files on disk directly, so you had what is effectively a static website that would modify itself. The impressive thing was that this was a single nested switch statement that went on for thousands of lines. I somehow managed to keep it all in my head enough to be able to actually complete the project.

It would never work in practice (I didn’t have any concept of “what happens if two requests happen at the same time”) and it was never deployed, but it was code that I wrote, and it taught me what works. More importantly, it taught me what doesn’t work. That meant reading errors, figuring out how to find faults in my program, getting used to the run <----> modify cycle, etc.

I wrote web systems, gesture recognition systems that would serve as hot keys in Windows, shell extensions and a lot of random stuff. Most of it was never meant to be anything, it was just a way for me to explore. The more I wrote, the more I knew what was going on.

At that point, reading other people’s code would have done nothing for me. I wasn’t at a level where I could grasp what other people were doing. It took a long time until I was ready to actually peek into other people’s code and be able to make sense of it. More to the point, it took a long time until I was able to actually learn something from that, rather than just going with a targeted “what do I need to make X work”.

Having other people there to help can be very useful, but it can also be a crutch. At least initially, you need to fall down a lot to figure things out. Mostly because people have a very hard time telling you how they found the problem in your code. “It’s obvious that this is here” doesn’t give you much to learn from, except possibly that you are stupid for missing the obvious. A lot of the advice that this tweet got is absolutely something that I can get behind, but I would put it significantly later in the process.

Keeping track of long running branches

time to read 2 min | 387 words

I talked about the Merge Games somewhat in jest, but more seriously, there is a lot to worry about once you have long running branches. In our case, it isn’t so much that we have a lot of long running branches as that we have a ton of changes happening in multiple branches in parallel, and it can sometimes take a few weeks until the work is done and we can merge it all.

This puts a lot of pressure on the code review part of the process. One of the things that I really like with GitHub is the PR / review process, and it works great when you have small commits / PRs. The problem is that when you are talking about a large scope of work, you are left with few options for proper review.

One option is to get a PR with dozens of commits, and having to slog through each of them to understand what is going on. Another is to get a PR with a single commit that contains a lot of changes. This means that you have to grasp the whole change in one shot. Either option is really hard, and can lead the reviewer to skim through the code. That isn’t something that we want to do; instead, we really want to pay as much attention to the code as we did while writing it.

My process for handling this is to lean heavily on GitHub. What I do is create a PR very early in the process, sometimes immediately after the first commit in that branch. That gives me the ability to review things incrementally. Instead of having to deal with it all at once, I can review the changes as they come in. Whenever one of the developers pushes their commits, I’ll get a notification and be able to go over the details and comment on the spot.

That shortens the feedback cycle and removes a lot of the complexity from the review process. It also means that we can more easily notice that one developer is doing something that is also being done by another team, so we can integrate the work earlier in the process.
