Ayende @ Rahien

Oren Eini aka Ayende Rahien CEO of Hibernating Rhinos LTD, which develops RavenDB, a NoSQL Open Source Document Database.

You can reach me by:

oren@ravendb.net

+972 52-548-6969

Posts: 6,990 | Comments: 49,635

time to read 2 min | 303 words

The coronavirus is a global epidemic and has had impact on the entire world. It has shaken the foundations of our society and has made things that would seem like bad science fiction a reality. The fact that most of the world is now under some form of quarantine is something that I would expect to see in a disaster movie, not in real life. And if we are in a disaster movie, I would like to jump over to the next scene please, and to formally submit a protest to the writers’ union.

Like everyone else, we have had to adjust to a very different mode of operations. Hibernating Rhinos is a distributed company, in the sense that we have teams working in different countries and continents. However, the majority of our people work in one of two offices (in Israel and in Poland). We have recently moved to a brand new office space which I’m incredibly proud of, and which now sits empty. Given the needs of the current times, we have shifted to fully remote work across the board. Beyond the upheaval of normal life, we see that many customers and users are facing the same challenges we do.

For that purpose, I have decided to offer all RavenDB customers two months of free support, to help navigate the challenges ahead. A lot of organizations are scrambling, because the usual workflow is interrupted and things aren’t as they used to be. In these trying times, we want to make it as simple as possible for you to make use of RavenDB.

This offer applies to support for RavenDB on your own machines as well as RavenDB Cloud.

With the hope that we would look back at this as we now look at Y2K, I would like to close by urging you to be safe.

time to read 4 min | 654 words

A user reported that a particular query returned the results in an unexpected order. The query in question looked something like the following:

[image: the query, ordering by score() and then by sales]

Note that we first search by score(), and then by the amount of sales. The problem was that documents that should have had the same score were sorted in different locations.

Running the query, we would get:

[image: the query results]

But all the documents have the same data (so they should have the same score), and when sorting by descending sales, 2.4 million should rank higher than 62 thousand. What is going on?

We looked at the score, here are the values for the part where we see the difference:

  • 1.702953815460205
  • 1.7029536962509155

Okay… that is interesting. You might notice that the query above has include explanations(), which will give you the details of why we have sorted the data as we did. The problem? Here is what we see:

[image: the include explanations() output]

I’m only showing a small piece, but the values are identical on both documents. We managed to reduce the issue to a smaller data set (a few dozen documents instead of tens of thousands), but the actual issue remained a mystery.

We had to dig into Lucene to figure out how the score is computed. In a land of indirection and virtual method calls, we ended up tracing the score computation for those two documents. Here is how Lucene computes the score:

[image: how Lucene computes the score]

They sum the various scoring factors to get the final value (highly simplified). But I checked, and the data is the same for all of the documents. Why do we get different values? Let’s look at things in more detail, shall we?

[image: the individual scoring values]

Here is the deal: if we add all of those together in a calculator, we’ll get: 1.702953756

This is close, but not quite what we get from the score. That is probably because the calculator uses arbitrary precision numbers, while Lucene uses floats. But the problem remains: all of the documents in the query have the exact same numbers, so why do we get different values?

I then tried to figure out what was going on there. The way Lucene handles the values, each subsection of the scoring (part of the graph above) is computed on its own and then summed. That still doesn’t explain what is going on. Then I realized that Lucene uses a heap of mutable values to store the scorers as it scores the values. Whenever we score a document, we mark the scorer as a match and then put it at the end of the heap. But the order of the items in the heap is not guaranteed.

Usually, this doesn’t matter, but please look at the above values and consider the following code:

[image: summing the same values in two different orders]

What do you think are the values of c and d in the code above?

  • c = 1.4082127
  • d = 1.4082128

Yes, the order of operations for addition matters a lot for floats. This is expected, given the way floating point numbers are represented in memory, but it still manages to surprise.
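The effect is easy to demonstrate without Lucene at all. Here is a minimal Python sketch (the numbers are illustrative, not the actual scoring factors from the query above) showing that summing the same values in two different orders produces two different results:

```python
# The same three numbers, summed in two different orders.
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c   # 0.1 + 0.2 is already inexact
right_to_left = a + (b + c)   # 0.2 + 0.3 happens to round to exactly 0.5

print(left_to_right)                   # 0.6000000000000001
print(right_to_left)                   # 0.6
print(left_to_right == right_to_left)  # False
```

Floating point addition is not associative, so any code that sums partial results in a nondeterministic order (like the mutable heap of scorers above) can produce slightly different totals from identical inputs.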

The fact that the order of operations on the Lucene scorer is not consistent means that you may get some very subtle bugs. Reproducing this bug requires a very careful setup and is incredibly delicate; change pretty much anything and it goes away. And yet it tripped me hard enough that I lost most of a day trying to figure out exactly where we went wrong.

Really annoying.

time to read 2 min | 219 words

I announced the beta availability of RavenDB 5.0 last week, but I messed up some of the details on how to enable it. In this post, I’ll give you a detailed guide on how to set up RavenDB 5.0 for your use right now.

For the client libraries, you can use our MyGet feed. At this time, you can run:

Install-Package RavenDB.Client -Version 5.0.0-nightly-20200321-0645 -Source https://www.myget.org/F/ravendb/api/v3/index.json

If you want to run RavenDB on your machine, you can download from the downloads page, click on the Nightly tab and select the 5.0 version:

[image: downloads page, Nightly tab, 5.0 version]

And on the cloud, you can register a (free) account and then, add a product:

[image: adding a product in RavenDB Cloud]

Create a free instance:

[image: creating a free instance]

Select the 5.0 release channel:

[image: selecting the 5.0 release channel]

And then create the RavenDB instance.

Wait a few minutes, and then you can connect to your RavenDB 5.0 instance and start working with the new features.

You can also run it with Docker using:

docker pull ravendb/ravendb-nightly:5.0-ubuntu-latest

time to read 3 min | 523 words

A RavenDB user called us with a very strange issue. They are running on RavenDB 3.5 and ran out of disk space. That is expected, since they are storing a lot of data. Instead of simply increasing the disk size, they decided to take the time and increase the machine’s overall capacity. They moved from a 16 core machine to a 48 core machine with a much larger disk.

After the move, they found out something worrying. RavenDB now used a lot more CPU. If previously the average load was around 60% CPU utilization, they were now looking at 100% utilization on a much more powerful machine. That didn’t make sense to us, so we set out to figure out what was going on. A couple of mini dumps later, we were able to tell what was happening.

It got really strange because there was the following interesting observation:

  • Under minimal load / idle – no CPU at all
  • Under high load – CPU utilization around 40%
  • Under medium load – CPU utilization at 100%

That was strange. When there isn’t enough load, we are at 100%? What gives?

The culprit was simple: BlockingCollection.

“Huh”, I can hear you say. “How can that be?”

A BlockingCollection should not be the cause of high CPU, right? It is in the name, it is blocking. Here is what happened. That blocking collection is used to manage tasks, and by default we are spawning threads to handle that at twice the number of available cores. All of these threads are sitting in a loop, calling Take() on the blocking collection.

The blocking collection is internally implemented using a SemaphoreSlim, which calls Wait() and Release() as needed. Here is the Release() method notifying waiters:

[image: SemaphoreSlim Release() notifying waiters]

What you can see is that if we have more than a single waiter, we’ll wake all of them. The system in question had 48 cores, so we had 96 threads waiting for work. When we add an item to the collection, all of them will wake and try to pull an item from the collection. One of them will succeed, and the rest will not.
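To make the wake-all behavior concrete, here is a toy Python sketch (RavenDB and SemaphoreSlim are .NET, of course; the class and counter names here are purely illustrative). Every put() wakes every waiter, even though only one of them can actually take the item:

```python
import threading
from collections import deque

class WakeAllQueue:
    """Toy queue that wakes *every* waiter on each put(), mimicking
    the wake-all behavior described above."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()
        self.wakeups = 0  # how many times a waiter woke up inside get()

    def put(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify_all()  # wake ALL waiters, not just one

    def get(self):
        with self._cond:
            while not self._items:   # most woken threads loop right back here
                self._cond.wait()
                self.wakeups += 1
            return self._items.popleft()
```

With 96 threads blocked in get(), a single put() produces 96 wakeups but only one successful dequeue; the other 95 threads burn a cycle and go back to waiting. The fix described at the end of the post amounts to waking one waiter per item instead of all of them.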

Here is the relevant code:

[image: the relevant waiting code]

As you can imagine, that means that we have 96 threads waking up and spending a full cycle just spinning. That is the cause of our high CPU.

If we have a lot of work, then the threads are busy actually doing work, but if there is just enough work to wake the threads, but not enough to give them something to do, they’ll set forth to see how hot they can make the server room.

The fix was to reduce the number of threads waiting on this queue to a more reasonable number.

The actual problem was fixed in .NET Core, where the SemaphoreSlim will only wake as many threads as there are items freed, which avoids the spin storm that this code generates.

time to read 3 min | 454 words

I care a lot about the performance of RavenDB, as you might have noticed from this blog. A few years ago we had a major architectural shift that increased our performance by a factor of ten, which was awesome. But appetite comes with eating, and we are always looking for better performance.

One of the things we did with RavenDB is build it so we’ll have the seams in place to change the internal behavior without users noticing how things work behind the scenes. We have used this capability a number of times to improve the performance of RavenDB. This post is about one such scenario that we recently implemented, which will go into RavenDB 5.0.

Consider the following query:

[image: the date range query]

As you can see, we are doing a range based query on a date field. Now, the source collection in this case has just over 5.9 million entries and there are a lot of unique elements in the specified range. Let’s consider how RavenDB will handle this query in version 4.2:

  • First, find all the unique CreatedAt values between those ranges (there can be tens to hundreds of thousands).
  • Then, for each one of those unique values, find all the matching documents (usually, only one).

This is expensive, and the problem almost always shows up when doing date range queries over non-trivial ranges, because that combines the two problematic elements: many unique terms and very few results per term.
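The difference between the two strategies can be sketched in a few lines of Python (a toy model with timestamps as plain integers, not Lucene’s actual data structures): the term-based approach must visit every unique value in the range, while the numeric approach is just two binary searches over a sorted array:

```python
import bisect
from collections import defaultdict

# Toy data set: (doc_id, created_at), with created_at as a plain integer.
docs = [(i, 1_000 + i) for i in range(10_000)]

# Strategy 1 (4.2 style): a term index mapping each unique value to its
# matching documents. A range query must visit every unique term in the
# range, usually finding just one document per term.
term_index = defaultdict(list)
for doc_id, created in docs:
    term_index[created].append(doc_id)

def range_by_terms(lo, hi):
    result = []
    for term in term_index:          # visits every unique term
        if lo <= term <= hi:
            result.extend(term_index[term])
    return result

# Strategy 2 (5.0 style): a numeric index, i.e. a sorted array. The whole
# range is located with two binary searches, then read sequentially.
numeric_index = sorted((created, doc_id) for doc_id, created in docs)
keys = [k for k, _ in numeric_index]

def range_numeric(lo, hi):
    start = bisect.bisect_left(keys, lo)
    end = bisect.bisect_right(keys, hi)
    return [doc_id for _, doc_id in numeric_index[start:end]]
```

Both return the same documents, but the first does work proportional to the number of unique terms while the second does two O(log n) searches plus a sequential read of the results.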

The general recommendation was to avoid running the query above and instead use:

[image: the numeric range workaround query]

This allows RavenDB to use a different method for the range query, based on numeric values rather than distinct string values. The performance difference is huge.

But the second query is ugly and far less readable. I don’t like such a solution, even if it can serve as a temporary workaround. Because of that, we implemented a better system in RavenDB 5.0. Behind the scenes, RavenDB now translates the first query into the second one. You don’t have to do anything to make it happen (though when migrating from 4.2 instances, you’ll need to reset the index to get this behavior). You just use dates as you would normally expect to, and RavenDB will do the right thing and optimize it for you.

To give you a sense of the difference in performance, the query above on a data set of 5.9 million records has the following timings:

  • RavenDB 4.2 – 7,523 ms
  • RavenDB 5.0 – 134 ms

As you can imagine, I’m really happy about this kind of performance boost.

time to read 1 min | 99 words

RavenDB 5.0 release train is gathering steam as we speak. The most recent change to talk about is the fact that you can now deploy RavenDB 5.0 beta in RavenDB Cloud:

[image: RavenDB 5.0 beta option in RavenDB Cloud]

This allows you to start experimenting with the newest features of RavenDB, including the time series capabilities.

Please take a look, the RavenDB 5.0 option is available at the free tier as well, so I would encourage you to give it a run.

As always, your feedback is desired and welcome.

time to read 2 min | 303 words

Recently the time.gov site had a complete makeover, which I like quite a bit. I don’t really have much to do with time in the US in the normal course of things, but the site has a really interesting feature that I love.

Here is what this shows on my machine:

[image: time.gov showing the local clock drift]

I love this feature because it showcases a real world problem very easily. Time is hard. The concept of time we carry in our heads is completely wrong in many cases, and that leads to interesting bugs. In this case, the second machine will be adjusted at midnight from the network and the clock drift will be fixed (hopefully).

What will happen to any code that runs when this happens? As far as it is concerned, time will move back.

RavenDB has a document expiration feature: you can set a time for a document to go away. We had a bug which caused us to read the expiration entries due at time T and then delete the documents that are older than T. Except that in this case, the two Ts weren’t the same. We travelled back in time (and the log was confusing) and got an earlier result on the second read. That meant that we removed the expiration entries but not their related documents. When the time moved forward enough again for those documents to expire, the expiration records were already gone.

As far as RavenDB was concerned, the documents were updated to expire in the future, so the expiration records were no longer relevant. And the documents never expired, ouch.

We fixed that by remembering the original time at which we read the expiration records. I’m comforted knowing that we aren’t the only ones having to deal with this.
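The shape of the fix can be sketched as follows (a Python illustration with made-up function names, not RavenDB’s actual code): the clock is read exactly once, so even if it jumps backwards between the two steps, both steps agree on the cutoff:

```python
def run_expiration(clock, read_expiration_entries, delete_documents_before):
    # Read the clock exactly once; both steps below use the same cutoff,
    # so a backwards clock adjustment between them cannot change "now".
    cutoff = clock()
    entries = read_expiration_entries(cutoff)
    delete_documents_before(cutoff, entries)

# Simulate a clock that would jump backwards on a second read:
readings = iter([100.0, 95.0])
calls = []
run_expiration(
    clock=lambda: next(readings),
    read_expiration_entries=lambda t: calls.append(("read", t)) or [],
    delete_documents_before=lambda t, entries: calls.append(("delete", t)),
)
print(calls)  # both steps saw the same cutoff
```

Had the buggy version read the clock twice, the delete step would have run with cutoff 95.0, missing documents that the read step had already consumed expiration entries for.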
