Ayende @ Rahien

Oren Eini aka Ayende Rahien CEO of Hibernating Rhinos LTD, which develops RavenDB, a NoSQL Open Source Document Database.

You can reach me by:

oren@ravendb.net

+972 52-548-6969

Posts: 7,127 | Comments: 50,009

Privacy Policy Terms
filter by tags archive
time to read 2 min | 225 words

I have a consultant that did some work for me. While the majority of our people are working in Israel and Poland, we actually have people working for us all over the map.

The consultant submitted their invoice at the end of the month and we sent a wire transfer to the provided account. So far, pretty normal and business as usual. We do double verification of account details, to avoid common scams, by the way, so we know that the details we sent were correct. Except… the money never arrived.

When we inquired, it turned out that the money transfer was reversed. The reason why? This is the address that the consultant provided (not the actual address, mind, but has the same issue). Note that the

Mr. Great Consultant
1234 Cuba Avenue
Alta Vista, Ottawa, K1G 1L7
Canada

The wire transfer was flagged as potential international sanctions violation and refused.

That was… very strange.

It appears that someone saw Cuba in the address, decided that this is a problem and refuse the transfer.  I’m not sure if I would rather that this is the case of an over active Regex or a human not applying critical thinking.

We are now on week two of trying to resolve that with the bank and it is quite annoying.

Next port of call, buying Monero on the dark web… Smile.

time to read 2 min | 371 words

I posed a potential problem for a job interview. Given the following function, generate another key that has the same shard:

In other words, give “users/71” and a prefix of “orders/654”, generate a key that would be placed on the same shard as “users/71”. The answer in this case can be: “orders/654-vaueaa”.

In order to answer the question, we need to understand what is going on here. The function above is a fancy way to extract 16 bits of information from the key using a cryptographically hash function. MD5 is no longer considered secured, but given the fact that I’m needing just 16 bits, that is not an issue. The code above is slightly more complex than needed, I could simplify it to this and have the same effect (but not the same result, mind):

The need to generate a matching shard id is another way to say that we need a hash collision. Given that the key space is 2^16, and that we can assume that any mutation to the key will result effectively random changes to the result, we can simply generate different keys and try to see if they match. Here is a simple way to do so:

We are effectively throwing a dice and seeing if this match. So it is a probability game to wait until we have a collision. The actual implementation isn’t that important, what is interesting is to talk about the implications here:

  • Are there better ways to go about doing something like this? Not really, given that MD5 isn’t that broken.
  • How much time will it take to generate a shard id match? The answer, usually around 64K tries. But why is interesting. The birthday attack issues don’t play here, because we don’t need to match to multiple items, just one. So we role the dice and see if we match on the value.
  • Can we speed this up? Using a different hash function would probably help, yes.
  • What other ways do we have to handle this? Different shard id generation would allow much better alternative.

The last question is where we get into more interesting details about system design, ergonomics of the choices we make and get to see how the candidate actually thinks.

time to read 1 min | 118 words

Here is something that came up that I liked as task to handle in a job interview. We have the following function, generating a shard id from a document key:

There is a need to create a new document that must reside on the same shard as another document. The task is to write a function that would generate that id.

For example:

  • Key to match: users/71
  • Document id to modify: orders/654

The result can be: orders/654-22261 or orders/654-vaueaa

In other words, both orders/654-vaueaa and users/71 will have the same shard.

I like this because it is very easy to explain and has simple coding requirements. But it allows to go deep into several different fields of knowledge to understand the candidate’s solution.

time to read 3 min | 523 words

A RavenDB user called us with a very strange issue. They are running on RavenDB 3.5 and have run out of disk space. That is expected, since they are storing a lot of data. Instead of simply increasing the disk size, they decided to take the time and simply increase the machine overall capacity. They moved from a 16 cores machine to a 48 cores machine with a much larger disk.

After the move, they found out something worrying. RavenDB now used a lot more CPU. If previous the average load was around 60% CPU utilization, now they were looking at 100% utilization on a much more powerful machine. That didn’t make sense to us, so we set out to figure out what was going on. A couple of mini dumps and we were able to figure out what was going on.

It got really strange because there was the following interesting observation:

  • Under minimal load / idle – no CPU at all
  • Under high load – CPU utilization in the 40%
  • Under medium load – CPU utilization at 100%

That was strange. When there isn’t enough load, we are at a 100%? What gives?

The culprit was simple: BlockingCollection.

“Huh”, I can hear you say. “How can that be?”

A BlockingCollection should not be the cause of high CPU, right? It is in the name, it is blocking. Here is what happened. That blocking collection is used to manage tasks, and by default we are spawning threads to handle that at twice the number of available cores. All of these threads are sitting in a loop, calling Take() on the blocking collection.

The blocking collection internally is implemented as using a SemaphoreSlim, which call Wait() and Release() on the values as needed. Here is the Release() method notifying waiters:

image

What you can see is that if we have more than a single waiter, we’ll update all of them. The system in question had 48 cores, so we had 96 threads waiting for work. When we add an item to the collection, all of them will wake and try to pull an item from the collection. Once of them will succeed, and then rest will not.

Here is the relevant code:

image

As you can imagine, that means that we have 96 threads waking up and spending a full cycle just spinning. That is the cause of our high CPU.

If we have a lot of work, then the threads are busy actually doing work, but if there is just enough work to wake the threads, but not enough to give them something to do, they’ll set forth to see how hot they can make the server room.

The fix was to reduce the number of threads waiting on this queue to a more reasonable number.

The actual problem was fixed in .NET Core, where the SemaphoreSlim will only wake as many threads as it has items to free, which will avoid the spin storm that this code generates.

time to read 2 min | 254 words

We run a lot of benchmarks internally and sometimes it feels like there is a roaming band of performance focused optimizers that go through the office and try to find under utilized machines. Some people mine bitcoin for fun, in our office, we benchmark RavenDB and try to see if we can either break a record or break RavenDB.

Recently a new machine was… repurposed to serve as a benchmarking server. You can call it a right of passage for most new machines here, I would say. The problem with that machine is that the client would error. Not only would it fail, but at the exact same interval. We tested that from multiple clients and from multiple machines and found that every 30 minutes on the dot, we’ll have an outage that lasted under one second.

Today I come to the office to news that the problem was found:

image

It seems that after 30 minutes of idle time (no user logged in), the machine would turn off the ethernet, regardless of if there are active connections going on. Shortly afterward it would be woken up, of course, but it would be down just enough time for us to notice it.

In fact, I’m really happy that we got an error. I would hate to try to figure out latency spikes because of something like this, and I still don’t know how the team found the root cause.

time to read 2 min | 252 words

In my previous post, I asked you to find the bug in the following code:

This code looks okay, at a glance, but it turns out that this is a really nasty data corruption bug waiting to happen. Here is what the problematic usage looks like:

Do you see the error now?

If the operation will time out, an exception will be raised, but the underlying operation isn’t over. We are using a shared pool, so the buffer we use may be handed over to someone else. At this point, we do something with the buffer, but the pending I/O operation will read data into this buffer, meaning that this is probably going to be garbage in it when we actually use it.

To actually happen, you need to have a timeout operation, reuse of the buffer and the I/O operation completing at just the wrong time. So a sequence of highly unlikely events that would assuredly happen within an hour of pushing something like that to production. For fun, this will reliably happen the moment you have some network issues. So imagine that you have a slow node, which then cause memory corruption, which end up being a visible bug (instead of maybe aborted request) very rarely, and with no indication on how this happened.

How do you fix this? Like this:

This will use a cancellation token, which will cause the operation to be aborted at the stream level, meaning that we can safely reuse values that we passed the underlying stream.

time to read 2 min | 400 words

Hadi’s had an interesting Tweet:

This sounds reasonable on the surface, but it runs into hard issues with the vast differences between any amount of money vs. free. There have been quite a few studies on this. A good reading on the subject can be found here. Take a note of the Amazon experience. Shipping that is free increases sales. Shipping that is 0.2$ does not increase sales. Note that from a practical standpoint, there is no real difference between the prices, but it matters, quite a lot.

Leaving aside the psychology of free, there are also other issues. Let’s say that there is a tool that can save someone on my team 10 minutes a day, it is priced at 1$ / year. By any measure you care to name, that is more than worth it.

But the process of actually getting this purchased is likely to be more expensive than the actual cost of the tool. In my case, the process is “go talk to Oren”. In other environments, that may involve “talk to your boss, that will submit it to accounts payable, which will pay it in 60 days”.

And at that point, charging 1$ just doesn’t matter. You could charge 50$, and the “cost” for the person making the purchase would be pretty much the same. Note that this is for corporate environment, the situation is (slightly) different for sales directly to consumers, but not significantly so. Free stills trumps everything.

When talking about freemium models, you hear quotes that are bad. For example, Evernote had a < 2% conversion rate, and that is a wildly successful example. Dropbox has a rate of about 4%, and the average seems to be 1%. And that is for businesses who are focused on optimizing the freemium funnel. That takes a lot of time and effort, mind you.

I don’t think that there is a practical option for turning this around, and I say that as someone who would love it if that were at all possible.

time to read 2 min | 245 words

I’ll be writing a lot more about our RavenDB C++ client, but today I was reviewing some code and I got a reply that made me go: “Ohhhhh! Nice”, and I just had to blog about it.

image

This is pretty much a direct transaction of how you’ll write this kind of query in C#, and the output of this is a RQL query that looks like this:

image

The problem is that I know how the C# version works. It uses Reflection to extract the field names from the type, so we can figure out what fields you are interested in. In C++, you don’t have Reflection, so how can this possibly work?

What Alexander did was really nice. Given that the user already have to provide us with the serialization routine for this type (so we can turn the JSON into the types that will be returned). Inside the select_fields() call, he constructed an empty object, serialize that and then use the field names in the resulting JSON to figure out what fields we want to project from the Users documents.

It make perfect sense, it require no additional work from the user and it gives us consistent API. It is also something that I would probably never think to do.

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. Building a social media platform without going bankrupt (10):
    05 Feb 2021 - Part X–Optimizing for whales
  2. Webinar recording (12):
    15 Jan 2021 - Filtered Replication in RavenDB
  3. Production postmortem (30):
    07 Jan 2021 - The file system limitation
  4. Open Source & Money (2):
    19 Nov 2020 - Part II
  5. re (27):
    27 Oct 2020 - Investigating query performance issue in RavenDB
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats