Ayende @ Rahien

Sleep as currency? I would buy some...

Boldly & confidently fail, it is better than the alternative

Recently I had the chance to sit with a couple of the devs in the RavenDB Core Team to discuss “keep & discard” habits*.

The major problem we now have with RavenDB is that it is big. And there are a lot of things going on there that you need to understand. I ran the numbers, and it turns out that the current RavenDB contains:

  • 835,000 Lines of C#
  •   67,500 Lines of TypeScript
  •   87,500 Lines of HTML

That is divided into many areas of functionality, but that is still a big chunk of stuff to go through. And that is ignoring things that require us to understand additional components (like Esent, Lucene, etc). What is more, a lot of expertise goes into understanding what is going on in terms of the full picture. For example: “We limit this value here because too much of it would result in high memory consumption under this set of circumstances.”

The problem is that it takes time, and sometimes a lot of it, to get a good understanding of how things all come together. In order to handle that, we typically assign new devs issues from all around the code base. The idea isn’t so much to give them a chance to become experts in a particular field, but to make sure that they get the general idea of how the code is structured and how the project comes together.

Over time, people tend to gravitate toward a particular area (M** is usually the one handling the SQL Replication stuff, for example), but that isn’t fixed (T fixed the most recent issue there), and the areas of responsibility shift (M is doing a big task, we don’t want to disturb him, let H do that).

Anyway, back to the discussion that we had. What I realized is that we have a problem. Most of our work is either new features or fixing issues. That means that nearly all the time, we don’t really have any fixed template to give developers “here is how you do this”. A recent example was an issue where invoking smuggler with a particular set of filters would result in very high cost. The task was to figure out why, and then fix this. But the next task for this developer is to do sharded bulk insert implementation.

I’m mentioning this to explain a part of the problem. We don’t see a lot of “exactly the same as before”, and a new dev on the team leans on the other members quite heavily initially. That is expected, of course, and encouraged. But we identified a key problem in the process. Because the other team members also don’t have a ready made answer, they need to dig into the problem before they can offer assistance, which sometimes (all too often, to be honest) leads to a “can you slide the keyboard my way?” and taking over the hunt. The result is that the new dev does learn, but a key part of the process is missing: finding out what is going on.

We are going to ask both sides of this interaction to keep track of that, and stop it as soon as they realize that this is what is going on.

The other issue that was raised was the issue of fear. RavenDB is a big system, and it can be quite complex. It is a quite reasonable apprehension: what if I break something by mistake?

Here it comes back to the price of failure. Trying something out means that at worst you wasted a work day, nothing else. We are pretty confident in our QA process and system, so we can allow people to experiment. Analysis paralysis is a much bigger problem. And I wasn’t being quite accurate: trying the wrong thing isn’t wasting a day, because you learned what doesn’t work, and hopefully also why.

“I have not failed. I've just found 10,000 ways that won't work.”
Thomas A. Edison

* Keep & discard is a literal translation of a term that is very common in the IDF. After most activities, there is an investigation performed, and one of the first questions asked is what we want to keep (good things that happened that we need to preserve for the next time we do this) and what we need to discard (bad things that we need to watch out for).

** The actual people are not relevant for this post, so I’m using letters only.


My view on crowd funding

After my previous post, I was asked what I’m thinking about the notion of crowd funding, which is currently all the rage.

The answer is complicated. I’m focusing right now on things like Kickstarter and its siblings, because I’m familiar with how they work. The basic premise is pretty great. You have some idea (usually a product) that requires initial capital and has some well known market. By directly contacting the target audience, we can get the seed money, judge demand and have very low risk overall. The “investors” put in small amounts of money, the loss of which they can tolerate without hardship. The project gets money for very little effort and gets great marketing along the way.

This is great, if you are doing a product. Something that can be sold. For instance, let us say that we want to do a major feature, like adding time series capabilities to RavenDB. Let us say that we start a Kickstarter campaign for this, asking for 150,000 USD and promising backers that they’ll get a free license out of early sponsorship.

I’ll get into the exact costs associated with this option in a bit. But before we go there, remember the premise of my previous post. It isn’t money to build a specific product. It is money that is required to purchase something for the business itself. Of course, buying that cool car will raise morale and I have a spreadsheet that says that it will increase the effectiveness of the team by 17.4% (although it will decrease parking space by 37%). So it makes sense to go with that, from a business perspective. However, there is very little that I can do to actually make people want to back a “we want a cool car” notion. At least, I don’t think so, but the internet does have some dark corners.

Back to the notion of using this to build products. There is a very basic problem here. RavenDB isn’t targeting individuals. It is a database platform, and most of our customers are businesses or enterprises. That leads to a very different mindset. Speculative investment in something like this is going to be much rarer, harder and fraught with issues. An open source project can do that, and it makes sense for a business to invest in a project it is using, but there are very few who actually manage to do that. A quick search of Kickstarter doesn’t show any major open source project soliciting funds there.

Kickstarter makes sense for personal stuff, things that you actually get to hold, or need to buy. Something of some scarcity. Doing this for commercial software makes very little sense, and for open source, it is an even bigger problem. For open source projects that depend on donations, usually you have a valid commercial reason for people to donate (Linux, Wikipedia, etc).

I’m open to contrarian points of view, mind. But I don’t think that crowd funding is applicable for the kind of things that I would want to use it for.

Funding options

This is a divergence from my usual discussion on technical stuff. In this post, I want to talk about money. In particular, how you get it from other people. Note that I am neither an expert nor qualified to talk about the subject matter; this post came out of a lot of scribbled notes and is mostly meant to serve as a way to lay down a line of thought. All numbers are made up, and while I would like such a car, it would be mostly to inflict it on the employee of the month.

There are many cases in the lifecycle of a business where you need more cash than you currently have (or are willing to spend outright).

A common scenario is when you start a business, or when you want to expand it. For our discussion, we’ll use the example of the following drool worthy car:

I consider such a piece of art priceless, but let us say that I managed to convince the owner to sell it to me for the nice sum of 1,000,000$.

Unfortunately, I don’t have 1,000,000$. I only have 650,000$. So long, beautiful car, it was very nice to know you, but it is just not possible. Except that there is this thing where people give you a lump sum of money, and you give it back over time (although usually more than you got).

Funding is important for businesses in the same manner that breathing is for people. There are typically several ways to fund a business:

  • Direct cash infusion – That is usually how most businesses start. The amount of money put into the business depends on what it needs to do. A web developer would need the money to buy a laptop and a Starbucks loyalty card, so that is easy. For a restaurant, you need enough money for rent, employees, equipment, etc. The smaller the amount you need to put into the business to kick start it, the easier it is to just use your own savings to do so.
  • Partners – This is pretty much the same as the previous one, but instead of having only one person do that, you have multiple people and more savings to dip into.
  • Angels/Investors – Those are people who for various reasons would give you money. Sometimes this is because they are related to you, but oftentimes it is a calculated move, investing some money in a business in order to get a stake in it and cash it in afterward.
  • Government development loan / grant – Sometimes you can get this, and they usually have both very good terms, and really strict rules, regulations and hoops to jump through.
  • Bank / credit loan – Well, you are presumably familiar with that. You get a loan, pay interest, mortgage some assets, etc.
  • Self funded – Your business is making more money than it is spending, therefore you have money to spend on the business.

The best choice is self funding, because that means that you are profitable and aren’t beholden to someone. The others really depend on personal preferences. Here are mine:

  • Direct cash infusion – That works for starting a business with low starting overhead costs (see: single developer shop). It might also be viable if you have a lot of personal wealth that you can put into the business, but personally, I like to think about the money flowing in the other direction. Otherwise that is an indication that there is something strange going on here.
  • Partners – I used to work at a place that was owned by 7 founding members + 1 “silent partner”. I still remember when the entire company got an email from a co-CEO that was basically: “You are forbidden to discuss project X or anything related to it with the other co-CEO”. That left an… impression, shall we say. Also, this is again something that you would usually do in the beginning. Bringing a partner into an existing business implies one of a few things. Either you are in big trouble (personally or as a business) and need a cash infusion that you can’t/won’t supply, or you are doing really well and people are flocking to join you.
  • Investors/Angels – This is very similar to the previous point, with the caveat that investors usually aren’t going to meddle in the day to day affairs, nor are they going to shoulder any burden. They are there to provide the money and some expertise/networking, but that is basically it. They do create a pretty huge amount of bureaucracy, reports, compliance, etc. The investors need to know that you aren’t blowing away their money, after all.
  • Government development loan / grant – This is pretty much the same as the previous one, only the investor is the government. If you thought that investors generated a lot of paperwork, you were mistaken.

The remaining two options are self funding and getting a loan. Now, assuming that no one else buys this magnificent car, I can put some numbers in Excel and predict that in a couple of years, I’ll have enough money to buy it outright. So all I need to do is ask the owner to not sell it to anyone, hope that my cash flow remains according to projections, hope the price doesn’t change and just wait.

Of course, that means that I can’t lift morale by making this the official company car in the meantime. I’m losing quite a lot of amusing moments by waiting, and that is assuming that it is still possible in two years. Of course, if in two years I would have the money to do so, I’m not so sure that I would still want to just purchase it directly. That would mean having no money at all. And that is kind of scary, because salaries need to be paid, and this car doesn’t look like it has a good gas/mileage ratio.

So the option that we have left is taking a loan. The nice thing about doing that is that we can mortgage the actual asset that we are buying, this magnificent car. Now, the bank may not value it as much as I do, so they are going to give it a price of only 900,000$, and then they are going to only agree to fund 80% of that, which gives us 720,000$.

In other words, that means that we need to pony up 280,000$ (the 1,000,000$ price minus the 720,000$ loan), which is much more reasonable, and leaves us with a bit of a free cash cushion. That leads to a few interesting observations:

  • The loan amount and the money we already have are comparable. That means that the bank is going to be much nicer to us than if we wanted to borrow much more money than we already have (on the assumption that if we got this amount of money once, we’ll be able to get it again to pay them).
  • There is a valid asset to mortgage, which reduces the loan risk (and thus gets us better terms).
  • The current interest environment is at an all-time low, which means that this is a great time to borrow money (and a bad time to try to save).

This means that this is a much simpler deal than going to a bank with a business plan and hoping that they will believe that we can make it. Now, let us get down to the financial details.

An offer from bank A is for an interest rate of 4%. Over ten years, that gives us a monthly payment of 7,290$.

An offer from bank B is for an interest rate of 4.25%, which gives a monthly payment of 7,375$.

That is a simple numbers game, and we are pretty much done at this point, right? Almost, but let us project this over 10 years, and see where that puts us.

Bank A: Total amount of interest paid is 154,800$

Bank B: Total amount of interest paid is 165,000$

In other words, the total difference is 10,200$. That means that while it is still a numbers game, it isn’t just the interest rate. The reason is that we now need to consider a lot more aspects. For example, Bank B may have an easier loan approval process, or require less security, or value the car higher than bank A. Bank A doesn’t allow early cash out, while bank B does, or a million and one other differences.

The question now becomes whether the other stuff beyond the raw interest rate can be quantified, and whether it is worth more than 10,000$.
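The monthly payments and interest totals above can be reproduced with the standard fixed-rate amortization formula. What follows is a minimal sketch, assuming monthly compounding and a 720,000$ principal; the `Loan` class and its method names are made up for illustration and are not part of any library.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class Loan
{
    // Standard amortization formula: payment = P * r / (1 - (1 + r)^-n),
    // where r is the monthly rate and n is the number of monthly payments.
    public static double MonthlyPayment(double principal, double annualRatePercent, int years)
    {
        var r = annualRatePercent / 100.0 / 12.0;
        var n = years * 12;
        if (r == 0)
            return principal / n; // interest free loan, just split the principal
        return principal * r / (1 - Math.Pow(1 + r, -n));
    }

    // Everything paid over the life of the loan, minus what was borrowed.
    public static double TotalInterest(double principal, double annualRatePercent, int years)
    {
        return MonthlyPayment(principal, annualRatePercent, years) * years * 12 - principal;
    }
}
```

Calling `Loan.MonthlyPayment(720000, 4.0, 10)` and `Loan.MonthlyPayment(720000, 4.25, 10)` lands within a dollar of the two monthly figures quoted above. The interest totals come out within about a hundred dollars of the quoted ones, since those appear to be computed from the rounded monthly payment.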

As I said earlier in this post, this is mostly settling things in my mind. Feel free to ignore this post altogether.

RavenDB 3.5 Features: Data Exploration

RavenDB does a pretty great job as a production database. In fact, we designed it up front to only have features that make sense for robust production systems.

In particular, we don’t have any form of ad-hoc queries. A query always hits an index, so it is very fast. Even what we call dynamic queries in RavenDB are actually creating an index behind the scenes. This is pretty awesome for normal production usage, but it does have some limitations when you want to explore the data. This can be because you are a developer trying to find a particular something, and you just want to quickly fire off random queries. You don’t care about the costs, and you don’t want to generate indexes. Or you can be an admin that needs to get a particular report from the system and you want to play around with the details until you get everything right.

In order to serve those needs, RavenDB 3.5 is going to have a really nice feature, explicit data exploration.

For example, let us say that I want to count the number of unique words in all of my posts, I can do it using the following:

image

Note that the actual query is pretty meaningless, and I’m writing this at 1AM with a baby nearby that makes funny noises, so the Linq statement there works, but can probably be better.
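Since the screenshot isn’t reproduced here, this is a rough idea of what a “count the unique words in all posts” Linq statement could look like. It is only a sketch over an in-memory collection: the `Post` class and the `Exploration` helper are hypothetical, and this is not the query from the screenshot nor the exact syntax the studio expects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical document shape, for illustration only.
public class Post
{
    public string Title;
    public string Body;
}

public static class Exploration
{
    // Splits every post body into words and counts the distinct ones,
    // ignoring case ("Hello" and "hello" are the same word).
    public static int CountUniqueWords(IEnumerable<Post> posts)
    {
        var separators = new[] { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?' };
        return posts
            .SelectMany(p => (p.Body ?? "").Split(separators, StringSplitOptions.RemoveEmptyEntries))
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .Count();
    }
}
```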

The point here is to demo what is going on. We write a simple Linq statement, and can run it against our database, and then gather the results back. It is like having LinqPad directly inside the RavenDB studio. In fact, that is the number one scenario that we envision for this feature, replacing LinqPad usage by having a native capability.

Now, some caveats. As you can see, you can select to limit the query duration as well as the number of documents it will operate on. That gives us a quick way to explore the data without putting too much load on the server. You can even take the output here and throw it directly to Excel. “Sam, can you give me a breakdown of orders this year by month and country? Just email me the Excel spreadsheet”.

Note that this is intended as a user feature; it isn’t something that we provide an API for. It is there for admins or developers that are figuring things out. It is an admin feature, not something that you want to use in production.


Work stealing in the presence of startup / shutdown costs

I mentioned that we have created our own thread pool implementation in RavenDB to handle our specific needs. A common scenario that ended up quite costly for us was the notion of parallelizing similar work.

For example, I have 15,000 documents to index. That means that we need to go over each of the documents and apply the indexing function. That is an embarrassingly parallel task. So that is quite easy. One easy way to do that would be to do something like this:

foreach(var doc in docsToIndex)
	ThreadPool.QueueUserWorkItem(_ => IndexFunc(new[]{doc})); // the callback takes a (here unused) state argument

Of course, that generates 15,000 entries for the thread pool, but that is fine.

Except that there is an issue here, we need to do stuff to the result of the indexing. Namely, write them to the index. That means that even though we can parallelize the work, we still have non trivial amount of startup & shutdown costs. Just running the code like this would actually be much slower than running it in single threaded mode.

So, let us try a slightly better method:

foreach(var partition in docsToIndex.Partition(docsToIndex.Length / Environment.ProcessorCount))
	ThreadPool.QueueUserWorkItem(_ => IndexFunc(partition)); // one work item per partition

If my machine has 8 cores, then this will queue 8 tasks to the thread pool, each indexing just under 2,000 documents. Which is pretty much what we have been doing until now.

Except that this means that we have to incur the startup/shutdown costs a minimum of 8 times.

A better way is here:

// Partition is a helper (not shown) that splits the array into segments of the given size
ConcurrentQueue<ArraySegment<JsonDocument>> partitions = docsToIndex.Partition(docsToIndex.Length / Environment.ProcessorCount);
for(var i = 0; i < Environment.ProcessorCount; i++) 
{
	ThreadPool.QueueUserWorkItem(_ => {
		ArraySegment<JsonDocument> first;
		// the other tasks may have already consumed everything, so exit early
		if(partitions.TryDequeue(out first) == false)
			return;

		IndexFunc(Pull(first, partitions));
	});
}

IEnumerable<JsonDocument> Pull(ArraySegment<JsonDocument> first, ConcurrentQueue<ArraySegment<JsonDocument>> partitions )
{
	while(true)
	{
		for(var i = 0; i < first.Count; i++)
			yield return first.Array[i+first.Offset];

		// done with this partition, try to get another one in the same context
		if(partitions.TryDequeue(out first) == false)
			break;
	}
}

Now something interesting is going to happen, we are scheduling 8 tasks, as before, but instead of allocating 8 static partitions, we are saying that when you start running, you’ll get a partition of the data, which you’ll go ahead and process. When you are done with that, you’ll try to get a new partition, in the same context. So you don’t have to worry about new startup/shutdown costs.

Even more interesting, it is quite possible (and common) for all the work to be done by the time we end up executing some of those tasks. (All the indexing is already done, but we still have a task for it that didn’t get a chance to run.) In that case we exit early, and incur no costs.

The fun thing about this method is what happens under load when you have multiple indexes running. In that case, we’ll be running this for each of the indexes. It is quite likely that each core will be running a single index. Some indexes are going to be faster than the others, and complete first, consuming all the documents that they were told to process. That means that the tasks belonging to those indexes will exit early, freeing those cores to run the code relevant for the slower indexes, which haven’t completed yet.

This gives us dynamic resource allocation. The more costly indexes get to run on more cores, while we don’t have to pay the startup / shutdown costs for the fast indexes.
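The snippets above lean on a Partition helper that isn’t shown. Here is a minimal sketch of what it might look like, assuming it takes an array and a partition size and hands back the segments in a ConcurrentQueue. The name and signature are guessed from the way it is called above; this is not the actual RavenDB implementation.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper, guessed from the call sites in this post.
public static class PartitionExtensions
{
    // Splits the array into consecutive segments of (at most) partitionSize items.
    // No items are copied, each ArraySegment just points into the original array.
    public static ConcurrentQueue<ArraySegment<T>> Partition<T>(this T[] items, int partitionSize)
    {
        if (items == null)
            throw new ArgumentNullException(nameof(items));
        if (partitionSize <= 0)
            partitionSize = 1; // fewer items than cores, fall back to one item per segment

        var queue = new ConcurrentQueue<ArraySegment<T>>();
        for (var offset = 0; offset < items.Length; offset += partitionSize)
        {
            // the last segment may be shorter than the rest
            var count = Math.Min(partitionSize, items.Length - offset);
            queue.Enqueue(new ArraySegment<T>(items, offset, count));
        }
        return queue;
    }
}
```

One thing to play with is the partition size. With partitions the size of `docsToIndex.Length / Environment.ProcessorCount`, each task usually gets exactly one partition. Smaller partitions give the tasks that finish early more leftover work to pick up.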

Fine grained work control

With RavenDB 3.5, we are focusing on performance as one of the key features. I’ve already spoken at length about the kind of changes that we had made to improve performance. A few percentage points here and there end up being quite significant when you add them all together.

But just sanding over the rough edges isn’t quite enough for us. We want to have a major impact, not just an avalanche of small improvements. In order to handle that, we needed to be much more aware of how we are making use of resources in the system.

The result of several months of work is now ready to enter into performance testing, and I’m quite excited about it. But before I show you the results, what is it? Well, RavenDB does quite a lot in the background, to avoid holding up a request thread when you are calling RavenDB. This means that we have a lot of background work: indexing, map/reduce, etc.

We have been using the default .NET thread pool for a long time to do that, and it has served us very well. But it is a generic construct, without awareness of the unique needs that RavenDB has. Therefore, we worked for quite some time to create our own thread pool that matches what we do.

The major changes with the RavenThreadPool (RTP from now) are:

  • There is a fixed (and dedicated) number of threads that will do the work, sharing (and stealing) work among themselves.
  • Indexing tasks are continuous and shared, so a big indexing job will spread across all threads, but with a preference for locality of work if we have a lot of stuff to parallelize.
  • A slow index doesn’t stop us from working on other indexes, we’ll let it process on its own, and let the other indexes run forward without it.
  • Dynamic adjusting of the amount of work that is allowed for the indexes means that under load, we can dynamically and rapidly reduce the amount of work we are doing to allow more resources for processing requests.

There is other stuff, but it is mostly of interest to the people who work on RavenDB, not to those who use it.

And the results, they are pretty good. Here is the before and after sample.

image

Note that we have a mix here of various types of indexes. The X axis is time, and the Y axis is the number of documents indexed.

As you can see, in the before (without RTP), we are processing all indexes roughly on the same course, with a pretty flat growth over time.  However, with RTP, the situation is different. You can see that very quickly the fast indexes are starting to outpace both their version without RTP and the slower indexes.

That, in turn, means that they complete much faster. In the case of the Simple Map index, it completes indexing roughly 50% faster than without RTP. But even more interesting is what happens globally. Because we are able to complete the fast indexes quickly, we can process the slow index (HeavyMapReduce) with more resources. So even this slowpoke completes about 15% faster with RTP than without it.

We are still running tests, but even so, this is quite exciting.


Project Tamar

I’m happy to announce that despite the extreme inefficiencies involved in the process, the performance issues and what are sure to be multiple stop ship bugs in the way the release process is handled, we have successfully completed Project Tamar.

The result showed up as a 2.852 Kg bundle, and is currently sleeping peacefully. I understand that this is a temporary condition, and lack of sleep shall ensue shortly. I’ll probably regret saying that, but I can’t wait.

 

This little girl is named Tamar and to celebrate, I’m offering a 28.52% discount for all our products. This includes:

Just use coupon code: Tamar

This will be valid for the next few days.

RavenDB 3.0–New stable release

We have a new build out, build 3660, which contains a lot of fixes and some really nice stuff for RavenDB users.

You can get it here.

The new stuff is quite cool.

Global

  • Changed default data location (Now will go to C:\Raven).
  • Various performance optimizations across both server and client.

RavenFS

  • Versioning support for RavenFS
  • Better metadata queries for RavenFS
  • Smuggler support for RavenFS

RavenDB

  • Client side indexes will automatically specify sort orders
  • Prevented high CPU and excessive GC runs under low memory conditions
  • Avoid leaking resources when failing to create a database.
  • Faster JSON serialization and deserialization
  • Allow to lock and set index priorities via the client API
  • Added backoff strategy for failing periodic exports
  • Recognize Windows users with admin rights to system database as server admins
  • Facets can now have a very large number of facets
  • Replication will now replicate indexes between nodes (and gossip about usage)

We are now aiming at RavenDB 3.5, which should result in some really great stuff showing up.


New RavenDB 3.0 RC is out

RavenDB build 3651 is out, and it is marked as RC for stable release. You can download it here, and we intend to release it next week as a stable release.

It is currently going through additional QA processes, and it is the build that is currently running all our production systems (including this site). Please take it for a spin.


Strangest support calls for RavenDB

We have a support hotline for RavenDB, and usually we get people that give us “good” problems to solve. And then we have the people who… don’t.

The following are a few of the strangest issues from the past month or so.

  • OutOfMemoryException is thrown when running RavenDB in 32 bits mode, and documents of 50-70MB in size are used.

Solution – When running in 32 bits mode, RavenDB only has 2GB of virtual memory to work with, and that is really not enough to do much. There is no reason today for any server app to run in 32 bits mode. Also, a 70MB document?! Seriously!

  • Very slow startup time for RavenDB when the number of indexes approaches 20,000.

Solution – That isn’t a typo, honest. We had a customer that sent us a database that had 19,273 indexes in it. When they restarted the database, we had to load all of those indexes, and that took a… while. And no, those weren’t dynamic indexes, they were static indexes that (I hope, at least) were probably generated by a tool.

  • Index creation can take a long time when the number of map indexes in a multi map index is higher than 150.

Solution – Are you trying to be funny?! What is it that you are doing?

  • Index creation can take a long time when the size of the index definition is greater than 16KB.

Solution – That is a single index definition that goes on for roughly 3,000 lines. You are doing things wrong.

 

What is the worst thing that you have seen?


Fixing the index, solutions

In my previous post, I showed a pretty trivial index and asked how to efficiently update it. Efficient being time & memory wise.

The easiest approach to do that is by using a reverse lookup option. Of course, that means that we actually need to store about twice as much data as before.

Given the following documents:

  • users/21 – Oren Eini
  • users/42 – Hibernating Rhinos
  • users/13 – Arava Eini

Previously, we had:

Term          Documents
Oren          users/21
Eini          users/21, users/13
Hibernating   users/42
Rhinos        users/42
Arava         users/13

And with the reverse lookup, we have:

Term          Documents               Document    Terms
Oren          users/21                users/21    Oren, Eini
Eini          users/21, users/13      users/42    Hibernating, Rhinos
Hibernating   users/42                users/13    Arava, Eini
Rhinos        users/42
Arava         users/13

And each update to the index would first do a lookup for the document id, then remove the document id from all the matching terms.
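A minimal sketch of the reverse lookup approach could look like the following, assuming the same Index / Query shape as the Indexer class from the interview question post. The class and field names are made up for illustration, and this is just one way to do it, not the reference solution.

```csharp
using System;
using System.Collections.Generic;

// Illustration only: an indexer that also remembers the terms of each document.
public class ReverseLookupIndexer
{
    // term -> ids of the documents containing it
    private readonly Dictionary<string, HashSet<string>> terms =
        new Dictionary<string, HashSet<string>>(StringComparer.OrdinalIgnoreCase);

    // document id -> the terms it was indexed under (the reverse lookup)
    private readonly Dictionary<string, List<string>> termsByDoc =
        new Dictionary<string, List<string>>();

    public void Index(string docId, string text)
    {
        Delete(docId); // an update is a delete of the old entries followed by an add

        var words = text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        termsByDoc[docId] = new List<string>(words);
        foreach (var term in words)
        {
            HashSet<string> docs;
            if (terms.TryGetValue(term, out docs) == false)
                terms[term] = docs = new HashSet<string>();
            docs.Add(docId);
        }
    }

    public void Delete(string docId)
    {
        List<string> oldTerms;
        if (termsByDoc.TryGetValue(docId, out oldTerms) == false)
            return; // never indexed, nothing to clean up

        foreach (var term in oldTerms)
        {
            HashSet<string> docs;
            if (terms.TryGetValue(term, out docs) == false)
                continue;
            docs.Remove(docId);
            if (docs.Count == 0)
                terms.Remove(term); // don't keep empty terms around
        }
        termsByDoc.Remove(docId);
    }

    public List<string> Query(string term)
    {
        HashSet<string> docs;
        return terms.TryGetValue(term, out docs)
            ? new List<string>(docs)
            : new List<string>();
    }
}
```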

The downside of that is that we take about twice as much room. The upside is that all the work is done during indexing time, and space is pretty cheap.

It isn’t that cheap, though. So we want to try something better.

Another alternative is to introduce a level of indirection, like so:

Term          Documents     Num   Id
Oren          1             1     users/21
Eini          1, 3          2     users/42
Hibernating   2             3     users/13
Rhinos        2
Arava         3

Now, let us say that we want to update users/13 to be Phoebe Eini, we will end up with:

Term          Documents     Num   Id
Oren          1             1     users/21
Eini          1, 3, 4       2     users/42
Hibernating   2             4     users/13
Rhinos        2
Arava         3
Phoebe        4

We removed the 3rd document, and didn’t touch the terms except to add to them.

That gives us a very fast way to add to the system, and if someone searches for Arava, we will see that the document number it points to (3) no longer exists, so we’ll return no results for the query.

Of course, this means that we have to deal with garbage in the index, and have some way to clean it up periodically. It also means that we don’t have a way to really support Update, instead we have just Add and Delete operations.
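Here is a minimal sketch of the indirection approach, again assuming the Index / Query shape of the Indexer class from the interview question post. The names are made up for illustration, and the periodic cleanup of dead numbers is left out on purpose.

```csharp
using System;
using System.Collections.Generic;

// Illustration only: terms point to internal numbers, numbers point to document ids.
public class IndirectIndexer
{
    // term -> internal document numbers (may contain numbers that are no longer alive)
    private readonly Dictionary<string, List<int>> terms =
        new Dictionary<string, List<int>>(StringComparer.OrdinalIgnoreCase);

    private readonly Dictionary<int, string> idByNum = new Dictionary<int, string>();
    private readonly Dictionary<string, int> numById = new Dictionary<string, int>();
    private int lastNum;

    public void Index(string docId, string text)
    {
        Delete(docId); // retire the old number, the terms themselves are not touched

        var num = ++lastNum;
        idByNum[num] = docId;
        numById[docId] = num;

        foreach (var term in text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
        {
            List<int> nums;
            if (terms.TryGetValue(term, out nums) == false)
                terms[term] = nums = new List<int>();
            // numbers only grow, so checking the last entry is enough to skip a repeated word
            if (nums.Count == 0 || nums[nums.Count - 1] != num)
                nums.Add(num);
        }
    }

    public void Delete(string docId)
    {
        int num;
        if (numById.TryGetValue(docId, out num) == false)
            return;
        numById.Remove(docId);
        idByNum.Remove(num); // the number stays behind in the terms as garbage
    }

    public List<string> Query(string term)
    {
        var results = new List<string>();
        List<int> nums;
        if (terms.TryGetValue(term, out nums) == false)
            return results;

        foreach (var num in nums)
        {
            string docId;
            if (idByNum.TryGetValue(num, out docId)) // skip numbers that were retired
                results.Add(docId);
        }
        return results;
    }
}
```

Running the example from the tables above through this sketch (index the three users, then index users/13 again as Phoebe Eini), a query for Arava returns nothing, while Eini still returns users/21 and users/13.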


Interview question: fix the index

This is something that goes into the “what to ask a candidate”.

Given the following class:

public class Indexer
{
    private Dictionary<string, List<string>> terms = 
        new Dictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);

    public void Index(string docId, string text)
    {
        var words = text.Split();
        foreach (var term in words)
        {
            List<string> val;
            if (terms.TryGetValue(term, out val) == false)
            {
                val = new List<string>();
                terms[term] = val;
            }
            val.Add(docId);
        }
    }

    public List<string> Query(string term)
    {
        List<string> val;
        terms.TryGetValue(term, out val);
        return val ?? new List<string>();
    }
}

This class has the following tests:

public class IndexTests
{
    [Fact]
    public void CanIndexAndQuery()
    {
        var index = new Indexer();
        index.Index("users/1", "Oren Eini");
        index.Index("users/2", "Hibernating Rhinos");

        Assert.Contains("users/1", index.Query("eini"));
        Assert.Contains("users/2", index.Query("rhinos"));
    }

    [Fact]
    public void CanUpdate()
    {
        var index = new Indexer();
        index.Index("users/1", "Oren Eini");
        //updating
        index.Index("users/1", "Ayende Rahien");

        Assert.Contains("users/1", index.Query("Rahien"));
        Assert.Empty(index.Query("eini"));
    }
}

The first test passes, but the second fails.

The task is to get the CanUpdate test to pass, while keeping memory utilization and CPU costs as small as possible. You can change the internal implementation of the Indexer as you see fit.

After CanUpdate is passing, implement a Delete(string docId) method.


Timeouts, TCP and streaming operations

We got a bug report in the RavenDB mailing list that was interesting to figure out.  The code in question was:

foreach(var product in GetAllProducts(session)) // GetAllProducts is implemented using streaming
{
  ++i;
  if (i > 1000)
  {
    i = 0;
    Thread.Sleep(1000);
  }
}

This code would cause a timeout error to occur after a while. The question is why? We can assume that this code is running in a console application, and it can take as long as it wants to process things.

And the server is not impacted by what the client is doing, so why do we have a timeout error here? The quick answer is that we are filling up the buffers.

GetAllProducts is using the RavenDB streaming API, which pushes the results of the query to the client as soon as we have anything. That lets us parallelize work on both server and client, and avoid having to hold everything in memory.

However, if the client isn’t processing things fast enough, we run into an interesting problem. The server is sending the data to the client over TCP. The client machine will get the results, buffer them and hand them to the client application. The client application will read them from the TCP buffers, then do some work (in this case, just sleeping). Because the rate at which the client is processing items is much lower than the rate at which we are sending them, the TCP buffers become full.

At this point, the client machine is going to start dropping TCP packets. It doesn’t have any more room to put the data in, and the server will send it again, anyway. And that is what the server is doing, on the assumption that there was packet loss on the network. However, that will only hold up for a while, because if the client doesn’t recover quickly, the server will decide that it is down, and close the TCP connection.

At this point, there isn’t any more data from the server, so the client will catch up with the buffered data, and then wait for the server to send more data. That isn’t going to happen, because the server already considers the connection lost. And eventually the client will time out with an error.

A streaming operation requires us to process the results quickly enough to not jam the network.

RavenDB also has the notion of subscriptions. With those, we require explicit confirmation from the client before sending the next batch, so a slow client isn’t going to cause issues.
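One way to keep the network moving when the per-item work is slow is to separate reading from processing. The following is a minimal sketch, not the RavenDB client's behavior: StreamDrainer and DrainAndProcess are hypothetical names, and the stream is modeled as a plain IEnumerable<T>. A background task drains the stream as fast as it arrives into an unbounded in-memory queue, while the caller's slow work runs against that queue. The trade-off is that we give up the memory savings of streaming, since everything the server sends may end up buffered on the client.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class StreamDrainer
{
    // Reads 'stream' at full speed on a background task and runs 'slowWork'
    // on the calling thread. Returns the number of items processed.
    public static int DrainAndProcess<T>(IEnumerable<T> stream, Action<T> slowWork)
    {
        // Unbounded on purpose: a bounded queue would push the back pressure
        // right back onto the TCP connection.
        var buffered = new BlockingCollection<T>();

        var reader = Task.Run(() =>
        {
            try
            {
                foreach (var item in stream)
                    buffered.Add(item);
            }
            finally
            {
                // Always signal completion so the consumer loop can end.
                buffered.CompleteAdding();
            }
        });

        var processed = 0;
        foreach (var item in buffered.GetConsumingEnumerable())
        {
            slowWork(item);
            processed++;
        }

        reader.Wait(); // surfaces any exception thrown while reading the stream
        return processed;
    }
}
```

With something like this, the loop from the bug report would become a call such as StreamDrainer.DrainAndProcess(GetAllProducts(session), product => { /* slow work */ }).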


RavenDB in Siberia

A good chunk of the RavenDB Core Team is going to be at CodeFest in Novosibirsk this weekend, including yours truly.

We are going to be handing out a lot of cool stuff, and we got some really nice things to talk about. And yes, I’ll keep the suspense until I get to meet people face to face.


Merge related entities using Multi Map/Reduce

A question came up in the mailing list regarding searching across related entities. In particular, the scenario is the notion of a player and characters in an MMORPG.

Here is what a Player document looks like:

{
  "Id": "players/bella@dona.self",
  "Name": "Bella Dona",
  "Billing": [ { ... }, { ... }],
  "Adult": false,
  "LastLogin": "2015-03-11"
}

And a player has multiple character documents:

{
  "Id": "characters/1234",
  "Name": "Black Dona",
  "Player": "players/bella@dona.self",
  "Race": "DarkElf",
  "Level": 24,
  "XP": 283831,
  "HP": 438,
  "Skills": [ { ... } , { ... } ]
}
{
  "Id": "characters/1321",
  "Name": "Blue Bell",
  "Player": "players/bella@dona.self",
  "Race": "Halfling",
  "Level": 2,
  "XP": 2831,
  "HP": 18,
  "Skills": [ { ... } , { ... } ]
}
{
  "Id": "characters/1143",
  "Name": "Brown Barber",
  "Player": "players/bella@dona.self",
  "Race": "WoodElf",
  "Level": 44,
  "XP": 983831,
  "HP": 718,
  "Skills": [ { ... } , { ... } ]
}

And what we want is an output like this:

{
    "Id" : "players/bella@dona.self",
    "Adult": false,
    "Characters" : [
        { "Id": "characters/1234",  "Name": "Black Dona" },
        { "Id": "characters/1321",  "Name": "Blue Bell" },
        { "Id": "characters/1143",  "Name": "Brown Barber" }
    ]
}

Now, a really easy way to do that would be to issue two queries. One to find the player, and another to find its characters. That is actually the much preferred method to do this. But let us say that we need to do something that uses both document types.

Give me all the players who aren’t adults that have a character over level 40, for example. In order to do that, we are going to use a multi map/reduce index to merge the two together. Here is what it is going to look like:

// map - Players

from player in docs.Players
select new 
{
  Player = player.Id,
  Adult = player.Adult,
  Characters = new object[0]
}

// map - Characters

from character in docs.Characters
select new
{
   character.Player,
   Adult = false,
   Characters = new [] 
   { 
     new { character.Id, character.Name }
   }
}

// reduce

from result in results
group result by result.Player into g
select new
{
   Player = g.Key,
   Adult = g.Any(x=>x.Adult),
   Characters = g.SelectMany(x=>x.Characters)
}

This gives you all the details, in a single place. And you can start working on queries from there.
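To see what the index is doing without a server, here is a rough in-memory analogy using LINQ to Objects. This is only a sketch of the merge logic, not how RavenDB executes indexes. The Player, Character, CharacterRef and PlayerSummary classes and the MultiMapReduceSketch name are hypothetical and exist only for this illustration. Both maps project into the same shape, the outputs are concatenated, and the reduce groups them by player.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types standing in for the documents and the index output.
public class Player { public string Id; public bool Adult; }
public class Character { public string Id; public string Name; public string Player; }
public class CharacterRef { public string Id; public string Name; }
public class PlayerSummary
{
    public string Player;
    public bool Adult;
    public List<CharacterRef> Characters;
}

public static class MultiMapReduceSketch
{
    public static List<PlayerSummary> Run(IEnumerable<Player> players, IEnumerable<Character> characters)
    {
        // map - Players: no characters known at this point
        var fromPlayers = players.Select(p => new PlayerSummary
        {
            Player = p.Id,
            Adult = p.Adult,
            Characters = new List<CharacterRef>()
        });

        // map - Characters: Adult is unknown here, so emit false and let the reduce decide
        var fromCharacters = characters.Select(c => new PlayerSummary
        {
            Player = c.Player,
            Adult = false,
            Characters = new List<CharacterRef> { new CharacterRef { Id = c.Id, Name = c.Name } }
        });

        // reduce: both map outputs share one shape, so they can be grouped together
        return fromPlayers.Concat(fromCharacters)
            .GroupBy(r => r.Player)
            .Select(g => new PlayerSummary
            {
                Player = g.Key,
                Adult = g.Any(x => x.Adult),
                Characters = g.SelectMany(x => x.Characters).ToList()
            })
            .ToList();
    }
}
```

The important design point is that every map and the reduce must emit exactly the same shape, which is why the Players map emits an empty Characters array and the Characters map emits Adult = false.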


Taking full dumps for big IIS apps

If your application is running on IIS, you are getting quite a lot for free. To start with, monitoring and management tools are right there out of the box. You are also getting some… other effects.

In particular, we had RavenDB running inside IIS that exhibited a set of performance problems in a couple of nodes (and just on those nodes). We suspected that this might be related to memory usage, and we wanted to take a full process dump so we could analyze this offline.

Whenever we tried doing that, however, the process would just restart. The problem was that to reproduce this we had to wait for a particular load pattern to happen after the database was live for about 24 hours. So taking the dump at the right time was crucial. Initially we thought we used the wrong command, or something like that. The maddening thing was, when we tried it on the same machine, using the same command, without the performance issue present, it just worked (and told us nothing).

Eventually we figured out that the problem was in IIS. Or, to be rather more exact, IIS was doing its job.

When the performance problem happened, the memory usage was high. We then needed to take a full process dump, which meant that we had to write a lot. IIS didn’t hear from the worker process during that time (since it was currently being dumped), and it killed it, creating a new one.

The solution was to ask IIS not to do that. The configuration is available in the advanced settings for the application pool. Note that just changing that would force IIS to restart the process, which was another major annoyance.

image


QCon London and In The Brain talk – Performance Optimizations in the wild

The RavenDB Core Team is going to be at the QCon London conference this week, so if you are there, stop by our booth. We have a lot of cool swag to give out and some really cool demos.

In addition to that, on Thursday I’m going to be giving an In The Brain talk about Performance Optimizations in the Wild, talking about the kind of performance work we have been doing recently.

The results of this work can be seen in the following graph:

image

Come to the talk to hear all about the details and what we did to get things working.


That ain’t going to take you anywhere

As part of our usual work routine, we field customer questions and inquiries. A pretty common one is to take a look at their system to make sure that they are making good use of RavenDB.

Usually, this involves going over the code with the team, and making some specific recommendations. Merge those indexes, re-model this bit to allow for this widget to operate more cleanly, etc.

Recently we had such a review in which what I ended up saying is: “Buy a bigger server, hope this works, and rewrite this from scratch as fast as possible”.

The really annoying thing is that someone who was quite talented has obviously spent a lot of time doing a lot of really complex things to end up where they are now. It strongly reminded me of this image:

image

At this point, you can have the best horse in the world, but the only thing that will happen if it runs is that you are going to be messed up.

What was so bad? Well, to start with, the application was designed to work with a dynamic data model. That is probably also why RavenDB was selected, since that is a great choice for dynamic data.

Then the designers sat down and created the following system of classes:

public class Table
{
	public Guid TableId {get;set;}
	public List<FieldInformation> Fields {get;set;}
	public List<Reference> References {get;set;}
	public List<Constraint> Constraints {get;set;}
}

public class FieldInformation
{
	public Guid FieldId {get;set;}
	public string Name {get;set;}
	public string Type {get;set;}
	public bool Required {get;set;}
}

public class Reference
{
	public Guid ReferenceId {get;set;}
	public string Field {get;set;}
	public Guid ReferencedTableId {get;set;}
}

public class Instance
{
	public Guid InstanceId {get;set;}
	public Guid TableId {get;set;}
	public List<Guid> References {get;set;}
	public List<FieldValue> Values {get;set;}
}

public class FieldValue
{
	public Guid FieldId {get;set;}
	public string Value {get;set;}
}

I’ll let you draw your own conclusions about what the documents looked like, or just how many calls you needed to load a single entity instance.

For that matter, it wasn’t possible to query such a system directly, obviously, so they created a set of multi-map/reduce indexes that took this data and translated that into something resembling a real entity, then queried that.

But the number of documents, indexes and the sheer travesty going on meant that actually:

  • Saving something to RavenDB took a long time.
  • Querying was really complex.
  • The number of indexes was high.
  • Just figuring out what is going on in the system was nigh impossible without a map, a guide and a lot of good luck.

Just to cap things off, this is a .NET project, and in order to connect to RavenDB they used direct REST calls using HttpClient, blithely ignoring all the man-decades that were spent on creating a good client side experience and integration. For example, they made no use of ETags or conditional requests (If-None-Match, If-Modified-Since), so a lot of the things that RavenDB can do (even under such… hardship) to make things better weren’t supported, because the client code won’t cooperate.

I don’t generally say things like “throw this all away”, but there is no mid or long term approach that could possibly work here.

Lambda methods and implicit context

The C# compiler is lazy, which is usually a very good thing, but that can also give you some issues. We recently tracked down a memory usage issue to code that looked roughly like this.

var stats = new PerformatStats
{
    Size = largeData.Length
};
stats.OnCompletion += () => this.RecordCompletion(stats);

Write(stats, o =>
{
    var sp = new Stopwatch();
    foreach (var item in largeData)
    {
        sp.Restart();
        // do something with this
        stats.RecordOperationDuration(sp);
    }
});

On the surface, this looks good. We are only using largeData for a short while, right?

But behind the scenes, something evil lurks. Here is what this is actually translated to by the compiler:

__DisplayClass3 cDisplayClass3_1 = new __DisplayClass3
{
    __this = this,
    largeData = largeData
};
cDisplayClass3_1.stats = new PerformatStats { Size = cDisplayClass3_1.largeData.Length };

cDisplayClass3_1.stats.OnCompletion += new Action(cDisplayClass3_1, __LambdaMethod1__);

Write(cDisplayClass3_1.stats, new Action(cDisplayClass3_1, __LambdaMethod2__));

You need to pay special attention to what is going on. We need to maintain the local state of the variables, so the compiler lifts the local variables into an object (called __DisplayClass3).

Creating spurious objects is something that we want to avoid, so the C# compiler says: “Oh, I’ve two lambdas in this call that need to get access to the local variables. Instead of creating two objects, I can create just a single one, and share it among both calls, thereby saving some space”.

Unfortunately for us, there is a slight issue here. The lifetime of the stats object is pretty long (we use it to report stats), and it holds a reference to the completion delegate (we use that to report on stuff later on). Because the completion delegate holds the same lifted variables object, and because that object holds the large data object, we ended up holding a lot of stuff in memory far beyond the time it was useful.

The annoying thing is that it was pretty hard to figure out, because we were looking at the source code, and the lambda that we know is likely to be long running doesn’t look like it is going to hold a reference to the largeData object.

Ouch.
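One way to sidestep the shared closure is to create the long lived lambda in its own method, so the compiler generates a separate closure object for it. The following is a minimal sketch under that assumption, not the actual RavenDB fix: PerformanceStats, StatsWriter, SubscribeCompletion and the Checksum field are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical stand-in for the stats type in the snippet above.
public class PerformanceStats
{
    public int Size;
    public long Checksum;
    public event Action OnCompletion;

    public void Complete()
    {
        var handler = OnCompletion;
        if (handler != null) handler();
    }
}

public class StatsWriter
{
    public int CompletedCount;

    public PerformanceStats Process(byte[] largeData)
    {
        var stats = new PerformanceStats { Size = largeData.Length };

        // The long lived lambda is created in another method, so its closure
        // captures only 'this' and 'stats', never 'largeData'.
        SubscribeCompletion(stats);

        // This lambda still captures 'largeData' and 'stats', but the only
        // reference to its closure is the delegate handed to Write.
        Write(stats, () =>
        {
            foreach (var item in largeData)
                stats.Checksum += item;
        });

        return stats;
    }

    private void SubscribeCompletion(PerformanceStats stats)
    {
        stats.OnCompletion += () => RecordCompletion(stats);
    }

    private void RecordCompletion(PerformanceStats stats)
    {
        CompletedCount++;
    }

    private static void Write(PerformanceStats stats, Action work)
    {
        work();
    }
}
```

In this shape, once Write returns nothing long lived refers to the closure that holds largeData, even though stats and its completion delegate stay around.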

Linux, Debts and Out Of Memory Killer

Imagine that you go to the bank and ask for a $100,000 mortgage. The nice guy at the bank agrees to lend you the money, and since you need to pay that in 5 installments, you take $15,000 to the contractor, and leave the rest in the bank until it is needed. The bank is doing brisk business, and promises a lot of customers that they can get their mortgage from the bank. Since most of the mortgages are also taken in installments, the bank never actually has enough money to hand over to all the borrowers. But it makes do.

Until one dark day when you come to the bank and ask for the rest of the money, because it is time to install the kitchen cabinets, and you need to pay for that. The nice man at the bank tells you to wait a bit, and goes to see if they have any money. At this point, it would be embarrassing to tell you that they don’t have any money to give you, because they over committed themselves. So the nice man from the bank murders you and buries your body in the desert, to avoid you complaining that you didn’t get the money that you were promised. Actually, the nice man might go ahead and kill someone else (robbing them in the process), and then give you their money. You go home happy to your blood stained kitchen cabinets.

That is how memory management works in Linux.

After this dramatic opening, let us get down to what is really going on. Linux has a major problem. Its process model means that it is stuck up a tree and the only way down is via free fall. Whenever a process wants to create another process, the standard method in Linux is to call fork() and then call execv() to execute the new binary. The problem here is what fork() does. It needs to copy the entire process state to the new process. That includes all memory, handles, registers, etc.

Let us assume that we have a process that allocated 1GB of memory for reading and writing, and then called fork(). The way things are set up, it is pretty cheap to create the new process, all we need to do is duplicate the kernel data structures and we are done. However, what happens to the memory that the process allocated? The fork() call requires that both processes will have access to that memory, and also that both of them may modify it. That means that we have a copy on write situation. Whenever one of the processes modifies the memory, it forces the OS to copy that piece of memory to another physical memory location and remap the virtual addresses.

This allows the developer to do some really cool stuff. Redis implemented its backup strategy via the fork() call. By forking and then dumping the in memory process state to disk, it can get a consistent snapshot of the system with almost no code. It is the OS that is responsible for maintaining that invariant.

Speaking of invariants, it also means that there is absolutely no way that Linux can manage memory properly. If we have 2 GB of RAM on the machine, and we have a 1GB process that fork()-ed, what is going to happen? Well, it was promised 1 GB of RAM, and it got that. And it was also promised by fork() that both processes will be able to modify the full 1GB of RAM. If we also have some other processes taking memory (and assuming no swap for the moment), that pretty much means that someone is going to end up holding the dirty end of the stick.

Now, Linux has a configuration option that would prevent this: vm.overcommit_memory = 2, together with the over commit ratio. (That isn’t really important, I’m including it here for the nitpickers. And yes, I’m aware that you can set oom_adj = -17 to protect yourself from this issue, that is not the point.) This setting tells Linux that it shouldn’t over commit. In such cases, the fork() call would fail, and you’ll be left with what is effectively a crippled system. So, we have the potential for a broken invariant. What is going to happen now?

Well, Linux promised you memory, and after exhausting all of the physical memory, it will start paging to the swap file. But that can be exhausted too. That is when the Out Of Memory Killer gets to play, and it takes an axe and starts choosing a likely candidate to be mercilessly murdered. The “nice” thing about this is that there is no control over that, and you might be a perfectly well behaved process that the OOM killer just doesn’t like this Monday, so buh-bye!

Looking around, it seems that we aren’t the only ones who have run head first into this issue. The Oracle recommendation is to set things up to panic and reboot the entire machine when this happens, and that seems… unproductive.

The problem is that as a database, we aren’t really in control of how much we allocate, and we rely on the system to tell us when we do too much. Linux has no facility to do things like warn applications that memory is low, or even let us know by refusing to allocate more memory. Both are things that we already support, and would be very helpful.

That is quite annoying.


Picture of the day, Rhino & Raven

We have an amateur photographer in the office, who likes to arrange the rhinos in the office (now exceeding 100, I think) in various ways.

I really like this picture.


Buffer Managers, production code and alternative implementations

We are porting RavenDB to Linux, and as such, we run into a lot of… interesting issues. Today we ran into a really annoying one.

We make use of the BufferManager class inside RavenDB to reduce memory allocations. On the .NET side of things, everything works just fine, and we never really had any issues with it.

On the Mono side of things, we started getting all sorts of weird errors. From ArgumentOutOfRangeException to NullReferenceException to just plain weird stuff. That was the time to dig in and look into what is going on.

On the .NET side of things, the BufferManager implementation chooses between large (more than 85 KB) and small buffers. For large buffers, there is a single large pool that is shared among all the users of the pool. For small buffers, the BufferManager uses a pool per active thread as well as a global pool, etc. In fact, looking at the code we see that it is really nice, and a lot of effort has been made to harden it and make it work nicely for many scenarios.

The Mono implementation, on the other hand, decides to blithely discard the API contract by ignoring the maximum buffer pool size. It seems this is because “no user code is designed to cope with this”. Considering the fact that RavenDB is certainly dealing with that, I’m somewhat insulted, but it seems par for the course for Linux, where “memory is infinite until we kill you”* is the way to go.

But what is far worse is that this class is absolutely not thread safe. That was a lot of fun to discover. Considering that this piece of code is pretty central for the entire WCF stack, I’m not really sure how that worked. We ended up writing our own BufferManager impl for Mono, to avoid those issues.
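To give a feel for the two properties that matter here (thread safety and honoring a maximum pool size), here is a minimal sketch of a fixed size buffer pool. It is not the implementation we wrote for Mono, and it does not mirror the real BufferManager API: SimpleBufferPool, Take, Return and PooledBytes are hypothetical names, and the pool handles only a single buffer size.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical single-size pool, for illustration only.
public sealed class SimpleBufferPool
{
    private readonly ConcurrentBag<byte[]> _buffers = new ConcurrentBag<byte[]>();
    private readonly int _bufferSize;
    private readonly long _maxPoolBytes;
    private long _pooledBytes;

    public SimpleBufferPool(int bufferSize, long maxPoolBytes)
    {
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException("bufferSize");
        if (maxPoolBytes < 0) throw new ArgumentOutOfRangeException("maxPoolBytes");
        _bufferSize = bufferSize;
        _maxPoolBytes = maxPoolBytes;
    }

    // Upper bound on the bytes currently held by the pool.
    public long PooledBytes
    {
        get { return Interlocked.Read(ref _pooledBytes); }
    }

    public byte[] Take()
    {
        byte[] buffer;
        if (_buffers.TryTake(out buffer))
        {
            Interlocked.Add(ref _pooledBytes, -_bufferSize);
            return buffer;
        }
        // Pool is empty: allocate a fresh buffer.
        return new byte[_bufferSize];
    }

    public void Return(byte[] buffer)
    {
        if (buffer == null) throw new ArgumentNullException("buffer");
        if (buffer.Length != _bufferSize) throw new ArgumentException("Wrong buffer size", "buffer");

        // Reserve room first, and back out if that would exceed the cap.
        if (Interlocked.Add(ref _pooledBytes, _bufferSize) > _maxPoolBytes)
        {
            Interlocked.Add(ref _pooledBytes, -_bufferSize);
            return; // over the limit: drop the buffer and let the GC reclaim it
        }
        _buffers.Add(buffer);
    }
}
```

Reserving before adding, and decrementing only after a successful take, means the counter never undercounts what is in the bag, so the pool cannot grow past the configured maximum even under concurrent use.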

* Yes, somewhat bitter here, I’ll admit. The next post will discuss this in detail.

Long running async and memory fragmentation

We have been working on performance a lot lately, but performance isn’t just an issue of how fast you can do something, it is also an issue of how many resources you use while doing it. One of the things we noticed was that we were using more memory than we would like, even after we had freed the memory we were using. Digging into the memory usage, we found that the problem was that we were suffering from fragmentation inside the managed heap.

More to the point, this isn’t large object heap fragmentation, but actually fragmentation in the standard heap. The underlying reason is that we are issuing a lot of outstanding async I/O requests, especially to serve things like the Changes() API, waiting for incoming HTTP requests, etc.

Here is what this looks like inside dotProfiler.

image

As you can see, we are actually using almost no memory, but heap fragmentation is killing us in terms of memory usage.

Looking deeper, we see:

image

We suspect that the issue is that we have pinned instances that we sent to async I/O, and that matches what we have found elsewhere about this issue, but we aren’t really sure how to deal with it.
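One commonly suggested mitigation for pinning related fragmentation is to stop pinning many small, short lived arrays and instead hand out slices of a single large array that is allocated once up front. The following is a minimal sketch of that idea, untested against this particular workload: SlabBufferAllocator, TryRent and Return are hypothetical names. A large enough slab lives in the large object heap, so pinning parts of it for I/O should not leave holes in the regular heap.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical allocator, for illustration only.
public sealed class SlabBufferAllocator
{
    private readonly byte[] _slab;
    private readonly int _segmentSize;
    private readonly ConcurrentStack<int> _freeOffsets = new ConcurrentStack<int>();

    public SlabBufferAllocator(int segmentSize, int segmentCount)
    {
        if (segmentSize <= 0) throw new ArgumentOutOfRangeException("segmentSize");
        if (segmentCount <= 0) throw new ArgumentOutOfRangeException("segmentCount");

        _segmentSize = segmentSize;
        // One allocation for every I/O buffer we will ever hand out.
        _slab = new byte[checked(segmentSize * segmentCount)];

        for (var i = segmentCount - 1; i >= 0; i--)
            _freeOffsets.Push(i * segmentSize);
    }

    // Returns false when every segment is in use; the caller decides whether to wait or fail.
    public bool TryRent(out ArraySegment<byte> segment)
    {
        int offset;
        if (_freeOffsets.TryPop(out offset))
        {
            segment = new ArraySegment<byte>(_slab, offset, _segmentSize);
            return true;
        }
        segment = default(ArraySegment<byte>);
        return false;
    }

    public void Return(ArraySegment<byte> segment)
    {
        if (!ReferenceEquals(segment.Array, _slab))
            throw new ArgumentException("Segment does not belong to this allocator", "segment");
        _freeOffsets.Push(segment.Offset);
    }
}
```

The rented ArraySegment<byte> can be passed to the async I/O APIs that accept an array, offset and count, and returned once the operation completes.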

Ideas are more than welcome.