Ayende @ Rahien


Graphs in RavenDB: Graph modeling vs. document modeling

time to read 3 min | 445 words


One of the most important design decisions we made with RavenDB is not forcing users to explicitly create edges between documents. Instead, the edges are actually just normal properties on the documents and can be used as-is. This means that pretty much any existing RavenDB database can immediately start using graph operations; you don’t need to do anything.

The image below shows an order, using RavenDB’s sample Northwind dataset. The highlighted portions mark the edges from this document. You can use these to traverse the graph by hopping from document to document.

[Image: an order document from the Northwind sample, with the reference properties that act as edges highlighted]
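To make that concrete, here is a rough C# sketch of the shape of such an order (an approximation of the Northwind sample model, not the exact classes). The properties that hold other document ids, such as Company, Employee and each line’s Product, are exactly what the graph support treats as edges:

    using System.Collections.Generic;

    // Approximate shape of a Northwind order document (a sketch, property names may differ slightly).
    public class OrderLine
    {
        public string Product { get; set; }       // e.g. "products/11-A" - a reference, so it acts as an edge
        public string ProductName { get; set; }
        public decimal PricePerUnit { get; set; }
        public int Quantity { get; set; }
        public decimal Discount { get; set; }
    }

    public class Order
    {
        public string Id { get; set; }             // e.g. "orders/830-A"
        public string Company { get; set; }        // e.g. "companies/65-A" - acts as an edge
        public string Employee { get; set; }       // e.g. "employees/1-A"  - acts as an edge
        public List<OrderLine> Lines { get; set; } = new List<OrderLine>();
    }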

For example, using:

[Image: a graph query that follows those edges]
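Since the query itself is only shown as an image, here is a sketch of the kind of query meant here, written as a C# string (an approximation; the exact syntax in the original image may differ):

    // A sketch of a graph query that hops from an order to the company it references
    // via the Company property (approximate syntax).
    const string query = @"
        match (Orders as o)-[Company]->(Companies as c)
        select id(o) as OrderId, c.Name as Company";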

This is easy to do, migrates well (zero cost to do so, yeah!) and usually matches nicely with what you already have. It does lead to an interesting observation. Typically, in a graph DB, you’ll model everything as a graph. But with RavenDB, you don’t need to do that.

In particular, let’s take a look at Neo4J’s rendition of Northwind:

[Image: the Neo4J rendition of the Northwind model, with every entity as a node]

As you can see, everything is modeled as a node / edge; that is the only way you can model it there. With RavenDB, you would typically use a domain driven model. In this case, it means that a value object, like an OrderLine, will not have its own concrete existence, either as a node or as an edge. Instead, it will be embedded inside its root aggregate (the order).

Note that this is actually quite interesting, because it means that we need to be able to provide the ability to query on complex edges, such as the order lines. Here is how it works:

[Image: a graph query that filters on the order lines edge]

This will give us all the discounted products sold in London, as well as their discount rates.
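Here is a rough sketch of what such a query might look like (a reconstruction, not the exact query from the image; the syntax details may differ):

    // Orders shipped to London, following the Lines edge (filtered to discounted lines)
    // to the referenced products - approximate syntax.
    const string query = @"
        match (Orders as o where ShipTo.City = 'London')
              -[Lines as line where Discount > 0 select Product]->
              (Products as p)
        select p.Name as Product, line.Discount as Discount";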

Note that here, unlike in previous queries, we use a named alias for the edge. In this case, it gives us the ability to access its properties and project the line’s Discount property to the user. This means that you can have a domain model with strong cohesion and locality, following domain driven design principles, while still being able to run arbitrary graph queries on it. Combining this with the ability to pull data from indexes (including map/reduce ones), you have a lot of things that you can do that used to be very hard but are now easy.

Graphs in RavenDB: Pre-processing the queries

time to read 4 min | 744 words

A query has two audiences: the users and the query engine. Ideally, you need to come up with a query language that serves both. One of the early decisions that we made with the query language is that we want it to be:

  • Very flexible for the user, giving them several ways to express themselves.
  • Very rigid in the query engine, with only one way to do something.

These two requirements directly contradict one another, which is indeed somewhat of a problem. The key here is that we don’t want to produce multiple ways to do the same thing in the query engine. That is a great way to introduce:

  • Different actual execution plans.
  • Features that only work with a specific syntax.
  • More complexity overall.

Anyone who has ever worked with the internals of LINQ can attest to the complexity that is involved here.

Let’s take the simple query that we have been inspecting so far:

[Image: the simple friends-of-friends query as the user writes it]

Now, let’s ask RavenDB to spit it back out for us, shall we? Here is how RavenDB thinks about this query:

[Image: the same query as RavenDB rewrites it, with explicit with and with edges clauses]

In other words, the way RavenDB sees the query and the way the user sees the query are very different. You can see that we have the with edges clauses here, defining the edges on the query.

In other words, all of the query definitions are happening in the with and with edges clauses. When we need to actually perform the matches, the match clause only defines the graph pattern that we need to match on.  It is the responsibility of the query parser to arrange the query from the multiple ways that the user may want to define it to the single representation that is actually going to be executed by the query runner.
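To make the shape of that rewrite concrete, here is an approximation of the two forms (a sketch only; the actual output shown in the images may differ):

    // What the user writes - everything inline in the match clause (approximate syntax):
    const string userQuery = @"
        match (Users as start where id() = 'users/1-A')
              -[FriendsWith]->(Users as f1)
              -[FriendsWith]->(Users as f2)
        select f2";

    // Roughly how RavenDB normalizes it - the definitions are pulled up into
    // with / with edges clauses, and match keeps only the pattern:
    const string normalized = @"
        with { from Users where id() = 'users/1-A' } as start
        with { from Users } as f1
        with { from Users } as f2
        with edges (FriendsWith) as e1
        with edges (FriendsWith) as e2
        match (start)-[e1]->(f1)-[e2]->(f2)
        select f2";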

This may seem like a lot of ceremony, but that is only because we have a very simple query. Let’s change the “friends of friends who aren’t my friends” to something a bit more interesting: “Close friends of my close friends who aren’t my friends”. We are also going to want to limit the friends that we follow only to Users (so, for example, we’ll not follow a FriendOf link to a Pet).

Here is what the query looks like, when we use more concise syntax, and how RavenDB translates it:

[Image: the “close friends of close friends” query and RavenDB’s translation of it]

You’ll note that even for the query above, I still used a separate with clause to make things easier; the following query is exactly the same:

[Image: the same query with the filtering done inline in the match clause]

The basic idea is that for trivial filtering, you’ll probably want to do that inline, inside the match clause. But anything more complex should go to the with clause, where you can more easily express your logic. Also note that aliases matter. The f1 and f2 here are not duplicated by accident; part of processing the query is to bind a value to each of the aliases, and you cannot bind a single result to multiple aliases.
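Here is a sketch of the two styles, using made up names (FriendsWith for the link and Closeness for the “how close” attribute; the real syntax and names may differ):

    // Filtering pulled into with / with edges clauses (approximate syntax):
    const string withClauses = @"
        with { from Users where id() = 'users/1-A' } as start
        with edges (FriendsWith) { where Closeness = 'Close' } as c1
        with edges (FriendsWith) { where Closeness = 'Close' } as c2
        match (start)-[c1]->(Users as f1)-[c2]->(Users as f2)
              and not (start)-[FriendsWith]->(f2)
        select f2";

    // The same intent with trivial filtering done inline in the match clause:
    const string inline = @"
        match (Users as start where id() = 'users/1-A')
              -[FriendsWith where Closeness = 'Close']->(Users as f1)
              -[FriendsWith where Closeness = 'Close']->(Users as f2)
              and not (start)-[FriendsWith]->(f2)
        select f2";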

Another key aspect of this model is that while this is pretty easy to follow, a with clause can contain any query. That means that you can use indexes as well, including map/reduce indexes. Here is one such example:

[Image: a graph query whose with clause uses a map/reduce index]

In this case, I’m not sure how good a graph query this is, I’ll admit, but it does a good job of demonstrating what you can do. We are taking a few queries, mixing them together and then mashing the results to find London companies who didn’t order as much as they used to.
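For reference, here is a sketch of the general shape of such a query (the index and field names here are hypothetical, not the ones from the original image):

    // Mixing an index query into a graph query - a with clause can be any query,
    // including one against a map/reduce index (approximate syntax, hypothetical names).
    const string query = @"
        with { from index 'Orders/Totals/ByCompany' where Total < $threshold } as lowOrders
        with { from Companies where Address.City = 'London' } as london
        match (lowOrders)-[Company]->(london)
        select london.Name as Company, lowOrders.Total as Total";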

This means that the source information for graph queries can be things like spatial queries, full text search, map/reduce, etc. A lot of the complexity in graph queries is just getting to the starting point of the graph pattern matching. With RavenDB, you have a very strong query language and facilities to help you get past that and directly into the graph operations.

This is enough about pre-processing the query. In my next post, I’m going to go in depth into how graph queries work with document models.

Graphs in RavenDB: The query language

time to read 4 min | 781 words

Pretty much all our early discussions about graphs in RavenDB focused on how to build the actual graph implementation. How to allow fast traversal, etc. When we started looking at the actual implementation, we realized that we seriously neglected a very important piece of the puzzle, the query interface for the graphs.

This is important for several reasons. First, ergonomics matter: if we end up with a query language that is awkward, it won’t see much use and will complicate the users’ lives (and our own). Second, the query language effectively dictates how the user thinks about the model, so making low level decisions that would have an impact on how the user is actually using this feature is probably not a good idea yet. We need to start from the top, what we give to the user, and then see how we can make that a reality.

The most common use case of graph queries is the friends of friends query. Let’s see how this query is handled in various existing implementations, shall we?

Neo4J, using Cypher:

[Image: the friends-of-friends query in Cypher]

OrientDB doesn’t seem to have an easy way to do this. The following shows how you can find the 2nd degree friends, but it doesn’t exclude friends of friends who are already your friends. StackOverflow questions on that show a scary amount of code, so I’m going to skip them.

[Image: a 2nd degree friends query in OrientDB]

Gremlin, which is used in a wide variety of databases:

[Image: the friends-of-friends query in Gremlin]

We looked at other options, but it seems that graph query languages fall into the following broad categories:

  • ASCII art to express the relationship between the nodes.
  • SQL extensions that express the relationships as nested queries.
  • Method calls to express the traversal.

Of the three options, we found the first one, using ASCII art / Cypher, to be the easiest to work with. This is true both in terms of writing the query and actually executing it.

Let’s look at what a friends of friends query looks like in RavenDB:

[Image: the friends-of-friends query in RavenDB]
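Since the query only appears as an image, here is a hedged sketch of roughly what it looks like when issued from the C# client (the collection and the FriendsWith property are placeholders, and the exact syntax may differ):

    using System.Linq;
    using Raven.Client.Documents;

    // Assumes an initialized DocumentStore named 'store'.
    using (var session = store.OpenSession())
    {
        var friendsOfFriends = session.Advanced
            .RawQuery<dynamic>(@"
                with { from Users where id() = 'users/1-A' } as start
                match (start)-[FriendsWith]->(Users as f1)-[FriendsWith]->(Users as f2)
                      and not (start)-[FriendsWith]->(f2)
                select f2")
            .ToList();
    }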

Graph queries are composed of two portions:

  • With clauses, which determine the starting points for the graph traversal.
  • A match clause (singular) that contains the graph pattern that we need to match on.

In the case above, we are starting the graph traversal from start, which is defined in a with clause. A query can have multiple with clauses, each defining an alias that can be used in the match clause. The match clause, on the other hand, uses these aliases to decide how to process the query.

You can see that we have two clauses in the above query, and the actual processing is done by pattern matching (to me, it makes sense to compare it to regular expressions or Prolog). It would probably be easier to show this with an example. Here is the relationship graph among a few people:

[Image: the relationship graph among Arava, Oscar, Sunny and Phoebe]

We’ll set the starting point of the graph as Arava and see how this will be processed in the query.

For the first clause, we’ll have:

  • start (Arava) –> f1 (Oscar) –> f2 (Phoebe)
  • start (Arava) –> f1 (Oscar) –> f2 (Sunny)
  • start (Arava) –> f1 (Sunny) –> f2 (Phoebe)
  • start (Arava) –> f1 (Sunny) –> f2 (Oscar)

For the second clause, on the other hand, we have:

  • start (Arava) –> f2 (Oscar)
  • start (Arava) –> f2 (Sunny)

These clauses are joined using the and not operator. What this means is that we need to exclude from the first clause anything that matches on the second clause. Match, in this case, means the same value for any alias that exists in both.

Here is what we end up with:

  • start (Arava) –> f1 (Oscar) –> f2 (Phoebe)
  • start (Arava) –> f1 (Oscar) –> f2 (Sunny) – removed, matches the second clause
  • start (Arava) –> f1 (Sunny) –> f2 (Phoebe)
  • start (Arava) –> f1 (Sunny) –> f2 (Oscar) – removed, matches the second clause

We removed the two marked entries, because they matched the entries from the second clause. The end result is just the friends of my friends who aren’t already my friends.

The idea behind the query language is that we want it to be high level and allow you to express what you want, while we are in charge of actually making it work properly.

In the next post, I’ll talk a bit more about the query language, what scenarios it enables and how we are going to go about processing queries.

Graphs in RavenDB: The overall design

time to read 5 min | 863 words

Note: This series of posts is about a planned feature, exploring how we go about building it. This is meant to solicit feedback and get more eyes on the idea; things aren’t set in stone and we don’t have a firm release date for this.

We have been wanting to add graph queries to RavenDB for several years now, but we always had more important things get in the way. That didn’t prevent us from discussing this internally and sketching out a few options. We are now looking at this more seriously and I thought that sharing the details of our deliberations would be interesting and likely to garner us some valuable feedback. I’m going to assume that the reader is at least somewhat familiar with the notion of graph data and graph queries.

Probably the most well known graph database is Neo4J, which provides the notion of nodes and edges, both of which have a type and a set of (flat) properties. This allows you to define a model of any arbitrary complexity. This works if your model is purely graph based, but it doesn’t work for RavenDB, whose users are used to the document model. On the surface, this looks like a minor detail. RavenDB has documents, which can have any shape, including containing embedded values and collections inside them. Neo4J, on the other hand, models things differently. The simplest example that I can think of is Orders and Order Lines, where you’ll have the following models:

[Images: the Neo4J model – orders, order lines and products as separate nodes – side by side with the RavenDB model – a single order document with the lines embedded in it]

Both models have the same information, but each element in the Neo4J graph is an independent node that is linked to the others. With RavenDB, on the other hand, we have a single document that embeds a lot of the information directly. Note that what we haven’t shown in the image is that in RavenDB you have other documents as well. The products, for example, are separate documents.

Graph databases are often used to handle the basis of recommendation engines, fraud detection, etc. But they are usually used to augment the capabilities of the system, rather than as the primary data store of applications. RavenDB, on the other hand, is most frequently deployed as the primary data store. We want to give our users the ability to perform graph operations, but we don’t want to lose anything that makes RavenDB useful and easy to use.

We initially thought about having the following definition:

  • Each document is (implicitly) a node in the graph.
  • You can call Link(src, dest, type, attributes) to create an edge between any two documents (see the sketch after this list).
  • Provide the usual graph queries on top of that.
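To make the rejected option concrete, here is a sketch of roughly what such an API could have looked like (entirely hypothetical, it was never built):

    // Hypothetical API shape for the explicit-edges design we considered and rejected.
    // None of this exists in RavenDB; it is here only to illustrate the discussion.
    public interface IGraphLinks
    {
        // Create an edge of the given type between two documents, with optional attributes.
        void Link(string src, string dest, string type, object attributes = null);
    }

    // Usage would have looked something like:
    // links.Link("orders/2", "products/1", "Contains", new { Quantity = 3 });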

We started exploring this implementation, but it quickly led to mounting complexity. From the point of view of the user, it meant additional work: you would have to maintain your document model and the edges at the same time. This allows you to do some interesting things, but it is also likely to cause complications down the line, and very likely to cause issues when the document model and graph model disagree with one another. Other issues relate to how you handle graphs in a distributed manner. How do you deal with the creation of an edge between two documents on one node when one of them was deleted on another?

We pushed in that direction for a while, because that was the obvious thing to do, but it really turned out to be a bad idea which didn’t play well with the rest of RavenDB. The worst part was the fact that you might modify the document properties but not define the edge, which leads to inconsistency. This was very easy to do.

The next thing we played with was to remove the Link() call and allow the user to define a background operation that would go and create the links between documents automatically whenever they were updated. This would allow us to avoid having any inconsistencies between the data in the documents and the links between them. After thinking about this for a while, we went ahead with this approach, but removed the requirement for a background operation.

RavenDB will be able to use your existing document model as the graph model as well. In other words, in the model above, you have the orders/2 document, which has two links, one for each of the products. This gives us both a well defined document model, with its well known domain driven architecture, and the ability to hop along all the pre-existing links that we have in the model.

I’ll discuss the querying model and how it all plays together in a future post. For now, I want to show you how this looks when we want to do a typical graph operation, friends of friends:

[Image: the friends-of-friends query in RavenDB]

More details will come in the next post…

Product Release Postmortem: Things You Should Never Do, Part II

time to read 18 min | 3409 words

This post is the text version of a presentation I gave a few weeks ago. It is in reference to this classic post by Joel.

In 2015, I decided that we needed to reboot RavenDB. I did that with the full understanding that this is going to be a huge task, including knowing that it will be bigger than what I can project, even if I take this line of thinking into account.

RavenDB 1.0 was written a decade ago. It was written because it didn’t leave me alone and I wanted to get it out of my head. At the time, I was focused more on getting it out the door (and my head) and was taking shortcuts in the implementation. That allowed me to cut down dramatically on the amount of work that is involved in it. At the same time, this put some constraints on the implementation and architecture. The most obvious one was the reliance on Esent, which tied us to Windows. C# as the implementation language, to a lesser extent, also had the same issue until .NET Core. (Yes, I’m aware of Mono, I have no idea how people managed to run anything beyond hello world on it. We tried porting RavenDB to Mono multiple times, and I still bear the scars.)

I went back and looked at our release notes; in literally every major release, we have spent a significant amount of time and effort on “performance optimizations”. In January of 2015 we had a few sprints that were dedicated to just this issue. We went down to assembly code in some cases, analyzed our hotspots and optimized things in a very serious manner. We got some amazing performance improvements, reducing the runtime by orders of magnitude in some cases. But it still felt like we were hitting a limit. What is more, experience from customers in production showed us that there were a number of cases where we ran into problematic behavior. This mostly happened on large / complex projects. And nearly all those issues were related in one way or another to memory and the GC.

Our indexing, for example, would be reading data from disk into memory. That was meant to save disk I/O during indexing, and included pretty smart prefetching and monitoring behavior. It also had the side effect of loading documents (which can be large) into managed memory and holding on to them long enough to push them into Gen1 and Gen2. Then they would be indexed and need to go away. But given that they were pushed to a higher generation… that meant a more expensive collection cycle.

RavenDB was created before the pervasive use of fast disks, and it turns out that in some cases, reading the data from disk was actually faster than parsing it using JSON.Net. In other words, our “I/O bound” process of reading documents was actually dominated by the time it took to parse the JSON text. That does not include the costs of actually cleaning up this memory. Complex JSON documents can have a lot of objects, and the cost of GC rises with the number of objects that are being tracked. These were pretty fundamental problems, which I didn’t think we could fix in a piecemeal fashion.

That time also coincided with a peak in the number of support incidents that we got. Unlike many other open source projects, we treat support as a cost center, not a revenue center. In other words, we don’t want to have more support; that isn’t how we want to make money. Being a database, we were frequently at the heart of things and our customers and users are very sensitive to any issue that might arise. I’m painting somewhat of a bleak picture, I’m aware. It wasn’t nearly that bad from the point of view of any particular customer. But on aggregate, from our point of view, it felt like a nasty game of whack-a-mole. As soon as we provided a solution to one customer’s issue, another would pop up, somewhat related but just different enough to not be fixed by the previous change. These weren’t regressions, mind. These were just a lot of places where the changing times violated some of our core assumptions.

Toward the end of 2015, I sat down and really thought about what we needed and were missing. This was the situation as I saw it.

[Image: a summary of the situation as I saw it]

There was also the issue that we have learned a lot over the years. We built Voron (our storage engine) from the ground up, we had a lot of experience running in production and we knew what kind of tasks our customers were using us for. I kept thinking that I wished I had a time machine and could do things over properly. Given that my time machine is still in the shop, I decided that we had two options:

  1. Minor fixes along the way – slowly improving our behavior as we stride toward the desired architecture and usage.
  2. Break it all – essentially start from scratch, with a new architecture and write it the way we want it to be written.

The obvious choice was to do this slowly. The problem was that I really couldn’t think of a good way to actually achieve that. The kind of changes we wanted to make started from replacing the most fundamental structure we had, how we represent JSON in our document database and got more complex from there. We wanted to change how we store data on disk, how we index data, how we … literally every single feature that we had was going to be transformed in some way.

We also had additional issues. The Windows only limitation was really hurting us and we really wanted to get a good Linux story going. The support burden was also at the very top of my mind as we considered what to do. In the end, we came up with the following decisions:

  • We don’t require backward compatibility. Either on the server side or client side.
    • That was the hardest decision, but it meant that we could actually tackle some of the biggest issues freely and without constraint.
    • That meant that we wanted to keep the same feeling, but be able to make changes to corners of the API that atrophied.
  • Support cost and simplified operations as a primary concern.
    • This meant that, at the design level, we took into account debugging considerations.
  • Order of magnitude performance improvement across the board.
    • Otherwise, it isn’t worth the effort.
  • Cross platform from the get go.

That was in Sep 2015. I sat down and wrote a design document that outlined the new architectural approach, spiked a few things and then we were off to the races. I blogged all about the process extensively, so I’m not going to repeat that.

We decided to use DNX (which became .NET Core) at a very early stage. Initially, I don’t believe that we even had a debugger, and most of our builds had to be triggered from the command line. I guess that if you are going to make a risky decision, you might as well make a few others…

I’ll say that I prepared a lot for failure up front. Part of the reason we went with DNX was that we knew that in the worst case scenario, we could spend a few days and get it working on the full .NET framework if we had to. I took this step with a lot of backward glances to make sure that we wouldn’t get lost.

Alongside our experience in supporting RavenDB, we also ran a UX study and combed all the incident reports we generated from support calls. The idea was to take as much time as necessary to get things as right as we could. The studio change between 3.5 and 4.0 is massive, and was driven by getting a talented professional to design each part of the UI, guided by real world UX study and analysis. We kept asking “where does it hurt?” and whenever we found a cause of pain we worked to alleviate it.

Some of our guiding principles during that phase of the project were:

  • Cross platform from the get go.
    • We couldn’t afford to port it midway through. Too complex and prone to failure.
  • OWN the stack.
    • We don’t want to use any components that we don’t have good visibility into and the ability to work  with.
    • In particular, anything that is a core competence should be owned and built by us. For our scenario, that means primarily the storage engine.
  • Build for performance.
    • I wasn’t kidding about requiring a x10 performance improvement. We had one or two devs at all times running benchmarks and fixing the performance of every completed feature.
  • Build for operations.
    • Each and every design decision should be considered in light of its operational behavior.
    • In particular, we excised any feature that relied on hard to figure out technology or integration (I’m looking at you, Windows Auth).
    • This included changing the design of the software so a core dump would make it easier to figure out what is going on. We also explicitly opened up a lot of the internal behavior as debug endpoints and plugged them into the studio so operators would have greater visibility. As an aside, that was very helpful in figuring out our performance bottlenecks and we worked to improve that part of the project as we strove for ever faster performance.
  • Reducing the support burden as a major goal.
    • A lot of the previous points tie into this. But this is also where we combed over any issue that had a “user misconfigured / misused” root cause and built alerts directly into RavenDB to give the user early warnings about common issues.
  • We defined a set of common scenarios, reading / writing documents, for example, and then we spent months on designing the whole system so that these would be fast, seamless and easy.

A good example of that is how we store documents in RavenDB now. We have our own binary format that allows us to avoid parsing the document when reading from disk, plays nicely with memory mapped files (which is how Voron, our storage engine, works) and can effectively allow us to hand out a pointer to a memory mapped buffer and start working with it as a JSON document without:

  • Allocating any managed memory
  • Parsing JSON
  • Requiring caching / pre-fetching, etc.

We spent a lot of time thinking about what we want to do, and then we looked into how the operating system expects us to behave. The idea is that if we play to the operating system’s expectations, we can reap a lot of benefits from the OS’ own behavior. This is how RavenDB handles loading data to memory. We let the operating system handle it and just make sure that our own behavior is both predictable and amenable to the OS’ optimizations.

I mentioned that GC was the bane of our existence, right? We moved a lot of the memory management in RavenDB to unmanaged code and handle it explicitly. That gives us the advantage that we know a lot more about how we expect to use the memory and can spend the time to make this highly optimized.

On the debugging side of things, we made some changes to the design of RavenDB with the intent to make it easier to debug and analyze core dumps. For example, most of the long running threads are named, so it is easy to figure out who they belong to (and not just what they are currently doing). For that matter, long running tasks are using synchronous mode, specifically because it means that we can drop into the debugger / core dump and look at their state. This is much harder to do with async methods. You might have noticed that I mentioned core dumps a few times, right? These are essential to figuring out what is going on with your software on production systems. We learned a lot about production debugging over the years and with RavenDB 4.0 we took steps to make things easier. For example, many data structures in RavenDB have an extra field called tag that is there specifically to provide debugging information about the value if we are looking at it in the debugger.
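As a trivial illustration of the thread naming point (an example for this post, not RavenDB’s actual code), giving long running threads a descriptive name makes them immediately identifiable in a debugger or a core dump:

    using System.Threading;

    // Name long running threads so they are identifiable in a debugger / core dump.
    var indexingThread = new Thread(() =>
    {
        // ... long running indexing work would go here ...
    })
    {
        Name = "Indexing of Orders/ByCompany",   // hypothetical name, for illustration only
        IsBackground = true
    };
    indexingThread.Start();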

An obvious question for this project was whether we should stay on .NET or move to an unmanaged language. I considered this seriously, with Rust, C/C++ and Go being the top contenders as the implementation language. I decided to stay with .NET for several reasons. Productivity was right there at the top. We already had a team that was well versed in .NET, and while that isn’t a blocker, it was a consideration. The tooling around .NET is leagues ahead of anything else that I have seen. That includes both write time (where Rider / ReSharper rules) and debug time (I found nothing remotely close to Visual Studio for debugging non trivial code easily). The cross platform angle, which was the most serious issue for us, was resolved with .NET Core.

Rust wasn’t mature enough at the time (2015) and even today I think that a language that prides itself on being hard to learn isn’t a good choice. C++ was a strong contender, but the slow compilation times were an issue. The tooling is similar, but inferior in many respects. Cross platform C++ is possible, and modern C++ is very different from what I remember. However, it comes with a very high degree of complexity and would take a lot of time to master again properly. C (distinct from C++) is a much simpler language. It still has the compilation speed issues, but the language is much simpler. I think that if it had a defer mechanism built in it would be a much nicer language. Go was ruled out because if I’m going to be writing everything from scratch, I might as well go all the way down to C’s level and not stop with something that still has GC pauses.

The choice of C# on CoreCLR has been vindicated. The project team and the community at large put a large emphasis on performance and we keep getting more and better ways to handle low level details while still being able to use higher level concepts when needed. And the tooling… dear God. I routinely work with other platforms, testing things out, but there is nothing that comes close to the toolsets that are available for C#.

An interesting wrinkle with the 4.0 release was that we started it before the 3.5 release was even out. For a while, we had a small team working on the foundations of 4.0 while the rest of us were busy hammering in the last details of RavenDB 3.5.

As soon as we had the bare minimum to go (basically, it compiled and could save a single document and even regurgitate it back up again) we started heavy parallelization of the work. We had a team working on indexes while another was dealing with (even at this early stage) performance and another was working on the user interface. In that time frame, we hired a few more people and could really see the benefits of all of the separate teams working in concert. One of the priorities of this method was to get to a demoable state. In fact, at some point we had over 30% of the people working on either the UI directly or UI related infrastructure.

One of the things we kept hearing back is that the UI and the insight it provided into what is going on inside the database were crucial for our users. It also helped us a lot as we developed RavenDB, since we got to play with it directly and see things in an easy manner. The UI has been at once one of the most trivial of changes and the most profound. On the one hand, we didn’t really make any significant architectural changes in the UI. On the other hand, we re-wrote most of it with the aid of a UX study and a real professional at the helm. That gave us a lot of visible polish that underscores the amount of work that happened in the engine.

How did all of that turn out?

  • I initially thought it would take a year to 15 months, with an expected due date of Dec 2016. That was with a team size of about 25 people. Work started in Sep 2015.
  • As it turns out, RavenDB accumulated a lot of features in the years it spent in production. We had to evaluate each of them, see how it would fit into our architecture and get it ported. That took a lot of time, especially because in many cases we took the time to change the approach we had for the feature completely.
  • By mid 2016 I had already changed the schedule to Jun 2017.
  • Close to the end of 2016 we released RavenDB 3.5. This freed up some people to work on the 4.0 release, but also meant we had higher than usual support calls while customers integrated the new release.
  • The actual release of 4.0 happened in Feb 2018, so just about 30 months from the start, or about double the time I expected.
  • We had to cut some features out to make the 4.0 release, all of them are back in the 4.1 release, scheduled for next month.
    • This means that to get back to the same place took us 3 years. But we now have a lot of extra features.
    • Most of the missing features were pretty minor, though, and rarely used.

What did all of that gain us?

  • Performance: Single node. Over 100,000 writes / sec and over 1,000,000 reads / sec in our benchmarks.
    • Real world users report that things are x20 to x52 times faster.
  • Support call duration dropped from days / weeks to about 2 – 4 hours.
  • Cross platform on Windows, Linux, ARM and MacOSX.
    • We are now deployed to production on Raspberry Pis, because we are the fastest real database on that kind of hardware.

We were over a year overdue, and even with the deadline being extended several times we had to cut some features to actually make the cut for release. The general acceptance of the new release by the community has been a roaring success. We exceeded our own goals for the project, even if we took a lot longer than expected to get there.

Now, for some additional thoughts. We didn’t really re-write the whole thing from scratch. Instead, we had a lot of code that we could at least partially reuse. The storage engine was ported, not re-written, for example. However, we changed architectures in a pretty significant way. For example, the format and manner of working with JSON changed entirely between these two releases. We are a JSON document database. As you can imagine, we pretty much had to modify everything as a result of that.

We didn’t design the whole thing from the start. We had a rough outline and we let things roll from there. As a result of the new architecture and expectations, by the time we hit a particular feature we were able to utilize what we had already learned about how to work with the new architecture to improve things. We also weren’t afraid of changing things multiple times. Authentication had several major design changes midway through, and it ended up so much simpler than what we had before. Even pretty late in the game, we still made significant changes. The RQL support, having a SQL like querying language, came about in the last 20% of the project.

That was a huge change, and I got a lot of “here comes the crazy train again” feedback. This is probably one of the reasons we were delayed by another few months. But it was worth it by far. Basically, because we were able to give up on backward compatibility, we were able to move quickly and change stuff as we wished. We knew that we wouldn’t have another chance like that for another decade, so we tried to get the big changes done.

In retrospect, I think it worked quite well. I’m really proud of how RavenDB 4.0 turned out.

Error handling via GOTO in C

time to read 3 min | 447 words

Following up on my previous post, I was asked about the use of GOTO in C for error handling.

I decided to go and look at a bunch of C code bases, and it took me no time at all to find some interesting examples of this approach.

Below, you can see a small portion from a random method in the Sqlite codebase. Sqlite is considered to be one of the best C code bases out there, in terms of code quality, number and thoroughness of tests and the quality of the project as a whole. As such I think it is a good candidate for a sample.

Here you can see that there is a need to free some resources and, at the same time, handle errors. This is routinely handled using cleanup code at the end of the function that error clauses will jump to if needed.

[Image: cleanup-on-error code from the Sqlite codebase, using goto to jump to a cleanup label]

Note that this is very different from “GOTO considered harmful”. That article talked about jumping to any arbitrary location in the program. C (and all other modern languages) limits you to jumping around only inside your own function. Even though functions are quite long in the Sqlite codebase, it is very easy to follow what is going on there.

Here is another example, which is a lot more complex. I have taken it from the Linux kernel code:

[Image: error handling code from the Linux kernel (CIFS mount), with multiple cleanup labels]

Again, I had to redact a lot of code that actually does stuff to allow us to look at the error handling behavior. Now you can see something that is a lot more complex.

There are actually multiple labels here, and they jump around between one another in the case of an error. For example, if we fail to allocate the cifs_sb we jump to out_nls, which then jumps to out. Failure to get the root will jump us to out_super and then fall into out. For out_free, it falls to out_nls and then jumps to out.

In both the Sqlite example and the Linux example, we have two separate and related responsibilities. We need to do resource cleanup, but that is complicated by the different stages that this may go through in the function. Given that, GOTO cleanup is the best option available, I think.

There are various defer style options for C; libdefer and the defer macro look better. But both have some runtime costs. I like the defer macro better, but that won’t work on MSVC, for example, as it requires a language extension.

Inside RavenDB 4.0: Chapter 10 is done

time to read 1 min | 116 words

The Inside RavenDB 4.0 book just got an update; you can download the new release from the link. It currently has 10 chapters and covers a lot of material. New in this release is in depth coverage of indexing.

Note that some of the earlier details might have changed a bit (usually configuration name changes); I’m going to go over and fix them as part of the release process.

There are also the upcoming RavenDB Workshops in San Francisco and New York next month, if you want to learn about RavenDB directly from me.

We’ll also have a booth at the Developer Week conference in early February in San Francisco.

If you have a finalizer, watch your ctor

time to read 2 min | 289 words

We recently got into a very bad state with our build server. We had a few dozen failing tests in an area of the code that hadn’t changed in months. That was annoying, to say the least. I tried to run this on my machine, to see what was going on, and we got an even worse state, a process crash.

The good thing was that this is managed code, so you get stack traces, and we were able to figure out that the issue is with the SqlClient assembly. It looks like a recent change meant that loading it by reflection will give you the netstandard 2.0 release, which is basically just a filler that always throws PlatformNotSupported. That is perfectly fine; we’ll change the way we do things to load the direct reference.

What killed us was this:

[Image: the offending code]

This will kill the process. How is that?

Well, DbConnection inherits from Component, and Component has a finalizer. Care to guess what will happen when that is called?

[Image: the Component finalizer, which calls Dispose]

So here is what will happen: the ctor throws, so no one can have an instance of this class. But the object was already registered for finalization, so the GC will still call the finalizer. The finalizer will call the Dispose method, that will end up throwing, and an exception during finalization is fatal to the entire process. Bye!

The general idea is that when you are working with a throwing ctor, you either make sure to call GC.SuppressFinalize or you clean up in such a way that you can safely finalize things.
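Here is a minimal sketch of the failure mode (an illustration, not the actual SqlClient code):

    using System;

    class Resource : IDisposable
    {
        public Resource()
        {
            // Calling GC.SuppressFinalize(this) here, before throwing, would avoid
            // the fatal finalizer run described below.
            throw new PlatformNotSupportedException();
        }

        // The finalizer was registered when the object was allocated, before the
        // ctor ran, so the GC will still call it even though the ctor threw.
        ~Resource() => Dispose();

        public void Dispose()
        {
            // Disposing a half-constructed instance blows up. An exception that
            // escapes the finalizer thread is fatal to the entire process.
            throw new InvalidOperationException("not initialized");
        }
    }

    class Program
    {
        static void Main()
        {
            try { _ = new Resource(); } catch (PlatformNotSupportedException) { }

            GC.Collect();
            GC.WaitForPendingFinalizers(); // the finalizer throws here and the process dies
        }
    }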

NHibernate Profiler 5.0 Alpha has been released

time to read 1 min | 181 words

The next generation of our profiler is starting to shift into higher gear, with today’s release of the NHibernate Profiler 5.0 Alpha.

The new bits are available directly from NuGet.

The new release brings:

  • Support for NHibernate 5.0.0 and NHibernate 5.0.1
  • The profiler appender now requires .NET Standard 2.0 (older appenders, supporting older runtimes, will still work with the new release).
  • Fixes for various bugs, aches and pains.
  • Faster serialization format over the wire.
  • Reduce the cost of profiling your application significantly by reducing contention.
  • Better stack trace and profiling support for .NET standard 2.0 apps.

This is just the beginning, getting the major stuff that needs to be there for the next release before we get down and start implementing the cool features. But these changes (and especially the support for NHibernate 5.x) are major enough that we would like to have users test them out in the field.

Licensing wise, for the alpha period any 4.x license for the profiler will work on the 5.0 alpha.

You can download it here.

Cost centers, revenue centers and politics in a cross organization world

time to read 2 min | 394 words

Sometimes we get requests from customers to evaluate and help specify the kind of hardware their RavenDB servers are going to run on. One of the more recent ones was to evaluate a couple of options and select the optimal one.

We got the specs of the two variants and had a look. Then I went and took a look at the actual costs. These are physical machines, and the cost of each of the options we were given, even maxed out, was around 2 – 3K $.

One of the machines that they wanted was using a 7,200 RPM hard disk and was somewhat cheaper than the other. To the point where we got some pushback from the customer about selecting the other option (with SSD). It took me a while to figure out what was going on.

This organization is using RavenDB (and the new machines in question) to run their core business. One of the primary factors in their business is the speed at which they can serve requests (for that business, SEO is a super critical metric). This business is also primarily focused on getting eyes on the site, which means that their organizational structure looks like this:

[Image: the organizational structure, with marketing & sales far larger than the tech department]

And the behavior of the organization follows the money. The marketing & sales department is much larger and can steer the entire organization, while the tech department (which the entire organization depends on) is decidedly second string, for both decision making and budgeting.

I have run into this several times before, but it took me a long while to figure out that in many cases, the result of such arrangements is that the tech department relies on other people (in this case, us) to tell the organization at large what needs to be done. “It isn’t us (the poor relation) that needs this expensive (add $300 or so) add on, but the database guys say that it really matters (and you’ll take the blame if you don’t approve it)”.

I don’t have anything to add, I’m afraid, I just wanted to share this observation. I’m hoping that understanding the motivation can help alleviate some of the hair pulling involved in those kinds of interactions (yes, water is wet, and spending a very small amount of money is actually worth it).
