Ayende @ Rahien

Oren Eini aka Ayende Rahien CEO of Hibernating Rhinos LTD, which develops RavenDB, a NoSQL Open Source Document Database.

You can reach me by:

oren@ravendb.net

+972 52-548-6969

time to read 3 min | 474 words

RavenDB, as of 4.0, requires that the document identifier be a string. In fact, that has always been the requirement, but in previous versions, we allowed you to pretend that this isn’t the case. That led to… some complexities, because people would have a numeric id in their model, while inside RavenDB it was always represented as a string.

I just got the following question:

In my entities, can I have the Id property of any type instead of string, to avoid primitive obsession? I would use a generic Id&lt;TEntity&gt; type for ids. This type can be converted into a string before saving in the DB by calling ToString(), and transformed from a string into Id&lt;TEntity&gt; (when fetching from the DB) by invoking a static method like public static Id&lt;TEntity&gt; FromString(string id).
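To make the question concrete, a minimal sketch of the kind of strongly typed id being described might look like this (hypothetical code, nothing RavenDB specific; TEntity is just a phantom type parameter that prevents mixing ids of different entities):

```csharp
// Hypothetical strongly typed id, as described in the question above.
public readonly struct Id<TEntity>
{
    private readonly string _value;

    private Id(string value) => _value = value;

    public static Id<TEntity> FromString(string id) => new Id<TEntity>(id);

    public override string ToString() => _value;
}
```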

The short answer for this is that no, there is no way to do this. A document id in your model has to be a string.

The longer answer is that you can absolutely do this, but you have to understand the divergence of your entity model vs. the document model. The key is that RavenDB doesn’t actually require that your model have an Id property. It is usually defined, because it makes things easier, but it isn’t required. RavenDB is perfectly happy managing the document key internally. Combine that with the ability to modify how documents are converted to entities, and you have a solution. Let’s look at the code…

And here is what it looks like:

[image: code sample]

The idea is that we customize a few things inside of RavenDB (a rough sketch of the wiring follows the list below).

  • We tell the serializer that it should ignore the UserId property
  • We tell RavenDB that after creating an entity from the server, we should set up the Id property as we want it.
  • We do the same just before we store the entity in the server, just to be sure that we got the complete package.
  • We disable the usual identity generation logic for the documents we care about and tell RavenDB that it should ignore trying to set the identity property on the document on its own.
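Here is a rough sketch of that wiring, using the Id&lt;TEntity&gt; type from above. The User model and its UserId property are hypothetical, and the exact convention and event names are from memory of the 4.x C# client, so they may differ between client versions:

```csharp
using Newtonsoft.Json;
using Raven.Client.Documents;

public class User
{
    // Ignored by the document serializer; the real key is the document id.
    [JsonIgnore]
    public Id<User> UserId { get; set; }

    public string Name { get; set; }
}

public static class StoreSetup
{
    public static IDocumentStore Create()
    {
        var store = new DocumentStore
        {
            Urls = new[] { "http://localhost:8080" },
            Database = "Demo"
        };

        // Don't treat UserId (or anything else) as RavenDB's identity property.
        store.Conventions.FindIdentityProperty = member => false;

        // After a document is materialized as an entity, set the strongly typed id.
        store.OnAfterConversionToEntity += (sender, args) =>
        {
            if (args.Entity is User user)
                user.UserId = Id<User>.FromString(args.Id);
        };

        // Just before storing, make sure the entity carries the document id.
        store.OnBeforeStore += (sender, args) =>
        {
            if (args.Entity is User user)
                user.UserId = Id<User>.FromString(args.DocumentId);
        };

        return store.Initialize();
    }
}
```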

The end result is that we have an entity with a strongly typed identifier in our model. It took a bit of work, but not overly so.

That said, I would suggest that you either have a string identifier property in your model or not have one at all (either option takes no code in RavenDB). Having an identifier and jumping through hoops like that tends to make for an awkward experience. For example, RavenDB has no idea about this property, so if you need to support queries as well, you’ll need to extend the query support. It’s possible, but it shows that there is additional complexity that could be avoided.

time to read 2 min | 256 words

RavenDB is a highly concurrent, distributed database. That means that we take the idea of race conditions, multiply it by network hiccups and then raise it to the power of hair pulling. Now, we have architectural structure in place to help with a lot of that, but sometimes you need to write and verify what happens when a particular sequence of events plays out in a five node cluster. For fun, you may need to orchestrate a particular order of operations across multiple disparate processes (sometimes on different machines). As you can imagine, that is… challenging.

I wanted to give you a hint of some of the techniques that we use to handle this. We have code that looks like this, sprinkled throughout our code base (Rachis is the name of our Raft cluster implementation):

This is where a leader connects to a follower to set up their relationship:

[image: code snippet]

This is called during leader election:

[image: code snippet]

These methods are implemented in the following manner:

[image: code snippet]

In other words, they will set a ManualResetEvent that we set up as part of our testing infrastructure. The code isn’t even run in a production release, but it allows us to very carefully structure the exact sequence of events that we need to expose specific behaviors in the system.
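The names below are hypothetical (this is not the actual Rachis code), but it sketches the pattern: production code calls into a hook object that is a no-op by default, and a test can install events on it to learn that a point was reached, or to pause the node until the test decides it may proceed:

```csharp
using System.Threading;

public class TestingHooks
{
    // Assigned only by the test harness; in production these stay null,
    // so the hook method below is effectively a no-op.
    public ManualResetEventSlim ReachedCandidateState;
    public ManualResetEventSlim WaitBeforeVoting;

    public void OnCandidateState()
    {
        ReachedCandidateState?.Set();  // tell the test we got here
        WaitBeforeVoting?.Wait();      // and optionally pause until it says go
    }
}

public class ClusterNode
{
    public TestingHooks ForTestingPurposes { get; } = new TestingHooks();

    public void SwitchToCandidateState()
    {
        ForTestingPurposes.OnCandidateState();
        // ... the real state transition continues here ...
    }
}
```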

time to read 2 min | 346 words

I ran into this post, in which the author describes how they got ERROR 1000294 from the IBM DataPower Gateway as part of an integration effort. The underlying issue was that the JSON was sent to the endpoint with its fields in an order that wasn’t expected.

After asking the team at the other end to fix it, the author got back an effort estimate of 9 people for 6 months (4.5 man years!). The author then went and figured out that the fix for the error was somewhere deep inside DataPower:

Validate order of JSON? [X]

The author then proceeded to question the competency / moral integrity of the people behind the estimate.

I believe that the author was grossly unfair, at best, to the people doing the estimation. Mostly because he assumed that unchecking the box and running a single request is a sufficient level of testing for this kind of change. But also because it appears that the author never once considered why this setting may be in place.

  • The sort order of JSON has been responsible for Remote Code Execution vulnerabilities.
  • The code processing the JSON may not do that in a streaming fashion, and therefore expects the data in a particular order.
  • Worse, the code may just assume the order of the fields and access them by index (see the sketch after this list). Change the order of the fields, and you may reverse the Creditor and Debtor fields.
  • The code may translate the JSON to another format and send it over to another system (likely, given the legacy system mentioned).
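To make the index-ordering bullet concrete, here is a hypothetical example of the kind of code that breaks the moment the field order changes. It consumes string values positionally with Json.NET instead of binding them by name:

```csharp
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;

public static class TransferParser
{
    // Assumes the payload always lists the creditor account before the debtor
    // account. Swap the fields in the JSON and the money flows the other way.
    public static (string Creditor, string Debtor) ReadTransfer(string json)
    {
        var values = new List<string>();
        using (var reader = new JsonTextReader(new StringReader(json)))
        {
            while (reader.Read())
            {
                if (reader.TokenType == JsonToken.String)
                    values.Add((string)reader.Value);
            }
        }
        return (values[0], values[1]);
    }
}
```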

The setting is there to protect the system, and unchecking that value means that you have to check every single one of the integration points (which may be several layers deep) to ensure that there isn’t explicit or implied ordering to the JSON.

In short, given the scope and size of the change:  “Fundamentally alter how we accept data from the outside world”, I can absolutely see why they gave this number.

And yes, for 99% of the cases, there isn’t likely to be any difference, but you need to validate for that nasty 1% scenario.

time to read 4 min | 673 words

mimalloc is a memory allocator that is small and efficient, at least so the docs say. Which was interesting enough for me to take a look. We have had to do a lot of work in memory allocation inside RavenDB, and looking into how other people are doing that is always interesting. What was really attractive for me here was the fact that this is a small codebase, so I can go over that fairly quickly, and the amount of complexity involved is going to be limited.

As usual, I’m just going over the code, recording my impressions.

I started by looking at the API in mimalloc.h, and while it seems threatening, pretty much all of the details here are about making it clear to the compiler what this code is doing, to enable additional optimizations.

[image: function declaration from mimalloc.h]

If you ignore all of these, this is just a pretty normal function declaration with the usual malloc signature.

The code is very well commented, but it looks like it is going to take a serious inspection to actually figure out what is going on. I started by going over the header files, and they show some tantalizing details, but I’m missing context that I assume that I’ll get when I’ll go over the actual code.

The code talks about segments and blocks. I think that a segment is a chunk of memory that we allocated from the OS, and a block is a chunk of memory that we allocate to users of mimalloc. I like that there is an explicit model of multi-threading here. There is a heap per thread, it seems, but you can free memory from another thread as well. This matches very closely what we have done with RavenDB’s internal memory management.

I started to read the OS specific parts of the code, and I hit gold (as in, stuff that is really interesting to read) almost immediately:

[image: code excerpt]

There is a long discussion there that covers a lot of really fascinating details, including a certain level of “are you really going to go there?! OMG, you went there!”. For example, in order to patch malloc(), mimalloc needs to patch atexit() to ensure that the process can shut down normally.

I started reading the init.c file, and it isn’t about memory management at all, it is all about integrating mimalloc into the system, and that is quite fascinating on its own.

Here is how the memory is patched on ARM (similar code exists for X86 and X64):

[image: code excerpt]

What you see here is the building of raw assembly instructions to do a task. I do wonder how this works, given that I would expect executable memory to normally be non-writable, but I’ll look at that in a bit.

Then we have the actual list of functions to patch:

[image: code excerpt]

The last lines are scary, I have to admit.

And I found how the code deals with the memory protection:

[image: code excerpt]

I’m currently doing what are effectively random reads throughout the codebase, mostly because it is close to midnight and I’m not really going to be able to grok anything. I ran into this function:

[image: code excerpt]

Nothing “special”, but it did lead me to some interesting articles about hashing, and what they are good for, how to build them, etc.

One of the reasons that I love doing these code reviews is that I learn so much more than what I expected to.

For example, the way mimalloc initializes its random number generator is really quite interesting (see: _mi_random_init()) and elegant.

time to read 4 min | 609 words

About five years ago, my wife got me a present, a FitBit. I hadn’t worn a watch for a while, and I didn’t really see the need, but it was nice to see how many steps I took, and we had a competition about who had the most steps each day. It was fun. I have had a few FitBits since then and I’m mostly wearing one. As it turns out, FitBit allows you to get an export of all of your data, so a few months ago I decided to see what kind of information I have stored there, and what kind of data I can get from it.

The export process is painless and I got a zip with a lot of JSON files in it. I was able to process that and get a CSV file that had my heartrate over time. Here is what this looked like:

[image: sample of the exported heart rate data]

The file size is just over 300MB and it contains 9.42 million records, spanning the last 5 years.

The reason I looked into getting the FitBit data is that I’m playing with timeseries right now, and I wanted a realistic data set. One that contains dirty data. For example, even in the image above, you can see that the measurements aren’t taken on a consistent basis. It seems like ten and five second intervals, but the range varies. I’m working on a timeseries feature for RavenDB, so that was a perfect testing ground for me. I threw that into RavenDB and got the data to just under 40MB in size.

I’m using Gorilla encoding as a first pass and then LZ4 to further compress the data. In a data set where the duration between measurements is stable, I can stick over 10,000 measurements in a single 2KB segment. In the case of my heart rate data, I can store an average of 672 entries in each 2KB segment. Once I have the data in there, I can start actually looking for interesting patterns.
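To give a feel for why regular intervals compress so well, here is a simplified sketch of the delta-of-delta idea from the Gorilla paper, applied to timestamps only. This is not RavenDB’s actual implementation (the real format also XOR-compresses values and packs everything into bits); it just shows that for a steady sampling rate, almost every encoded value becomes zero:

```csharp
using System.Collections.Generic;

public static class DeltaOfDelta
{
    public static List<long> Encode(IReadOnlyList<long> timestamps)
    {
        var encoded = new List<long>();
        if (timestamps.Count == 0)
            return encoded;

        encoded.Add(timestamps[0]);             // first timestamp stored verbatim
        long previousDelta = 0;
        for (int i = 1; i < timestamps.Count; i++)
        {
            long delta = timestamps[i] - timestamps[i - 1];
            encoded.Add(delta - previousDelta); // zero for a steady sampling rate
            previousDelta = delta;
        }
        return encoded;
    }

    public static List<long> Decode(IReadOnlyList<long> encoded)
    {
        var timestamps = new List<long>();
        if (encoded.Count == 0)
            return timestamps;

        timestamps.Add(encoded[0]);
        long delta = 0;
        for (int i = 1; i < encoded.Count; i++)
        {
            delta += encoded[i];
            timestamps.Add(timestamps[i - 1] + delta);
        }
        return timestamps;
    }
}
```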

For example, consider the following query:

[image: query]

Basically, I want to know how I’m doing in a global sense, just to have a place to start figuring things out. The output of this query is:

[image: query results]

These are interesting numbers. I don’t know what I did to hit 177 BPM in 2016, but I’m not sure that I like it.

What I do like is this number:

[image: query result]

I then ran this query, going for a daily precision on all of 2016:

[image: query]

And I got the following results in under 120 ms.

[image: query results]

These are early days for this feature, but I was able to take that and generate the following (based on the query above).

[image: chart generated from the query results]

All of the results have been generated on my laptop, and we haven’t done any performance work yet. In fact, I’m posting about this feature because I was so excited to see that I got queries to work properly now. This feature is still in its early stages.

But it is already quite cool.

time to read 7 min | 1227 words

Production ready code is a term that I don’t really like. I much prefer the term: Production Ready System. This is because production readiness isn’t really a property of a particular piece of code, but of the entire system.

The term is often thrown around, and usually it refers to adding error handling and robustness to a piece of code. For example, let’s look at an example from the official docs:
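The snippet in question is roughly the canonical HttpClient sample (reconstructed here from memory, so treat it as an approximation rather than the exact code):

```csharp
using System.Net.Http;
using System.Threading.Tasks;

public class Product
{
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public static class ProductClient
{
    static readonly HttpClient client = new HttpClient();

    public static async Task<Product> GetProductAsync(string path)
    {
        Product product = null;
        HttpResponseMessage response = await client.GetAsync(path);
        if (response.IsSuccessStatusCode)
        {
            // ReadAsAsync<T> comes from the Microsoft.AspNet.WebApi.Client package.
            product = await response.Content.ReadAsAsync<Product>();
        }
        return product;
    }
}
```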

This kind of code is obviously not production ready, right? Asked to review it, most people would point out the lack of error handling if the request fails. I asked on twitter about this and got some good answers, see here.

In practice, to make this piece of code production worthy you’ll need a lot more code and infrastructure:

  • .NET specific - ConfigureAwait(false) to ensure this works properly with a SynchronizationContext
  • .NET specific – Http Client caches Proxy settings and DNS resolution, requiring you to replace it if there is a failure / on a timer.
  • .NET specific – Exceptions won’t be thrown from Http Client if the server sent an error back (including things like auth failures).
  • Input validation – especially if this is exposed to potentially malicious user input.
  • A retry mechanism (with a back off strategy) is required to handle transient conditions, but it needs either idempotent requests or a way to avoid duplicate actions (a sketch of just the retry part follows this list).
  • Monitoring for errors, health checks, latencies, etc.
  • Metrics for performance, how long such operations take, how many ops / sec, how many failures, etc.
  • Metrics for the size of responses (which may surprise you).
  • Correlation id for end to end tracing.
  • Proper handling of errors – including reading the actual response from the server and surfacing it to the caller / logs.
  • Handling successful requests that don’t contain the data they are supposed to.
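As an illustration of the retry bullet alone, here is a hedged sketch of retrying with exponential backoff (in practice you would reach for a library such as Polly, and you would also want jitter, circuit breaking, cancellation and idempotency guarantees):

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class Retry
{
    public static async Task<HttpResponseMessage> GetWithBackoffAsync(
        HttpClient client, string url, int maxAttempts = 3)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                var response = await client.GetAsync(url).ConfigureAwait(false);

                // Only 5xx responses are treated as transient; everything else
                // (including auth failures) is returned to the caller to handle.
                if ((int)response.StatusCode < 500 || attempt == maxAttempts)
                    return response;

                response.Dispose();
            }
            catch (HttpRequestException) when (attempt < maxAttempts)
            {
                // Network level failure, fall through to the backoff below.
            }

            await Task.Delay(TimeSpan.FromMilliseconds(100 * Math.Pow(2, attempt)))
                .ConfigureAwait(false);
        }
    }
}
```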

And these are just the things that pop into my head from looking at 10 lines of really simple code.

And after you have done all of that, you are still not really production ready. Mostly because if you implemented all of that in the GetProductAsync() function, you can’t really figure out what is actually going on.

This kind of operation is something that you want to implement once, via the infrastructure. There are quite a few libraries that do robust service handling, and using one of them will help, but it will only take you part of the way toward a production ready system.

Let’s take cars and driving as an example of a system. If you look at a car, you’ll find that quite a bit of the car’s design, constraints and feature set is driven directly by the need to handle failure modes.

A modern car will have (just the stuff that is obvious and pops to mind):

  • Drivers – a required explicit learning stage and competency test, limits on driving while impaired, higher certification levels for more complex vehicles.
  • Accident prevention: ABS, driver assist and seat belt beeps.
  • Reduce injuries / death when accidents do happen – seat belts, air bags, crumple zones.
  • On the road – rumble strips, road fence, road maintenance, traffic laws, active and passive enforcement.

I’m pretty sure that anyone who actually understands cars will be shocked by how sparse my list is. It is clear, however, that accidents, their prevention, and reducing their lethality and cost are part and parcel of all design decisions on cars. In fact, there is a multi layered approach to increasing the safety of drivers and passengers. I’m not sure how comparable the safety of a car is to the production readiness of a piece of software, though. One of the ways that cars compete with one another is on safety features, so there is a strong incentive to improve there. That isn’t usually the case with software.

It usually takes a few (costly) lessons about how much being unavailable costs you before you can really feel how much not being production ready costs you. And at this point, most people turn to error handling and recovery strategies. I think this is a mistake. A great read on the topic is How Complex Systems Fail; it is a great, short paper, highly readable and very relevant to the field of software development.

I consider a system to be production ready when it has, not error handling inside a particular component, but actual dedicated components related to failure handling (note the difference from error handling), management of failures and their mitigation.

The end goal is that you’ll be able to continue execution and maintain a semblance of normalcy to the outside world. That means having dedicated parts of the system that are just about handling (potentially very rare) failure modes, as well as a significant impact on your design and architecture. That is not an inexpensive proposition. It takes quite a lot of time and effort to get there, and it is usually only worth it if you actually need the reliability it provides.

With cars, the issue is literally human lives, so we are willing to spend quite a lot on preventing accidents and reducing their impact. However, the level of robustness I expect from a toaster is quite different (don’t catch fire, pretty much), and most of that is already handled by the electrical system in the house.

Erlang is a good example of a language and environment that has always prioritized production availability. Erlang systems famously have 99.9999999% availability (that is nine nines). That is about 32 milliseconds of downtime per year, which pretty much means less than the average GC pause in most systems. Erlang has a lot of infrastructure to support this kind of availability number, but that still requires you to understand the whole system.

For example, if your Erlang service depends on a database, a restart of a database server (which takes 2 minutes to cycle) might very well mean that your service processes will die, be restarted by their supervisors, only to die again and again. At this point, the supervisor itself gives up and dies, passing the buck up the chain. The usual response is to restart the supervisor again a few times, but the database is still down and we are in a cascading failure scenario. Just restarting is really effective in handling errors, but for certain failure scenarios, you need to consider how you’ll actually make it work. A database being unavailable can make your entire system cycle through its restart options and die just as the database comes back online. For that matter, what happens to all the requests that you tried to process at that time?

I have had a few conversations that went something like: “Oh, we use Erlang, that is handled”, but production readiness isn’t something that you can solve at the infrastructure level. It has a global impact on your architecture, design and the business itself. There are a lot of questions that you can’t answer from a technical point of view. “If I can’t validate the inventory status, should I accept an order or not?” is probably the most famous one, and that is something that the business itself needs to answer.

Although, to be honest, the most important answer that you need from the business is a much more basic one: “Do we need to worry about production readiness, and if so, by how much?”

time to read 2 min | 301 words

Reddit’s front page contains a list of recent posts from all communities. In most cases, you want to show posts from communities that the user is subscribed to, but at the same time, you want to avoid flooding the front page with posts from any single community. You also need this to be really fast.

It turns out that doing this in RavenDB is actually very easy. We are going to create a map/reduce index that will aggregate the few most recent posts per community, like so:

[image: index definition]

What this index will do is provide us with the five most recent posts in each community, as well as their date. This is an interesting example of a map/reduce index, because we are using both aggregation and fanout in the index.
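One possible shape for such an index looks roughly like this (a sketch only; the Post model and all the names here are hypothetical, and this is not necessarily the exact index from the screenshot). The map emits one entry per post, and the reduce keeps only the five most recent posts for each community:

```csharp
using System;
using System.Linq;
using Raven.Client.Documents.Indexes;

public class Post
{
    public string Id { get; set; }
    public string Community { get; set; }
    public DateTime PostedAt { get; set; }
}

public class RecentPostsPerCommunity : AbstractIndexCreationTask<Post, RecentPostsPerCommunity.Result>
{
    public class Result
    {
        public string Community { get; set; }
        public PostReference[] Posts { get; set; }
    }

    public class PostReference
    {
        public string PostId { get; set; }
        public DateTime PostedAt { get; set; }
    }

    public RecentPostsPerCommunity()
    {
        Map = posts => from post in posts
                       select new Result
                       {
                           Community = post.Community,
                           Posts = new[]
                           {
                               new PostReference { PostId = post.Id, PostedAt = post.PostedAt }
                           }
                       };

        Reduce = results => from result in results
                            group result by result.Community into g
                            select new Result
                            {
                                Community = g.Key,
                                Posts = g.SelectMany(x => x.Posts)
                                         .OrderByDescending(x => x.PostedAt)
                                         .Take(5)
                                         .ToArray()
                            };
    }
}
```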

The nice thing about this index is that we can project the results directly from it to the user. Let’s see what the queries look like:

[image: query]

This is a simple query that does quite a lot. It gives us the most recent 15 posts across all the communities that the user cares about, with no single community able to contribute more than 5 posts. It sorts them by posted date and fetches the actual posts in the same query. This is going to give you consistent performance regardless of how much data you have and how many updates you experience. The actual Reddit front page is a lot more complex, I’m sure, but this serves as a nice example of how you can do non trivial stuff in RavenDB’s indexes that simplifies your life by a lot.

time to read 4 min | 760 words

I talked about some of the requirements for proper workflow design in my previous post. As a reminder, the top ones are:

  • Cater for developers, not the business analysts. (More on this later).
  • Source control isn’t optional, meaning:
    • Multiple branches
    • Can diff & review changes
    • Merging
    • Multiple people can work at the same time
  • Encapsulate complexity

This may seem like a pretty poor list, because if you are a developer, you might be taking all of these for granted. Because of that, I wanted to show a small taste of what used to be Microsoft’s primary workflow engine.

[image: workflow designer screenshot]

A small hint… this kind of system is not going to be useful for anything relating to source control, change management, collaborative work, understanding what is going on, etc.

A better solution for this would be to use a tool that can work with source control, that developers are familiar with and can handle the required complexity.

That tool is called… code.

It checks all the boxes required, naturally. But it does have a distinct disadvantage. One of the primary reasons you want to use a workflow engine of some kind is to decouple the implementation of your business from the policies of the business. Coming back to the mortgage example, how you calculate a late fee payment is fixed (in the contract itself, but usually also by law and many regulations); figuring out whether late fees should be waived, on the other hand, is subject to the whims of the business.

That is a pretty simple example, but in most businesses, these kinds of workflows add up. You can easily end up with dozens to hundreds of different workflows without the business being too big or complex.

There is another issue, though. Code is pretty good when you need to handle straightforward tasks. A set of if statements (which is pretty much all most workflows are) is trivial to handle. But workflows have another property: they tend to be long. Not long on a computer scale (seconds), but long on a people scale (months and years).

The typical process of getting a loan may involve an initial submission, review by a doctor, asking for follow up documentation (rinse – repeat a few times), getting doctor appraisal and only then being able to generate a quote for the customer. Then we have a period of time in which the customer can accept, a qualifying period, etc. That can last for a good long while.

Trying to code long running processes like that requires a very unnatural approach to coding. Especially since you are likely to need to handle software updates while the workflows are running.

In short, we are in a strange position: we want to use code, because it is clear, supports software development practices that are essential, and can scale up in complexity as needed. On the other hand, we don’t want to use our usual codebase for that, because we’ll have very different deployment strategies, the manner of working is very different, and there is a lot more involvement of the business in what is going on there.

The way to handle that is to create a proper boundary between parts of the system. We’ll have the workflow behavior, defined in scripts, that describes the policy of the system. These tend to be fairly high level concepts and are designed explicitly for the role of describing business policy behaviors. The infrastructure for that, on the other hand, is just a standard application using normal software practices, that is driven by the workflow scripts.

And by a script, I meant literally a script. As in, JavaScript.

I want to give you a sneak peek into how I envision this kind of system, but I’ll defer a full discussion of what is involved to my next post.



The idea is that we use the script to define our policy, and then we use that to make decisions and invoke the next stage in the process. You might notice that we have the state variable, which is persisted between invocations. That allows us to use a programming model that is fairly common and obvious to developers. We can usually also show this, as is, to a business analyst and get them to understand what is going on easily enough. All the actual actions are abstracted. For example, life insurance setup is a completely different workflow that we invoke.
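To sketch the hosting side of that boundary (all names here are hypothetical, and Jint is used only as an example of an embeddable JavaScript engine; the post does not commit to a specific one): the infrastructure loads the policy script, hands it the persisted state and the incoming message, and saves whatever the script left in state for the next invocation:

```csharp
using Jint;

public class WorkflowHost
{
    // scriptSource is expected to define: function onMessage(state, message) { ... }
    public string RunStep(string scriptSource, string persistedStateJson, string messageJson)
    {
        var engine = new Engine();
        engine.Execute(scriptSource);

        // Rehydrate the state and the incoming message as plain JS objects.
        // (String concatenation is fine for a sketch; real code would marshal properly.)
        engine.Execute("var state = " + (persistedStateJson ?? "{}") + ";");
        engine.Execute("var message = " + messageJson + ";");

        // Let the policy decide what happens next; it mutates `state` as it goes.
        engine.Execute("onMessage(state, message);");

        // Hand the mutated state back to the caller, to be persisted (for example,
        // as part of a workflow document) until the next event arrives.
        engine.Execute("var __result = JSON.stringify(state);");
        return engine.GetValue("__result").AsString();
    }
}
```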

In my next post, I’m going to drill down a bit into the details of this approach and what kind of features we need there.

time to read 5 min | 815 words

One of the most common themes I run into when talking to customers, users and sundry people in tech is the repeated desire to fire developers.

Actually, that is probably too loaded a statement. It actually comes in two parts:

  • Developers usually want to focus on the interesting bits, and the business logic portions aren’t that much fun.
  • The business analysts usually want to get things done and having to get developers to do that is considered inefficient.

If only there was a tool, or a pattern, or a framework, or something that would allow the business analysts to directly specify the behavior of the system… Why, we could cut the developers from the process entirely! And speaking as a developer, that would be a huge relief.

I think the original name for that was CASE tools, and that flopped. In fact, literally every single one of the attempts to replace developers with a tool has flopped. They got such a bad rap that people keep trying to implement them under different names. Some stuff can be done fairly easily, though. WYSIWYG for GUIs is well established, and WordPress and Wix, to name the two examples that come to mind immediately, show that you can have a non-techie build a proper website. In fact, you can even plug in some pretty sophisticated functionality without burdening the user with too much.

But all of that only takes you to a point. And past that point, the drop off is harsh. Let’s take another common tool that is used to reduce the dependency on developers: SharePoint.

You pay close to double for actual developer time on SharePoint, mostly because it is so painful to work with it.

In a recent conference, I got into a conversation about business workflows and how to best implement them. You can look at the image on the right to get a good idea about what kind of process they were talking about.

To make things real, I want to take a “simple” example of accepting a life insurance policy. Here is what the (extremely simplified) workflow looks like for issuing a life insurance policy:

[image: simplified life insurance workflow]

This looks good, and it certainly should make sense to a business analyst. However, even after I reduced the process pretty much to its bare bones, and even those have been filed down, this is still pretty complex. The process of actually getting a policy is a lot more complex. Some questions don’t require doctor evaluation (for example, smoking) and some require supplemental documentation (oh, you were hospitalized? Gimme all those records). The doctor may recommend different rates, rejecting the application entirely, some exceptions in the policy, etc. All of which need to be in the workflow. Actuarial tables need to be consulted for each of those cases, etc, etc, etc.

But something like the diagram above isn’t going to be able to handle this level of complexity. You are going to get lost very quickly if you try to put so many boxes on the screen.

So you need encapsulation.

And you’ll probably want to have a way to develop these business workflows, which means that they aren’t static.

So you need source control.

And if you have a complex business process, you likely have different people working on it.

So you need to be able to review changes, and merge them.

Note that this is explicitly distinct from being able to store the data in source control. Being able to actually diff two versions of such a process in a meaningful fashion is anything but trivial. Usually you are left with diffing the raw XML / JSON that stores the structure. Good luck with that.

If the workflow is complex, you need to be able to understand what is going on under various conditions.

So you need a debugger.

In fact, pretty soon you’ll realize that you’ll need quite a lot of the things that developers do. Except that your tool of choice doesn’t do that, or if it does, it does so poorly.

Another issue is that even if you somehow managed to bypass all of those details, you are going to be facing the same drop off that you see elsewhere with tools that attempt to get rid of developers. At some point, the complexity grows too large, and you’ll call your development team and hand the system off to them. At which point they will be stuck with a very clunky tool that attempts to be quite clever and easy to use. It is also horribly limiting for a developer. Mostly because all of the “complexity” involved is in the business process itself, not in the actual complexity of what is going on.

There are better ways of handling that, and the easiest among them is to just use code. That can be… surprisingly versatile.

time to read 3 min | 468 words

I was reminiscing about some old code that I wrote a long while ago, in the heyday of ASP, when the dotcom bubble was just starting to inflate. At the time, I was either still in high school or had just graduated, and I was fascinated by the ability to write web applications. I wrote quite a few of them, as I recall. Thankfully, none of them ever made it to this day and age. I remember one project in particular that I was quite proud of. I wrote a bunch of BBS / forum systems. One version used an Access file as the database. IIRC, that is literally how I learned SQL for the first time.

The other BBS system is what I’m here to talk about today. You couldn’t always get Access, and having it installed on the server was a PITA. Especially given that I was pretty much limited to hosts that offered free hosting only. So I decided to write a BBS system that had no dependencies whatsoever and could be deployed on any host that could handle ASP. Note that this is ASP classic; .NET was still 2 years away from alpha status at this time, and Java was for applets.

I decided that I would write everything through file I/O, but that was quite complex. I needed something that would help me out. Then I realized that I could use ASP itself to help me. Instead of having to pull data at runtime from a file, parse it, process it and so on, I could lean on ASP itself for that.

Trigger warning: This code is newly written, but I still remember the shape of it quite well. This may cause you seizures. For the full beauty of this piece of code, you need to consider that this is a very small piece of a much larger codebase (all in a single file, of course), but it is very much a representative example.

I’ll give you a moment to study the code. It deserves that much of your attention.

What you see here is a beautiful example of using code as data and data as code, self modifying code, and some really impressive (even if I say so myself) capabilities of my past self to dig himself deep into a hole.

Bonus points if you can point out the myriad issues that this code has. You can safely leave aside maintainability, I never had to maintain it, but over twenty years have passed, and I still remember the complexity involved in keeping all the states in my head.

And that was the first time that I actually wrote my own dedicated database.
