My .NET Core Conf talk has been posted and you can now watch it!
You can listen to me and Jeffery Palermo talk about how we actually build RavenDB.
The first part of the discussion is here.
I’ll be doing a webinar today dealing with data modeling in RavenDB. In particular, I’m going to focus on both general modeling advice and how RavenDB was explicitly designed to make it simple to do the common tasks that are usually so hard to deal with.
I’m going to be talking about the shape of your documents, the shape of your entities, how to design your system for best results and the kind of hidden features behind the scenes that you might want to take advantage of.
I have been talking about memory and RavenDB a lot, and I thought that I would share the following image from one of our test runs:
This is RavenDB running in a container with 16MB of available memory. This is when we are under (moderate) load:
Note that the actual working set used by RavenDB is 2.28MB, and while the total allocations are higher than that, it is still quite reasonable in size.
In 1995, I got a new computer with a 133MHz CPU and 16MB of RAM. It ran a full OS and apps (Win95, Netscape, Office, etc.) and was quite impressive.
It is really interesting that we can run RavenDB in such a constrained environment.
After my podcast about RavenDB’s dev ops story, I was asked an interesting question by Remi:
…do you think it can work with non technical product (let's say banking app) where your user and your engineer are not in the same industry.
This is quite an interesting scenario. A line of business application is going to be composed of two separate planes. You have the technical plane, which is fairly standard, and you can get quite a lot of mileage from standard dev ops monitoring tools. For example, you probably don’t need the same level of diagnostics in a web app or a service backend as you need for a database engine. However, the business plane is just as interesting an area and can often benefit quite a bit from building business level diagnostics into the application.
If we take the example of a banking app, you might want to track things such as payment flow across various accounts. You may want to be able to get a view of a single user’s activities over time or simply have good visibility into various financial instruments.
I have run into several cases where I had to break down how loans work (interest, compounding, collateral, etc.) for college-educated people who were really quite smart, but didn’t pay attention to that part of life. Given that I consider loans to be one of the simplest financial instruments, building visibility into these can be of great help.
Still in the banking field, just the notion of taxation is freakishly complex. I have had a case where a customer in India was supposed to pay us 1,000 USD. They sent 857 USD (a bit of that was eaten by bank fees) and the rest we had to claim as a refund from my tax authorities, because the rest of the money was paid as taxes in India and the two countries are doing reconciliation. Given the inherent complexity that is involved, just being able to visualize, inspect and explain things is of enormous value.
Things like Know Your Customer and Anti Money Laundering are also quite complex and can put the system into a tailspin. I had a customer send us a payment, but the payment was stopped because the same customer also paid (in a completely different transaction and to a different destination entirely) with funds that came from cryptocurrencies. Leaving aside the aggravation of such scenarios, I am actually impressed/scared that they are able to track such things so well.
I can’t really be upset with the bank, even. Laws and regulations are in place that have strict limits on how they can behave, including personal criminal liability and Should Have Known clauses. I can understand why they are cautious.
But at the same time, trying to untangle such a system is a lot like trying to debug a software system. And having the tools in place for the business expert to easily obtain and display the data is an absolute competitive advantage.
I have recently closed a bank account specifically because the level of service provided didn’t meet my expectations. Having better systems in place means that you can give better service, and that is worth quite a lot.
The following is a fix we did to resolve a production crash of RavenDB. Take a look at the code, and consider what kind of error we are trying to handle here.
Hint: The logger class can never throw.
The underlying issue was simple. We ran out of memory, which is an expected occurrence, and is handled by the very function that we are looking at above.
However, under low memory conditions, allocations can fail. In the code above, the allocation of the log string statement failed, which threw an error. This caused an exception to escape a thread boundary and kill the entire process.
Moving the log statement to the inside of the try statement allows us to recover from it, attempt to report the error, and release any currently held memory and attempt to reduce our memory utilization.
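The original snippet is not reproduced here, so the following is only a minimal sketch of the pattern described above, not RavenDB’s actual code. Every name in it (`LowMemoryHandlerSketch`, `Log`, `Cache`, `HandleLowMemory`, `buildMessage`) is hypothetical, and the `buildMessage` delegate merely stands in for the string formatting that may fail to allocate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names throughout; this is an illustration, not RavenDB's code.
public static class LowMemoryHandlerSketch
{
    // Stand-in for a logger that itself never throws.
    public static readonly List<string> Log = new List<string>();

    // Stand-in for cached buffers that can be dropped under memory pressure.
    public static readonly List<byte[]> Cache = new List<byte[]>();

    // Assumed to run on a background thread, so no exception may escape it.
    // buildMessage simulates formatting the log string, which allocates
    // and can therefore fail when memory is low.
    public static bool HandleLowMemory(Func<string> buildMessage)
    {
        try
        {
            // The message is built INSIDE the try block. If that allocation
            // fails, the catch below runs instead of the process dying.
            Log.Add(buildMessage());
            Cache.Clear();
            return true;
        }
        catch (OutOfMemoryException)
        {
            // Avoid allocating here: release what we hold and carry on.
            Cache.Clear();
            return false;
        }
    }
}
```

Passing a delegate that throws `OutOfMemoryException` simulates the failed string allocation: the handler still releases the cache and returns normally, where the same statement placed before the `try` would have let the exception escape the thread.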
This particular error is annoying. A string allocation will always allocate, but even if you run out of actual memory, such allocations will often succeed because they can be served out of the already existing GC heap, without the need to trigger an actual allocation from the OS. This is just a reminder that anything that can go wrong will, and with just the right set of circumstances to cause us pain.
I’ll use this opportunity to recommend, once again, reading How Complex Systems Fail. Even in this small example, you can see that it takes multiple separate things to align just right for an error to actually happen. You have to have low memory and the GC heap has to be full, and only then will you get the actual issue. Low memory without the GC heap being full, and the code works as intended. GC heap being full and no low memory, no problemo.
I spoke with Jeffery Palermo about how we are building RavenDB itself, and it turned out good enough to make a couple of podcast episodes out of it.
You can listen to the first part here. And I would be delighted to hear your comments.
My keynote at the Progressive.NET conference is now live here.
You can now read the Inside RavenDB book directly in your browser.
I’m really happy about this, not just because you can browse the full book online (or download to PDF) completely free. The main point is that now I can link directly to the specific part in the book where I’m discussing (in depth) certain features of RavenDB.
I think that this is going to make answering questions about RavenDB’s internals and behavior a lot easier and more approachable.
It also means, of course, that you can use Google to find information from the book.
I’m also currently working on updating the book for RavenDB 5.0, although I’ll admit that in some cases I’m writing about features that haven’t yet seen the light of day.
Consider the following C code snippet:
This code cannot be written in C#. Why? Because you can’t use ‘+’ on bool, and you can’t cast bools. So I wrote this code, instead:
And then I changed it to be this code:
Can you tell why I did that? And what is the original code trying to do?
For that matter (and I’m honestly asking here), how would you write this code in C# to get the best performance?
Hint: