Ayende @ Rahien

Oren Eini, aka Ayende Rahien, is the CEO of Hibernating Rhinos LTD, which develops RavenDB, a NoSQL open source document database.

time to read 4 min | 735 words

About a month ago I wrote about a particular issue that we wanted to resolve. RavenDB uses X509 certificates for authentication. These are highly secure and are a good answer for our clients who need to host sensitive information or are working in highly regulated environments. However, certificates have a problem: they expire. In particular, if you are following common industry best practices, you’ll replace your certificates every 2 – 3 months. In fact, the common setup of using RavenDB with Let’s Encrypt will do just that. Certificates will be replaced on the fly by RavenDB without the need for administrator involvement.

If you are running inside a single cluster, that isn’t something you need to worry about. RavenDB will coordinate the certificate update between the nodes in such a way that it won’t cause any disruption in service. However, it is pretty common in RavenDB to have multi cluster topologies. Either because you are deployed in a geo-distributed manner or because you are running using complex topologies (edge processing, multiple cooperating clusters, etc). That means that when cluster A replaces its certificate, we need to have a good story for cluster B still allowing it access, even though the certificate has changed.

I outlined our thinking in the previous post, and I got a really good hint: 13xforever suggested that we look at HPKP (HTTP Public Key Pinning) as another way to handle this. HPKP is a security technology that was widely used, ran into issues and was replaced (mostly by certificate transparency). With this hint, I started to investigate this further. Here is what I learned:

  • A certificate is composed of some metadata, the public key and the signature of the issuer (skipping a lot of stuff here, obviously).
  • Keys for certificates can be either RSA or ECDSA. In both cases, there is a 1:1 relationship between the public and private keys (in other words, each public key has exactly one private key).

Given these facts, we can rely on the key pair itself to avoid the issues with certificate expiration, distributing new certificates, etc.

Whenever a cluster needs a new certificate, it will use the same private/public key pair to generate the new certificate. Because the public key is the same (and we verify that the client has the private key during the handshake), even if the certificate itself has changed, we can verify that the other side knows the actual secret, the private key.

In other words, we slightly changed the trust model in RavenDB. Instead of trusting a particular certificate, we trust that certificate’s private key. That is what grants access to RavenDB. In this way, when you update the certificate, as long as you keep the same key pair, we can still authenticate you.
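
To make that concrete, here is a minimal sketch of the check, using Node’s crypto module purely for illustration (RavenDB is a .NET code base, and this is not its actual implementation): load the trusted certificate and the newly presented one, and compare the public keys rather than the certificates themselves.

    const fs = require('fs');
    const { X509Certificate } = require('crypto');

    // Illustrative file names: the certificate we already trust and the renewed one
    // presented during the TLS handshake.
    const trusted = new X509Certificate(fs.readFileSync('trusted-client.pem'));
    const presented = new X509Certificate(fs.readFileSync('renewed-client.pem'));

    // Compare the SPKI (public key) blobs, not the certificate bytes or thumbprints.
    const spki = cert => cert.publicKey.export({ type: 'spki', format: 'der' });

    if (spki(trusted).equals(spki(presented))) {
        // Same key pair - grant the renewed certificate the same access as the old one.
    } else {
        // Different key pair - treat this as an unknown client.
    }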

This feature means that you can drastically reduce the amount of work that an admin has to do, and it leads to a system that you set up once and that just keeps working.

There are some fine details that we still had to deal with, of course. An admin may issue a certificate and want it to expire, so just having the user re-generate a new certificate with the private key isn’t really going to work for us. Instead, RavenDB validates that the chain of signatures on the certificate is the same. To be rather more exact, it verifies that the original (trusted by the admin) certificate and the new certificate that was just presented to us are signed by the same chain of issuers, as identified by their public key hashes.
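
A sketch of that verification, again using Node’s crypto module just to show the shape of the check (not RavenDB’s code; it assumes the two chains are handed in as arrays of PEM certificates, leaf first):

    const { createHash, X509Certificate } = require('crypto');

    // The "pin" of a certificate: a hash of its public key, not of the whole certificate.
    const pinOf = pem => {
        const cert = new X509Certificate(pem);
        return createHash('sha256')
            .update(cert.publicKey.export({ type: 'spki', format: 'der' }))
            .digest('base64');
    };

    // The new certificate is acceptable if every link in its chain pins to the same
    // public key hash as the corresponding link in the chain we originally trusted.
    const signedBySameChain = (trustedChain, presentedChain) =>
        trustedChain.length === presentedChain.length &&
        trustedChain.every((pem, i) => pinOf(pem) === pinOf(presentedChain[i]));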

In this way, if the original issuer gave you a new certificate, it will just work. If you generate a new certificate on your own with the same key pair, we’ll reject that. The model that we have in mind here is trusting a driver’s license. If you have an updated driver’s license from the same source, that is considered just as valid as the original one on file. If the driver license is from Toys R Us, not so much.

Naturally, all such automatic certificate updates are going to be logged to the audit log, and we’ll show the updated certificates in the management studio as well.

As usual, we welcome your feedback; the previous version of this post got us a great feature, after all.

time to read 4 min | 655 words

This post really annoyed me. Feel free to go ahead and go through it, I’ll wait. The gist of the post, titled “WAL usage looks broken in modern Time Series Databases?”, is that time series databases that use a Write Ahead Log are broken, and that their system, which isn’t using a WAL (but uses Log-Structured Merge, LSM), is also broken, but no more than the rest of the pack.

This post annoyed me greatly. I’m building databases for a living, and for over a decade or so, I have been focused primarily on building a distributed, transactional (ACID) database. A key part of that is actually knowing what is going on in the hardware beneath my software and how to best utilize it. This post was annoying because it makes quite a few really bad assumptions, and then builds upon them. I particularly disliked the outright dismissal of direct I/O, mostly because they seem to be doing that on very partial information.

I’m not familiar with Prometheus, but doing fsync() every two hours basically means that it isn’t on the same plane of existence as far as ACID and transactions are concerned. Cassandra is usually deployed in cases where you either don’t care about some data loss or, if you do, you use multiple replicas and rely on that. So I’m not going to touch that one either.

InfluxDB is doing the proper thing and doing fsync after each write. Because fsync is slow, they very reasonably recommend batching writes. I consider this to be something that the database should do, but I do see where they are coming from.

Postgres, on the other hand, I’m quite familiar with, and the description on the post is inaccurate. You can configure Postgres to behave in this manner, but you shouldn’t, if you care about your data. Usually, when using Postgres, you’ll not get a confirmation on your writes until the data has been safely stored on the disk (after some variant of fsync was called).

What really got me annoyed was the repeated insistence on “data loss or corruption”, which shows a remarkable lack of understanding of how a WAL actually works. Because of the very nature of the WAL, the people who build them all have to consider the nature of a partial WAL write, and there are mechanisms in place to handle it (usually by considering that particular transaction as invalid and rolling it back).
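
For anyone who hasn’t looked inside one, here is a rough sketch of the usual technique (illustrative code, not from any particular engine): each WAL record is framed with its length and a checksum, so a torn write is detected on recovery and the unacknowledged transaction is simply discarded.

    const { createHash } = require('crypto');

    const checksum = buf => createHash('sha256').update(buf).digest().readUInt32LE(0);

    // Frame a record as [length][checksum][payload].
    function frameRecord(payload) {
        const header = Buffer.alloc(8);
        header.writeUInt32LE(payload.length, 0);
        header.writeUInt32LE(checksum(payload), 4);
        return Buffer.concat([header, payload]);
    }

    // On recovery, read records until we hit a torn or corrupted one and stop there.
    // Whatever follows was never confirmed to a client, so dropping it loses nothing.
    function tryReadRecord(wal, offset) {
        if (offset + 8 > wal.length) return null;                  // torn header
        const length = wal.readUInt32LE(offset);
        const expected = wal.readUInt32LE(offset + 4);
        const payload = wal.slice(offset + 8, offset + 8 + length);
        if (payload.length < length) return null;                  // torn payload
        if (checksum(payload) !== expected) return null;           // partial / corrupted write
        return { payload, next: offset + 8 + length };
    }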

The solution proposed in the post is to use SSTable (sorted strings table), which is usually a component in LSM systems. Basically, buffer the data in memory (they use 1 second intervals to write it to disk) and then write it in one go. I’ll note that they make no mention of actually writing to disk safely. So no direct I/O or calls to fsync. In other words, a system crash may leave you a lot worse off than merely 1 second of lost data.  In fact, it is possible that you’ll have some data there, and some not. Not necessarily in the order of arrival.

A proper database engine will:

  • Merge multiple concurrent writes into a single disk operation (see the sketch after this list). In this way, we can handle > 100,000 separate writes per second (document writes, so significantly larger than the typical time series drops) on commodity hardware.
  • Ensure that if any write was confirmed, it actually hit durable storage and can never go away.
  • Properly handle partial writes or corrupted files, in such a way that none of the system’s invariants are violated.
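
Here is a sketch of the first point, write merging. It assumes a single hypothetical appendAndFsync(entries) helper that appends a batch to the journal and calls fsync once:

    const pending = [];
    let flushing = false;

    // Callers enqueue their write and get a promise that resolves only after the data
    // is durably on disk.
    function write(entry) {
        return new Promise((resolve, reject) => {
            pending.push({ entry, resolve, reject });
            flushLoop();
        });
    }

    async function flushLoop() {
        if (flushing) return;
        flushing = true;
        try {
            while (pending.length > 0) {
                // Everything that accumulated while the previous batch was being written
                // goes out in a single disk operation, with a single fsync.
                const batch = pending.splice(0, pending.length);
                try {
                    await appendAndFsync(batch.map(p => p.entry)); // hypothetical helper
                    batch.forEach(p => p.resolve());
                } catch (err) {
                    batch.forEach(p => p.reject(err));
                }
            }
        } finally {
            flushing = false;
        }
    }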

I’m leaving aside major issues with LSM and SSTables, such as write amplification and the inability to handle sustained high loads (because there is never a break in which you can do bookkeeping). Just the portions on WAL usage (which show broken and inefficient use) being used to justify another broken implementation are quite enough for me.

time to read 2 min | 246 words

One of the primary reasons why businesses choose to use workflow engines is that they get pretty pictures that explain what is going on and look like they are easy to deal with. The truth is anything but that, but pretty sells.

My recommended solution for workflow has a lot going for it, if you are a developer. But if you try to show a business analyst this code, they are likely to just throw their hands up in the air and give up. Where are the pretty pictures?

One of the main advantages of this kind of approach is that it is very rigid. You are handling things in the event handlers, registering the next step in the workflow, etc. All of which is very regimented. This is so for a reason. First, it makes it very easy to look at the code and understand what is going on. Second, it allows us to process the code in additional ways.

Consider the following AST visitor, which operates over the same code.

This took me about twenty minutes to write, mostly to figure out the Graphviz notation. It takes advantage of the fact that the structure of the code is predictable to generate the actual flow of actions from the code.
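
The original visitor isn’t reproduced here, but a rough sketch of the idea follows. It assumes the workflow script registers its next step with a hypothetical registerNextStep('handlerName') call, uses esprima for parsing, and only handles top-level handler functions:

    const esprima = require('esprima');

    function toDot(scriptSource) {
        const edges = [];
        let currentHandler = null;

        const visit = node => {
            if (node === null || typeof node !== 'object') return;
            if (node.type === 'FunctionDeclaration') currentHandler = node.id.name;
            if (node.type === 'CallExpression' &&
                node.callee.type === 'Identifier' &&
                node.callee.name === 'registerNextStep' &&
                currentHandler !== null) {
                edges.push(`  "${currentHandler}" -> "${node.arguments[0].value}";`);
            }
            for (const key of Object.keys(node)) visit(node[key]);
        };

        visit(esprima.parseScript(scriptSource));
        return `digraph workflow {\n${edges.join('\n')}\n}`;
    }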

You get to use readable code and maintainable practices and show pretty pictures to the business people.

time to read 3 min | 407 words

In my previous post, I talked about the driving forces toward a scripting solution to workflow behavior, and I presented the following code as an example of such a solution. In this post, I want to focus on the non-obvious aspects of such a design.
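
The exact code isn’t shown here, but the shape under discussion is a set of event handlers that receive the persisted state and the event’s input, mutate the state, and register the next step. A hypothetical sketch (the function names and the registerNextStep helper are illustrative, not an actual RavenDB API):

    // Raised when a customer asks for a loan.
    function onLoanRequested(state, input) {
        state.Status = 'PendingApproval';
        state.Amount = input.Amount;
        registerNextStep('onLoanApproved');      // hypothetical helper
    }

    // Raised when the loan officer approves the request.
    function onLoanApproved(state, input) {
        state.Status = 'Approved';
        state.APR = input.APR;
        registerNextStep('onFundsWithdrawn');
    }

    // Raised when the customer actually withdraws money.
    function onFundsWithdrawn(state, input) {
        state.Withdrawn = (state.Withdrawn || 0) + input.Amount;
    }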

The first thing to note about this code is that it is very structured. You are working on an event based system, and as such, the input / output for the system are highly visible. It also means that we have straightforward ways to deal with complexity. We can break some part of the behavior into a different file or even a different workflow that we’ll call into.

The second thing to note is that workflows tend to be long running processes. In the code above, we have a pretty obvious way to handle state. We get passed a state object, which we can freely modify. Changes to the state object are persisted between event invocations. That is actually a pretty important issue. Because if we store that state inside RavenDB, we also get the ability to do a bunch of other really interesting stuff:

  • You can query ongoing workflows and check their state.
  • You can use the revisions feature inside of RavenDB and be able to track down the state changes between invocations.

The input to the events is also an object, and that means that you can also store that natively, which means that you have full tracing capabilities.

The third important thing to note is that the script is just code, and even in complex cases, it is going to be pretty small. That means that you can run version resistant workflows. What do I mean by that?

Once a workflow process has started, you want to keep it on the same workflow script that it started with. This makes versioning decisions much nicer, and it is very easy for you to deal with changes over time. On the other hand, sometimes you need to fix the script itself (there was a bug that allowed negative APR), in which case you can change it for just the ongoing workflows.

Actual storage of the script can be in Git, or as a separate document inside the database. Alternatively, you may actually want to include the script itself in every workflow. That is usually reserved for industries where you have to be able to reproduce exactly what happened and I wouldn’t recommend doing this in general.

time to read 6 min | 1018 words

I got a great comment on my previous post about using Map/Reduce indexes in RavenDB for event sourcing. The question was how to handle time sensitive events or ordered events in this manner. The simple answer is that you can’t; RavenDB intentionally doesn’t expose anything about the ordering of the documents to the index. In fact, given the distributed nature of RavenDB, even the notion of ordering documents by time becomes really hard.

But before we close the question as “cannot do that by design”, let’s see why we want to do something like that. Sometimes, this really is just the developer wanting to do things the way they are used to, and there is no need for actually enforcing the ordering of documents. But in other cases, you want to do this because there is a business meaning behind these events. In those cases, however, you need to handle several things that are a lot more complex than they appear. You may be informed of an event long after it actually happened, and you need to handle that.

Our example for this post is going to be mortgage payments. This is a good example of a system where time matters. If you don’t make your payments on time, that matters. So let’s see how we can model this as an event based system, shall we?

A mortgage goes through several stages, but the only two that are of interest for us right now are:

  • Approval – when the terms of the loan are set (how much money, what is the collateral, the APR, etc).
  • Withdrawal – when money is actually withdrawn, which may happen in installments.

Depending on the terms of the mortgage, we need to compute how much money should be paid on a monthly basis. This depends on a lot of factors. For example, if the principal is tied to some baseline, changes to the baseline will change the amount of the principal; the same goes if only some of the amount was withdrawn, if there are late fees, a balloon payment, etc. Because of that, on a monthly basis, we are going to run a computation for the expected amount due for the next month.

And, obviously, we have the actual payments that are being made.

Here is what the (highly simplified) structure looks like:


This includes all the details about the mortgage, how much was approved, the APR, etc.

The following is what the expected amount to be paid looks like:


And here we have the actual payment:


All pretty much bare bones, but sufficient to explain what is going on here.
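
Documents of roughly this shape would fit the description (the collection and field names are illustrative, not the actual documents from the post):

    // Mortgages/6015-A
    {
        "Customer": "customers/1-A",
        "Approved": 250000,
        "APR": 3.6,
        "Collateral": "the property itself"
    }

    // ExpectedPayments/2019-02-6015
    {
        "Mortgage": "Mortgages/6015-A",
        "Month": "2019-02",
        "Principal": 680,
        "Interest": 750,
        "Total": 1430
    }

    // Payments/2019-02-6015
    {
        "Mortgage": "Mortgages/6015-A",
        "Month": "2019-02",
        "Amount": 1430
    }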

With that in place, let’s see how we can actually make use of it, shall we?

Here are the expected payments:


Here are the mortgage payments:


The first thing we want to do is to aggregate the relevant operations on a monthly basis, since this is how mortgages usually work. I’m going to use a map reduce index to do so, and as usual in this series of posts, we’ll use JavaScript indexes to do the deed.
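
The index itself isn’t reproduced here; a sketch of what such an index can look like follows (two maps and a reduce, written in the map() / groupBy().aggregate() style of RavenDB’s JavaScript indexes, using the illustrative document shapes from above):

    map('ExpectedPayments', p => ({
        Mortgage: p.Mortgage,
        Month: p.Month,
        Expected: p.Total,
        Interest: p.Interest,
        Principal: p.Principal,
        Paid: 0,
        InterestPaid: 0,
        PrincipalPaid: 0
    }));

    map('Payments', p => ({
        Mortgage: p.Mortgage,
        Month: p.Month,
        Expected: 0,
        Interest: 0,
        Principal: 0,
        Paid: p.Amount,
        InterestPaid: 0,
        PrincipalPaid: 0
    }));

    groupBy(x => ({ Mortgage: x.Mortgage, Month: x.Month }))
        .aggregate(g => {
            const r = { Mortgage: g.key.Mortgage, Month: g.key.Month,
                        Expected: 0, Interest: 0, Principal: 0, Paid: 0,
                        InterestPaid: 0, PrincipalPaid: 0 };
            for (const v of g.values) {
                r.Expected += v.Expected;
                r.Interest += v.Interest;
                r.Principal += v.Principal;
                r.Paid += v.Paid;
            }
            // Funds allocation for partial payments: whatever was paid covers the
            // interest first, only the remainder goes toward the principal.
            r.InterestPaid = Math.min(r.Paid, r.Interest);
            r.PrincipalPaid = Math.max(0, r.Paid - r.InterestPaid);
            return r;
        });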

Unlike previous examples, now we have real business logic in the index. Most specifically, funds allocation for partial payments: if the amount of money paid is less than the expected amount, we first apply it to the interest, and only then to the principal.

Here are the results of this index:


You can clearly see the mistakes that were made in the payments. In March, the amount due for the loan increased (another installment was taken from the mortgage), but the payments were made based on the old amount.

We aren’t done yet, though. So far we have the status of the mortgage on a monthly basis, but we want to have a global view of the mortgage. In order to do that, we need to take a few steps. First, we need to define an Output Collection for the index, which will allow us to further process the results of this index.

In order to compute the current status of the mortgage, we aggregate both the mortgage status over time and the amount paid by the bank for the mortgage, so we have the following index:
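
A sketch of the idea (illustrative; it assumes the first index writes its output to an artificial MortgagePaymentsByMonth collection and that withdrawals live in a Withdrawals collection, both of which are guesses):

    map('MortgagePaymentsByMonth', m => ({
        Mortgage: m.Mortgage,
        Due: m.Expected - m.Paid,
        Withdrawn: 0,
        Status: 'Current'
    }));

    map('Withdrawals', w => ({
        Mortgage: w.Mortgage,
        Due: 0,
        Withdrawn: w.Amount,
        Status: 'Current'
    }));

    groupBy(x => x.Mortgage).aggregate(g => {
        const due = g.values.reduce((sum, v) => sum + v.Due, 0);
        return {
            Mortgage: g.key,
            Due: due,
            Withdrawn: g.values.reduce((sum, v) => sum + v.Withdrawn, 0),
            Status: due > 0 ? 'PastDue' : 'Current'
        };
    });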

Which gives us the following output:


As you can see, we have a PastDue marker on the loan. At this point, we can make another payment on the mortgage, to close the missing amount, like so:


This will update the monthly mortgage status and then the overall status. Of course, in a real system (I mentioned this is highly simplified, right?) we’ll need to take into account payments made at one time but applied to different periods (which we can handle with an AppliedTo property), and a lot of the actual core logic wouldn’t be in indexes. Please don’t do mortgage logic in RavenDB indexes; that stuff deserves its own handling, in your own code. And most certainly don’t do that in JavaScript. The idea behind this post is to explore how we can handle non-trivial event projection using RavenDB. The example was chosen because I assume most people will be familiar with it and it wasn’t immediately obvious how to go about actually solving it.

If you want to play with this, you can import the following file (Settings > Import Data) to get the documents and index definitions.

time to read 3 min | 500 words

In the previous post I talked about how to use a map reduce index to aggregate events into a final model. This is an interesting use case of indexing, and it can consolidate a lot of complexity into a single place, at which point you can utilize additional tooling available inside of RavenDB.

As a reminder, you can get the dump of the database that you can import into your own copy of RavenDB (or our live demo instance) if you want to follow along with this post.

Starting from the previous index, all we need to do is edit the index definition and set the Output Collection, like so:


What does this do? This tells RavenDB that in addition to indexing the data, it should also take the output of the index and create new documents from it in the ShoppingCarts collection. Here is what these documents look like:


You can see at the bottom that this document is flagged as artificial and coming from an index. The document id is a hash of the reduce key, so changes to the same cart will always go to this document.

What is important about this feature is that once the result of the index is a document, we can operate on it using all the usual tools for indexes. For example, we might want to create another index on top of the shopping carts, like the following example:
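
A sketch of what such an index might look like (illustrative, not the index from the post; it assumes the cart document has a Products array and a non-empty Paid object, that a map can fan out by returning an array of entries, and that returning null skips a document):

    map('ShoppingCarts', cart => {
        if (!cart.Paid || Object.keys(cart.Paid).length === 0)
            return null;                          // skip carts that were never paid for
        return cart.Products.map(p => ({
            Product: p.Product,
            Quantity: p.Quantity,
            Total: p.Quantity * p.Price
        }));
    });

    groupBy(x => x.Product).aggregate(g => ({
        Product: g.key,
        Quantity: g.values.reduce((sum, v) => sum + v.Quantity, 0),
        Total: g.values.reduce((sum, v) => sum + v.Total, 0)
    }));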

In this case, we are building another aggregation. Taking all the paid shopping carts and computing the total sales per product from these. Note that we are now operating on top of our event streams but are able to extract second level aggregation from the data.

Of course, normal indexes on top of the artificial ShoppingCarts collection allow you to do things like: “Show me my previous orders”. In essence, you are using the events for your writes, defining the aggregation to the final model in an index, and then RavenDB takes care of the read model.

Another option to pay attention to is not doing the read model and the full work on the same database instance as your events. Instead, you can output the documents to a collection and then use RavenDB’s native ETL capabilities to push them to another database (which can be another RavenDB instance or a relational database) for further processing.

The end result is a system that is built on dynamic data flow. Add an event to the system, and the index will go through it, aggregate it with other events on the same root and output it to a document, at which point more indexes will pick it up and do further work, ETL will push it to other databases, subscriptions can start operating on it, etc.

time to read 4 min | 679 words

RavenDB uses X509 certificates for many purposes. One of them is to enable authentication by using client certificates. This creates a highly secure authentication method with quite a lot to recommend it. But it does create a problem. Certificates, by their very nature, expire. Furthermore, certificates usually have relatively short expiration times. For example, Let’s Encrypt certificates expire in 3 months. We don’t have to use the same cert we use for server authentication for client authentication as well, but it does create a nice symmetry and simplifies the job of the admin.

Except that with every cert replacement (3 months, remember?) the admin will now need to go to all of the systems that we talk to and update the list of allowed certificates whenever we update the Let’s Encrypt certificate. One of the reasons behind this 3 months deadline is to ensure that you’ll automate the process of cert replacement, so it is obvious that we need a way to automate the process of updating third parties about cert replacements.

Our current design goes like this:

  • This design applies only to the nodes for which we authenticate using our own server certificate (thus excluding Pull Replication, for example).
  • Keep track of all the 3rd parties RavenDB instances that we talk to.
  • Whenever we have an updated certificate, contact each of those instances and let them know about the cert change. This is done using a request that authenticates with the old certificate and provides the new one.
  • The actual certificate replacement is delayed until all of those endpoints have been reached or until the expiration of the current certificate is near.

Things to consider:

  • Certificate updates are written to the audit log. And you can always track the chain of updates backward.
  • Obviously, a certificate can only register a replacement as long as it is active.
  • The updated certificate will have the exact same permissions as the current certificate.
  • A certificate can only ever replace itself with one other certificate. We allow this to be done multiple times, but the newly updated cert will replace the previously updated cert.
  • A certificate cannot replace a certificate that it updated if that certificate has an updated certificate as well.

In other words, consider certificate A that is registered in a RavenDB instance:

  • Cert A can ask the RavenDB instance to register updated certificate B, at which point users can connect to the RavenDB instance using either A or B, until certificate A expires. This is to ensure that during the update process, we won’t end up with some nodes that we need to talk to using cert A and some nodes that we need to talk to using cert B.
  • Cert A can ask the RavenDB instance to register updated certificate C, at which point, certificate B is removed and is no longer valid. This is done in case we failed to update the certificate and need to update with a different certificate.
  • Cert C can then ask the RavenDB instance to register updated certificate D. At this point, certificate A becomes invalid and can no longer be used. Only certs C and D are now active.
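
To make the rules above a little more concrete, here is a sketch of the bookkeeping they imply (illustrative code only, keyed by certificate thumbprint; this is not RavenDB’s implementation):

    const registered = new Map();   // thumbprint -> { replaces, replacedBy }

    function trustCertificate(thumbprint) {
        registered.set(thumbprint, { replaces: null, replacedBy: null });
    }

    // Called when the certificate identified by currentThumbprint asks to register
    // newThumbprint as its replacement.
    function registerReplacement(currentThumbprint, newThumbprint) {
        const current = registered.get(currentThumbprint);
        if (current === undefined) throw new Error('unknown certificate');

        // "A registers C": a previously registered, unused replacement (B) is dropped.
        if (current.replacedBy !== null) registered.delete(current.replacedBy);

        // "C registers D": once a replacement moves the chain forward, the certificate
        // it replaced (A) is no longer valid.
        if (current.replaces !== null) registered.delete(current.replaces);

        current.replacedBy = newThumbprint;
        registered.set(newThumbprint, { replaces: currentThumbprint, replacedBy: null });
    }

    const isAllowed = thumbprint => registered.has(thumbprint);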

More things to consider:

  • Certain certificates, such as the ones exposing Pull Replication, are likely going to be used by many clients. I’m not sure if we should allow certificate replacement there. Given that we usually won’t use the server cert for authentication in Pull Replication, I don’t see that as a problem.
  • The certificate update process will be running on only a single node in the cluster, to avoid concurrency issues.
  • We’ll provide a way to the admin to purge all expired certificates (although, with one update every 3 months, I don’t expect there to be many).
  • We are considering limiting this to non admin certificates only. So you will not be able to update a certificate if it has admin privileges in an automated manner. I’m not sure if this is a security feature or a feel good feature.
  • We’ll likely provide administrator notification that this update has happened on the destination node, and that might be enough to allow updating of admin certificates.

Any feedback you have would be greatly appreciated.

time to read 5 min | 875 words

In this post, I want to take the notion of doing computation inside RavenDB’s indexes to the next stage. So far, we talked only about indexes that work on a single document at a time, but that is just the tip of the iceberg of what you can do with indexes inside RavenDB. What I want to talk about today is the ability to do computations over multiple documents and aggregate them. The obvious example is in the following RQL query:


That is easy to understand; it is a simple aggregation of data. But it can get a lot more interesting. To start with, you can add your own aggregation logic here, which opens up some interesting ideas. Event Sourcing, for example, is basically a set of events on a subject that are aggregated into the final model. Probably the most classic example of event sourcing is the shopping cart. In such a model, we have the following events:

  • AddItemToCart
  • RemoveItemFromCart
  • PayForCart

Here is what these look like, in document form:
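
They would be along these lines (the collection and field names are illustrative, not the actual documents):

    // AddToCartEvents/1-A
    { "CartId": "carts/82-A", "Product": "products/30-A", "Quantity": 2, "Price": 9.99 }

    // AddToCartEvents/2-A
    { "CartId": "carts/82-A", "Product": "products/13-A", "Quantity": 5, "Price": 3.49 }

    // RemoveFromCartEvents/1-A
    { "CartId": "carts/82-A", "Product": "products/13-A", "Quantity": 3 }

    // PayForCartEvents/1-A
    { "CartId": "carts/82-A", "Amount": 26.96 }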


We add a couple of items to the cart, remove excess quantity and pay for the whole thing. Pretty simple model, right? But how does this relate to indexing in RavenDB?

Well, the problem here is that we don’t have a complete view of the shopping cart. We know what the actions were, but not what its current state is. This is where our index comes into play; let’s see how it works.

The final result of the cart should be something like this:


Let’s see how we get there, shall we?

We’ll start by processing the add to cart events, like so:
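
A sketch of what such a map can look like, using the illustrative collection and field names from above (not the exact code from the post):

    map('AddToCartEvents', e => ({
        CartId: e.CartId,
        Products: [{ Product: e.Product, Quantity: e.Quantity, Price: e.Price }]
    }));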

As you can see, the map phase here builds the relevant parts of the end model directly. But we still need to complete the work by doing the aggregation. This is done in the reduce phase, like so:
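
Again a sketch of what such a reduce can look like, rather than the original:

    groupBy(x => x.CartId).aggregate(g => {
        const products = {};            // internal grouping by product, using an object
        for (const entry of g.values) {
            for (const p of entry.Products) {
                const existing = products[p.Product];
                if (existing === undefined) {
                    products[p.Product] = { Product: p.Product, Quantity: p.Quantity, Price: p.Price };
                } else {
                    existing.Quantity += p.Quantity;
                    // Business rule: the customer pays the minimum price they saw
                    // while building the cart.
                    existing.Price = Math.min(existing.Price, p.Price);
                }
            }
        }
        return { CartId: g.key, Products: Object.values(products) };
    });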

Most of the code here is to deal with merging of products from multiple add actions, but even that should be pretty simple. You can see that there is a business rule here. The customer will be paying the minimum price they encountered throughout the process of building their shopping cart.

Next, let’s handle the removal of items from the cart, which is done in two steps. First, we map the remove events:
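
A sketch of that map (illustrative):

    map('RemoveFromCartEvents', e => ({
        CartId: e.CartId,
        Products: [{ Product: e.Product, Quantity: -e.Quantity, Price: 0 }]
    }));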

There are a few things to note here: the quantity is negative and the price is zeroed, which necessitates changes in the reduce as well. Here they are:

As you can see, we now only get the cheapest price above zero, and we’ll remove empty items from the cart. The final step we have to take is to handle the payment events. We’ll start with the map first, obviously.

Note that we added a new field to the output. Just like we set the Products field in the pay-for-cart map to an empty array, we need to update the rest of the maps to include a Paid: {} to match the structure. This is because all the maps (and the reduce) in an index must output the same shape.

And now we can update the reduce accordingly. Here is the third version:

This is almost there, but we still need to do a bit more work to get the final output right. To make things interesting, I changed things up a bit and here is how we are paying for this cart:


And here is the final version of the reduce:

And the output of this is:


You can see that this is a bit different from what I originally envisioned. This is mostly because I’m bad at JavaScript and likely took many shortcuts along the way to make things easy for myself. Basically, it was easier to do the internal grouping using an object than using arrays.

Some final thoughts:

  • A shopping cart is usually going to be fairly small, with a few dozen events in the common case. This method works great for this, but it will also scale nicely if you need to aggregate over tens of thousands of events.
  • A key concept here is that the reduce portion is called recursively on all the items, incrementally building the data until we can’t reduce it any further. That means that the output we get should also serve as the input to the reduce. This takes some getting used to, but it is a very powerful technique.
  • The output of the index is a complete model, which you can use inside your system. In the next post, I’ll discuss how we can more fully flesh this out.

If you want to play with this, you can get the dump of the database that you can import into your own copy of RavenDB (or our live demo instance).

time to read 5 min | 962 words

I got into an interesting discussion about Event Sourcing in the comments for a post and that was interesting enough to make a post all of its own.

Basically, Harry is suggesting (I’m paraphrasing, and maybe not too accurately) a potential solution: keep the model computed from all the events directly in memory. The idea is that you can pretty easily get machines with enough RAM to store stupendous amounts of data in memory. That will give you all the benefits of being able to hold a rich domain model without any persistence constraints. It is also likely to be faster than any other solution.

And to a point, I agree. It is likely to be faster, but that isn’t enough to make this a good solution for most problems. Let me point out a few cases where this fails to be a good answer.

If the only way you have to build your model is to replay your events, then that is going to be a problem when the server restarts. Assuming a reasonably sized data model of 128GB or so, and assuming that we have enough events to build something like that, let’s say about 0.5 TB of raw events, we are going to be in a world of hurt. Even assuming no I/O bottlenecks, I believe that it would be fair to state that you can process the events at a rate of 50 MB/sec. That gives us just under 3 hours to replay all the events from scratch. You can try to play games here, try to read in parallel, replay events on different streams independently, etc. But it is still going to take time.

And enough time that this isn’t a good technology to have without a good backup strategy, which means that you need to have at least a few of these machines and ensure that you have some failover between them. But even ignoring that, and assuming that you can indeed replay all your state from the events store, you are going to run into other problems with this kind of model.

Put simply, if you have a model that is tens or hundreds of GB in size, there are two options for its internal structure. On the one hand, you may have a model where each item stands on its own, with no relations to other items. Or if there are any relations to other items, they are well scoped to a particular root. Call it the Root Aggregate model, with no references between aggregates. You can make something like that work, because you have a good isolation between the different items in memory, so you can access one of them without impacting another. If you need to modify it, you can lock it for the duration, etc.

However, if your model is interconnected, so you may traverse between one Root Aggregate to another, you are going to be faced with a much harder problem.

In particular, because there are no hard breaks between the items in memory, you cannot safely / easily mutate a single item without worrying about access from another item to it. You could make everything single threaded, but that is a waste of a lot of horsepower, obviously.

Another problem with in memory models is that they don’t do such a good job of allowing you to roll back operations. If you run your code mutating objects and hit an exception, what is the current state of your data?

You can resolve that. For example, you can decide that you have only immutable data in memory and replace it atomically. That… works, but it requires a lot of discipline and makes it complex to program against.

Off the top of my head, you are going to be facing problems around atomicity, consistency and isolation of operations. We aren’t worried about durability because this is purely in memory solution, but if we were to add that, we would have ACID, and that does ring a bell.

The in memory solution sounds good, and it is usually very easy to start with, but it suffers from major issues when used in practice. To start with, how do you look at the data in production? That is something that you do surprisingly often, to figure out what is going on “behind the scenes”. So you need some way to peek into what is going on. If your data is in memory only, and you haven’t thought about how to explore it from the outside, your only option is to attach a debugger, which is… unfortunate. Given the reluctance to restart the server (startup time is high), you’ll usually find that you have to provide some scripting that you can run in process to make changes, inspect things, etc.

Versioning is also a major player here. Sooner or later you’ll probably put the data inside a memory mapped file to allow for (much) faster restarts, but then you have to worry about the structure of the data and how it is modified over time.

None of the issues I have raised is super hard to figure out or fix, but in conjunction? They turn out to be a pretty big set of additional tasks that you have to do just to be in the same place you were before you started to put everything in memory to make things easier.

In some cases, this is perfectly acceptable. For high frequency trading, for example, you would have an in memory model to make decisions on as fast as possible as well as a persistent model to query on the side. But for most cases, that is usually out of scope. It is interesting to write such a system, though.

time to read 6 min | 1043 words

I had some really interesting discussions while I was at CodeMash, and a few of them touched on modeling concerns with non-trivial architectures. In particular, I was asked about my opinion on the role of OR/M in systems that mostly do CQRS, event processing, etc.

This is a deep question, because on first glance, your requirements from the database are pretty much just:

INSERT INTO Events(EventId, AggregateId, Time, EventJson) VALUES (…)

There isn’t really the need to do anything more interesting than that. The other side of that is a set of processes that operate on top of these event streams and produce read models that are very simple to consume as well. There isn’t any complexity in the data architecture at all, and joy to the world, etc., etc.

This is true, to an extent. But this is only because you have moved a critical component of your system elsewhere: the beating heart of your business. The logic, the rules, the things that make a system more than just a dumb repository of strings and numbers.

But first, let me make sure that we are on roughly the same page. In such a system, we have:

  • Commands – that cannot return a value (but will synchronously fail if invalid). These mutate the state of the system in some manner.
  • Events – represent something that has (already) happened. Cannot be rejected by the system, even if they represent invalid state. The state of the system can be completely rebuilt from replaying these events.
  • Queries – that cannot mutate the state

I’m mixing here two separate architectures, Command Query Responsibility Segregation and Event Sourcing. They aren’t the same, but they often go hand in hand, and it makes sense to talk about them together.

And because it is always easier for me to talk in concrete, rather than abstract, terms, I want to discuss a system I worked on over a decade ago. That system was basically a clinic management system, and the part that I want to talk about today was the staff scheduling option.

Scheduling shifts is a huge deal, even before we get to the part where it directly impacts how much money you get at the end of the month. There are a lot of rules, regulations, union contracts, agreements and a bunch of other stuff that relates to it. So this is a pretty complex area, and when you approach it, you need to do so with the due consideration that it deserves. When we want to apply CQRS/ES to it, we can consider the following factors:

The aggregates that we have are:

  • The open schedule for two months from now. This is mutable, being worked on by the head nurse, and constantly changes.
  • The proposed schedule for next month. This one is closed and changes only rarely, usually because of big stuff (someone being fired, etc).
  • The planned schedule for the current month, frozen, cannot be changed.
  • The actual schedule for the current month. This is changed if someone doesn’t show to their shift, is sick, etc.

You can think of the first three as various stages of a PlannedSchedule, but the ActualSchedule is something different entirely. There are rules around how much divergence you can have between the planned and actual schedules, which impact compensation for the people involved, for example.

Speaking of which, we haven’t yet talked about:

  • Nurses / doctors / staff – which are being assigned to shifts.
  • Clinics – a nurse may work in several different locations at different times.

There is a lot of other stuff that I’m ignoring here, because it would complicate the picture even further, but that is enough for now. For example, regardless of the shifts that a person was assigned to and showed up for, they may have worked more hours (had to come to a meeting, drove to a client) and that complicates payroll, but that doesn’t matter for the scheduling.

I want to focus on two actions in this domain. First, the act of the head nurse scheduling a staff member to a particular shift. And second, the ClockedOut event which happens when a staff member completes a shift.

The ScheduleAt command places a nurse at a given shift in the schedule, which seems fairly simple on its face. However, the act of processing the command is actually really complex. Here are some of the things that you have to do:

  • Ensure that this nurse isn’t scheduled for another shift, either concurrently or too close to another shift at a different address.
  • Ensure that the nurse doesn’t work with X (because issues).
  • Ensure that the role the nurse has matches the required parameters for the schedule.
  • Ensure that the number of double shifts in a time period is limited.

The last one, in particular, is a sinkhole of time. Because at the same time, another business rule says that we must give each nurse N number of shifts in a time period, and yet another dictates how to deal with competing preferences, etc.

So at this point, we have: ScheduleAtCommand.Execute(), and we need to apply logic: complex, changing, business-critical logic.

And at this point, for that particular part of the system, I want to have a full domain, abstracted persistence and be able to just put my head down and focus on solving the business problem.

The same applies for the ClockedOut event. Part of processing it means that we have to look at the nurse’s employment contract, count the amount of overtime worked, compute total number of hours worked in a pay period, etc. Apply rules from the clinic to the time worked, apply clauses from the employment contract to the work, etc. Again, this gets very complex very fast. For example, if you have a shift from 10PM – 6 AM, how do you compute overtime? For that matter, if this is on the last day of the month, when do you compute overtime? And what pay period do you apply it to?

Here, too, I want to have a fully fleshed out model, which can operate in the problem space freely.

In other words, a CQRS/ES architecture is going to have the domain model (and some sort of OR/M) in the middle, doing the most interesting things and tackling the heart of complexity.

