Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by email or phone:

ayende@ayende.com

+972 52-548-6969


RavenDB 4.0: Managing encrypted databases

time to read 6 min | 1012 words

On the right you can see what the new database creation dialog looks like when you want to create a new encrypted database. I talked about the actual implementation of full database encryption before, but today's post has a different focus.

I want to talk about managing encrypted databases. As an admin tasked with working with encrypted data, I need to not only manage the data in the database itself, but I also need to handle a lot more failure points when using encryption. The most obvious of them is that if you have an encrypted database in the first place, then the data you are protecting is very likely to be sensitive in nature.

That raises the immediate question of who can see that information. For that matter, are you allowed to see that information? RavenDB 4.0 has support for time limited credentials: you register to get credentials in the system, and using whatever workflow you have, the system generates a temporary API key for you that will be valid for a short time and then expire.
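To make that concrete, here is a minimal sketch of what a time limited credential boils down to. This is purely illustrative and not RavenDB's actual API key mechanism; IssueTemporaryApiKey and ValidateApiKey are hypothetical helpers:

    using System;
    using System.Security.Cryptography;
    using System.Text;

    static class ApiKeys
    {
        public static string IssueTemporaryApiKey(byte[] serverSecret, string user, TimeSpan validFor)
        {
            // The credential is just "user:expiry" plus a signature the server can verify later.
            var expires = DateTimeOffset.UtcNow.Add(validFor).ToUnixTimeSeconds();
            var payload = $"{user}:{expires}";
            using (var hmac = new HMACSHA256(serverSecret))
                return payload + ":" + Convert.ToBase64String(hmac.ComputeHash(Encoding.UTF8.GetBytes(payload)));
        }

        public static bool ValidateApiKey(byte[] serverSecret, string apiKey)
        {
            var parts = apiKey.Split(':');
            if (parts.Length != 3)
                return false;
            string expectedSig;
            using (var hmac = new HMACSHA256(serverSecret))
                expectedSig = Convert.ToBase64String(hmac.ComputeHash(Encoding.UTF8.GetBytes(parts[0] + ":" + parts[1])));
            // Valid only if the signature matches and the expiry has not passed yet.
            return expectedSig == parts[2] &&
                   long.Parse(parts[1]) > DateTimeOffset.UtcNow.ToUnixTimeSeconds();
        }
    }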

What about all the other things that an admin needs to do? The most obvious example is how do you handle backups, either routine or emergency ones. It is pretty obvious that if the database is encrypted, we also want the backups to be encrypted, but are they going to use the same key? How do you restore? What about moving the database from one machine to the other?

In the end, it all hangs on the notion of keys. When you create a new encrypted database, we’ll generate a key for you, and require that you confirm for us that you have persisted that information in some manner. You can print it, download it, etc. And you can see the key right there in plain text during the db creation. However, this is the last time that the database key will ever reside in plain text.
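For illustration, generating such a 256-bit key and showing it to the admin in a printable form is roughly this (a sketch, not the actual dialog code):

    var key = new byte[32];                          // 256 bits
    using (var rng = RandomNumberGenerator.Create())
        rng.GetBytes(key);

    // Shown once, in plain text, during database creation; print it, download it, etc.
    Console.WriteLine(Convert.ToBase64String(key));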

So what about this QR code, what is it doing there? Put simply, it is there to capture attention. It replicates the same information that you have in the key itself, obviously. But what for?

The idea is that users are often hurrying through the process (the “Yes, dear!” mode) and we want to encourage them to stop for a second and think. The use of the QR code also makes it much more likely that the admin will print and save the key in an offline manner, which is likely to be safer than most methods.

So this is how we encourage administrators to safely remember the encryption key. This is useful because it gives the admin the ability to take a snapshot on one machine and then recover it on another, where the encryption key is not available, or just move the hard disk between machines if the old one failed. It is actually quite common in cloud scenarios to have a machine that has attached cloud storage, and if the machine fails, you just spin up a new machine and attach the storage to the new one.

We keep the encryption keys secret by utilizing system specific keys (either DPAPI or a decryption key that only the specific user can access), so moving machines like that will require the admin to provide the encryption key so we can continue working.

The issue of backups is different. It is very common to have to store long term backups, and to need to restore them in a separate location / situation. At that point, we need the backup to be encrypted, but we don't want it to use the same encryption key as the database itself. This is mostly about managing keys. If I'm managing multiple databases, I don't want to record the encryption key for each as part of the restore process. That opens us up to a missing key and a useless backup that we can do nothing about.

Instead, when you set up backups (for encrypted databases it is mandatory, for regular databases, optional) we'll give you the option to provide a public key that we'll then use to encrypt the backup. That means that you can more safely store it in cloud scenarios, and regardless of how many databases you have, as long as you have the private key, you'll be able to restore the backup.
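The general technique here (a sketch of the idea, not RavenDB's actual backup format) is hybrid encryption: encrypt the backup with a random symmetric key, then encrypt that key with the provided public key, so only the holder of the private key can restore:

    // Requires System.IO and System.Security.Cryptography.
    static class BackupEncryption
    {
        public static void EncryptBackup(string backupPath, string outputPath, RSA adminPublicKey)
        {
            using (var aes = Aes.Create())                // fresh random key + IV for every backup
            using (var output = File.Create(outputPath))
            using (var writer = new BinaryWriter(output))
            {
                // Only the matching private key can recover the symmetric key.
                var encryptedKey = adminPublicKey.Encrypt(aes.Key, RSAEncryptionPadding.OaepSHA256);
                writer.Write(encryptedKey.Length);
                writer.Write(encryptedKey);
                writer.Write(aes.IV.Length);
                writer.Write(aes.IV);

                using (var encryptor = aes.CreateEncryptor())
                using (var cryptoStream = new CryptoStream(output, encryptor, CryptoStreamMode.Write))
                using (var input = File.OpenRead(backupPath))
                    input.CopyTo(cryptoStream);
            }
        }
    }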

Finally, we have one last topic to cover with regards to encryption: the overall system configuration. Each database can be encrypted, sure, but the management of the database (such as connection strings that it replicates to, API keys that it uses to store backups and a lot of other potentially sensitive information) is still stored in plain text. For that matter, even if the database shouldn't be encrypted, you might still want to encrypt the full system configuration. That leads to somewhat of a chicken and egg problem.

On the one hand, we can't encrypt the server configuration from the get go, because then the admin will not know what the key is, and they might need it if they have to move machines, etc. But once the server has started, it is already using that configuration, so we can't just encrypt it on the fly. What we ended up using is a set of command line parameters, so if the admin wants to run an encrypted server configuration, they can stop the server, run a command to encrypt the server configuration and set up the appropriate encryption key retrieval process (DPAPI, for example, under the right user).

That gives us the chance to make the user aware of the key and allow them to save it in a secure location. If one member of a cluster has an encrypted server configuration, all the other members must have one as well, which prevents accidental leaks.

I think that this is about it with regards to the operational details of managing encryption. I'm pretty sure that I missed something, but this post is getting long as it is.

RavenDB 4.0: Attachments

time to read 2 min | 251 words


I previously wrote about the new attachments feature in RavenDB 4.0. Now it is ready to be seen by outside eyes.

As you can see in the image on the right, documents can now have attached attachments (I'm sorry, I couldn't think of a better way to phrase this). This gives you the ability to store binary data inside RavenDB, but not as some free floating value that has only a very loose connection to the rest of the system. Those attachments are strongly tied to their parent document, and allow you to store related information easily and right next to the document.


That also means that you can take advantage of RavenDB’s ACID nature and actually make modifications to both attachments and documents at the same time. For example, consider the following code:
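The original snippet isn't reproduced here, but a sketch along these lines shows the idea (store is an already opened DocumentStore, User with HasProfilePicture is an illustrative entity, and GenerateThumbnail is a hypothetical helper):

    using (var session = store.OpenSession())
    {
        var user = session.Load<User>("users/1-A");

        var picture = File.ReadAllBytes("profile.png");
        var thumbnail = GenerateThumbnail(picture);      // hypothetical helper

        session.Advanced.Attachments.Store(user, "profile.png",
            new MemoryStream(picture), "image/png");
        session.Advanced.Attachments.Store(user, "profile-thumbnail.png",
            new MemoryStream(thumbnail), "image/png");

        user.HasProfilePicture = true;                   // document change in the same session

        session.SaveChanges();                           // both attachments and the document
                                                         // are committed as a single transaction
    }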

Here we get the user's profile picture, generate a thumbnail from it, and then associate both the picture and the thumbnail with the document. We also update the status of the user to indicate that they have a profile picture, and then submit it all as one single transaction. That means that you don't have to sync between different sources.

Attachments in RavenDB are also kept consistent through replication, so you won't see partial results between nodes, and the attachments themselves use de-duplication techniques to reduce the amount of storage that we take.

I’m really happy with this feature.

Storing secrets in Linux

time to read 3 min | 545 words

We need to store an encryption key on Linux & Windows. On Windows, the decision is pretty much trivial: you throw that into DPAPI and rely on the operating system to handle it for us. In particular, it is very easy to analyze key scenarios such as “someone stole the hard disk” and say that either the thief wouldn't be able to get the plain text key, or we can blame Microsoft for it.

On Linux, the situation seems to be much more chaotic. There is libsecret, which seems to be much wider in scope than DPAPI. Whereas DPAPI has two methods (protect & unprotect), libsecret has a lot of moving pieces, which is quite scary. That is leaving aside the issue of having no managed implementation and having to dance around Gnome specific data types in the API (you need to pass GCancellable & GError into it), which increases the complexity.
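For reference, those two DPAPI operations are exposed in .NET as ProtectedData.Protect and ProtectedData.Unprotect (on .NET Core they live in a separate NuGet package); a minimal sketch:

    byte[] secret = Encoding.UTF8.GetBytes("the database encryption key");

    // Protect: only this user (or this machine, depending on the scope) can unprotect it.
    byte[] protectedBytes = ProtectedData.Protect(secret, null, DataProtectionScope.CurrentUser);

    // Unprotect: fails when called by a different user or on a different machine.
    byte[] recovered = ProtectedData.Unprotect(protectedBytes, null, DataProtectionScope.CurrentUser);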

Other options include using some sort of hardware / software security modules (such as HashiCorp Vault), which is great in theory, but requires us to either take a dependency on something that might not be there, or try to support a wide variety of options (Keywhiz, Chef, Puppet, CloudHSM, etc). That isn’t a really good alternative from our point of view.

Looking into how Mono implemented DPAPI on Linux, they did it by writing a master key to an XML file and relying on file system ACLs to prevent anyone else from seeing that information. It ends up boiling down to this:

chmod(path, (S_IRUSR | S_IWUSR | S_IXUSR)); /* owner-only read/write/execute */

Which has the benefit of only allowing that user to access it, but if I have gotten hold of the physical disk, I'm able to easily mount it on a machine that I control as root and access anything that I like. On Windows, by the way, the way this is prevented is that the user must have logged in, and a key derived from their password is used to decrypt all protected data as needed, so without the user logging in, you cannot decrypt anything. For that matter, even the administrator on the machine can't recover the data if they want to, because resetting the user's password will cause all such information to be lost.

There is the Gnome.Keyring project as well, which hasn't been updated in 7 years, and obviously doesn't support kwallet (which libsecret does). OWASP seems to be throwing in the towel there and just recommends relying on the file system ACL.

The Linux Kernel has a Key Retention API, but it seems to be targeted primarily toward giving file systems access to the secrets they need, and it looks like it isn’t going to survive reboots (it is primarily a caching mechanism, it looks like?).

So after all this research, I can say that I don't like libsecret; it seems too cumbersome and will need users to jump through some hoops in some cases (install additional packages, set up access, etc.).

Setting up the permissions via the ACL seems to be the common way to handle this scenario, but it doesn’t sit well with me.

Any recommendations?

Tripping on the TPL

time to read 2 min | 364 words

I thought that I had found a bug in the TPL, but it looks like it's working (more or less) by design. Basically, when a task is completed, all awaiting tasks will be notified of that, which is pretty much what you would expect. What isn't usually expected is that those tasks can interfere with one another. Consider the following code:

We have two tasks, which accept a parent task and do something with it. What do you think will happen when we run the following code?
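The original snippets were images, so here is a minimal sketch (not the exact code from the post) of the same shape: two children awaiting the same parent, where one blocks and the other waits with a timeout:

    using System;
    using System.Threading;
    using System.Threading.Tasks;

    class TplRepro
    {
        static async Task SlowChild(Task parent)
        {
            await parent;              // await continuations run synchronously by default,
            Thread.Sleep(5_000);       // so this blocks the thread that completed 'parent'
        }

        static async Task TimedChild(Task parent)
        {
            var winner = await Task.WhenAny(parent, Task.Delay(1_000));
            Console.WriteLine(winner == parent ? "parent completed" : "timeout");
        }

        static async Task Main()
        {
            var parentSource = new TaskCompletionSource<object>();

            var slow = SlowChild(parentSource.Task);   // registers its continuation first
            await Task.Delay(50);
            var timed = TimedChild(parentSource.Task);

            parentSource.SetResult(null);              // the parent completes almost immediately,
                                                       // yet "timeout" is printed, because the slow
                                                       // child's continuation runs inline and delays
                                                       // notifying the timed child

            await Task.WhenAll(slow, timed);
        }
    }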

Unless you are very quick on the draw, running this code will result in a timeout message. But how? We know that the task has a much shorter duration than the timeout, so what is going on?

Well, effectively what is going on is that the parent task has a list of children that it will notify, and by default, it will do so synchronously and sequentially. If a child task blocks for whatever reason (for example, it might be processing a lot of work), the other children of the parent task will not be notified.

If there is a timeout set up, it will be triggered, even though the parent task was already completed. It took us a lot of time to figure out the repro for this issue, and we were certain that this was some sort of race condition in the TPL. I had a blog post talking all about it, but the Microsoft team was fast enough that they were able to literally answer my issue before I had the time to complete my blog post. That is really impressive.

I should note that the suggestion, using RunContinuationsAsynchronously, works quite well when creating a new Task or using TaskCompletionSource, but there is no way to specify it when you are using Task.Run. What is worse for us is that since this is not the default (for perfectly good performance reasons, by the way), any code that we call into might trigger this. I would have much rather been able to specify this when waiting on the task, rather than when creating it.
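For the cases where you do control the task's creation, opting in looks roughly like this:

    // Continuations will be queued to the scheduler instead of running inline
    // on the thread that completes the task:
    var source = new TaskCompletionSource<object>(
        TaskCreationOptions.RunContinuationsAsynchronously);

    // new Task(...) accepts the same option; Task.Run has no overload that does.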

Distributed task assignment with failover

time to read 5 min | 818 words

When building a distributed system, one of the more interesting aspects is how you are going to distribute task assignment. In other words, given that you have multiple nodes, how do you decide which node will do what? In some cases, that is relatively easy, you can say “all nodes will process read requests”, but in others, this is more complex. Let us take the case where you have several nodes, and you need to have a regular backup of a database that is replicated between all those nodes. You probably don't want to run the backup across all the nodes; after all, they are pretty much the same and you don't want to back up the exact same thing multiple times. On the other hand, you probably don't want to assign this work statically; if you do, and the node that is responsible for the backup is down, you have no backup.

Another example of the problem can be seen when you have other processes that you would like to be sticky if possible, and only jump around if there is a failure. Bringing a new node online is a common thing to do in a cluster, and the ideal scenario in that case is that a single node will feed it all the data that it needs. If we have multiple nodes doing that, they are likely to overlap and they might very well overload the poor new server. So we want just one node to update its state, but if that node goes down midway, we need someone else to pick up the slack. For RavenDB, those sorts of tasks include things like ETL processes, subscriptions, backups, bootstrapping new servers and more. We keep discovering new things that can use this sort of behavior.

But how do we actually make this work?

One way of doing this is to take advantage of the fact that RavenDB is using Raft and have the leader do task assignment. That works quite well, but it doesn't scale. What do I mean by that? I mean that as the number of tasks that we need to manage grows, the complexity in the task assignment portion of the code grows as well. Ideally, I don't want to have to consider twenty different variables and considerations before deciding what operation should go on which server, and trying to balance that sort of work in one place has proven to be challenging.

Instead, we have decided to allocate work in the cluster based on simple rules. Each task in the cluster has a key (which is typically generated by hashing some of its parameters), and each task can be assigned to a group of servers. Given those two details, we can use Jump Consistent Hashing to spread the load around. However, that doesn't handle failover. We have a heartbeat process that can detect and notify nodes that a sibling has gone down, so combining those two, we get the following code:
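A sketch of roughly what that looks like (illustrative, not RavenDB's actual code), using the standard Jump Consistent Hash function plus a re-hash loop for downed nodes:

    static class TaskAssignment
    {
        static int JumpConsistentHash(ulong key, int numBuckets)
        {
            long b = -1, j = 0;
            while (j < numBuckets)
            {
                b = j;
                key = unchecked(key * 2862933555777941757UL + 1);
                j = (long)((b + 1) * ((1L << 31) / (double)((key >> 33) + 1)));
            }
            return (int)b;
        }

        public static string WhoseTaskIsIt(ulong taskKey, IReadOnlyList<string> topology, ISet<string> liveNodes)
        {
            var candidates = new List<string>(topology);
            var key = taskKey;
            while (candidates.Count > 0)
            {
                var owner = candidates[JumpConsistentHash(key, candidates.Count)];
                if (liveNodes.Contains(owner))
                    return owner;                                  // preferred owner is alive

                candidates.Remove(owner);                          // drop the downed node from the topology
                key = unchecked(key * 2862933555777941757UL + 1);  // re-hash so orphaned tasks spread out
            }
            return null;                                           // no live node available
        }
    }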

What we are doing here is relying on two different properties: Jump Consistent Hashing to let us know which node is responsible for what, and the Raft cluster leader that keeps track of all the live nodes and lets us know when a node goes down. When we need to assign a task, we use its hashed key to find its preferred owner, and if it is alive, that is that. But if it is currently down, we do two things: we remove the downed node from the topology and re-hash the key with the new number of nodes in the cluster. That gives us a new preferred node, and so on until we find a live one.

The reason we rehash on failover is that Jump Consistent Hashing is usually going to point to the same position in the topology (that is why we chose it in the first place, after all), so we rehash to get a different position so it won't all fall onto the next node in the list. All downed node tasks are fairly distributed among the remaining live cluster members.

The nice thing about this is that aside from keeping the live/down list up to date, the Raft cluster doesn't really need to do anything. This is a consistent algorithm, so different nodes operating on the same data will arrive at the same result; a node going down will result in one node picking up the job of bringing the new server up to spec and another starting the backup process. And all of that logic is right where we want it, right next to where the task logic itself is written.

This allows us to reason much more effectively about the behavior of each independent task, and also allows each node to let you know where each task is executing.

I was wrong, reflecting on the .NET design choices

time to read 3 min | 480 words

I have been re-thinking some of my previous positions with regards to development, and it appears that I have been quite wrong in the past.

In particular, I’m talking about things like:

Note that those posts are parts of a much larger discussion, and both are close to a decade old. They aren’t really relevant anymore, I think, but it still bugs me, and I wanted to outline my current thinking on the matter.

C# is non virtual by default, while Java is virtual by default. That seems like a minor distinction, but it has huge implications. It means that proxying / mocking / runtime subclassing is a lot easier with Java than with C#. In fact, a lot of frameworks that were ported from Java rely on this heavily, and that made it much harder to use them in C#. The most common one is NHibernate, and this was one of the chief frustrations that I kept running into.

However, given that I'm working on a database engine now, not on business software, I can see a whole different world of constraints. In particular, a virtual method call is significantly more expensive than a direct call, and that adds up quite quickly. One of the things that we routinely do is try to de-virtualize method calls using various tricks, and we are eagerly waiting for .NET Core 2.0 with the de-virtualization support in the JIT (we have already started writing code to take advantage of it).
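One common de-virtualization trick (not necessarily the exact ones we use) is to push the polymorphism into a generic struct constraint, so the JIT specializes the method per value type and the interface call becomes a direct, inlinable call:

    public interface IWriteStrategy
    {
        void Write(byte[] buffer);
    }

    public struct DirectWrite : IWriteStrategy
    {
        public void Write(byte[] buffer) { /* write without any virtual dispatch */ }
    }

    public static class Writer
    {
        // Because TStrategy is constrained to a struct, the JIT compiles a dedicated body for
        // DirectWrite, so strategy.Write() is a direct call rather than an interface dispatch.
        public static void WriteAll<TStrategy>(byte[][] buffers, TStrategy strategy)
            where TStrategy : struct, IWriteStrategy
        {
            foreach (var buffer in buffers)
                strategy.Write(buffer);
        }
    }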

Another issue is that my approach to software design has significantly changed. Where I would previously do a lot of inheritance and explicit design patterns, I'm now far more motivated toward using composition instead. I'm also marking very clear boundaries between My Code and Client Code. In My Code, I don't try to maintain encapsulation or hide state, whereas with stuff that is expected to be used externally, that is very much the case. But that gives a very different feel to the API and usage patterns that we handle.

This also relates to abstract classes vs interfaces, and why you should care. As a consumer, unless you are busy doing some mocking or some such, you likely don't, but as a library author, that matters a lot to the amount of flexibility you get.

I think that a lot of this has to do with my viewpoint, not just as an Open Source author, but as someone who runs a project where customers are using us for years on end, and they really don't want us to make any changes that would impact their code. That leads to a lot more emphasis on backward compatibility (source, binary & behavior), and if you mess it up, you get ricochets from people who pay you money because their job is harder.

Emoji Encoding: A new style for binary encoding for the web

time to read 4 min | 604 words

Computers think in binary, and you would have thought that sending binary data around would be pretty easy. But that turns out to be a completely non trivial task. The problem is those pesky humans and needing to interface with them.

For example, if I need to send some binary data over email, I can either do that as an attachment, with a high probability of at least a few people never getting it, or I can encode it somehow. Typical choices are Base64 encoding for the low tech and barcodes / QR codes and the like. For the fancy among us, we can try to go with Base85 and other such things. That is pretty standard, but it really has a lot of limitations. Base64 will increase the size of the data by about a third, and it is case sensitive, so it is hard to get right if you need to actually look at it and not just copy/paste it. It is also limited to plain old ASCII, for compatibility reasons that don't make a lot of sense in today's world.

I have been thinking about this for a long time, because we need to send binary data (license information) in text, and we also need it to look good and be formatted well.

After a lot of thought and experimentation, I’m proud to announce a new form of encoding: the Emoji Encoder, available currently for .NET, but soon to be available for Ruby, Python, Go, Node.JS, Ember.js, React.JS and maybe jQuery.

The idea for this innovation came to me because of the following observations:

  • Emojis are becoming much more important in any textual conversation (to the point where people will say an emoji). That means that we can rely on them for the long term, which is very important for storage technology.
  • Trying to read meaning from emojis being sent is clearly impossible, as anyone taking a peek at a text conversation between two teenage girls can say. (Although they appear to have a hidden meaning, if she sent the red heel and not the blue heel emoji that apparently means something.)
  • Because emojis are so relevant, they can be sent anywhere a normal text would go, including email, social media, printing, etc.
  • There are a lot of emojis, allowing us to overcome the bloat of Base64 and its friends by dedicating a single emoji to each byte in a 1:1 mapping (see the sketch below).
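A minimal sketch of that mapping (the emoji table here is a contiguous pictograph block chosen for illustration, not the encoder's actual table):

    static class EmojiEncoder
    {
        static readonly string[] EmojiTable =
            Enumerable.Range(0, 256).Select(i => char.ConvertFromUtf32(0x1F300 + i)).ToArray();

        public static string Encode(byte[] data)
        {
            var sb = new StringBuilder(data.Length * 2);
            foreach (var b in data)
                sb.Append(EmojiTable[b]);                    // one emoji per byte, 1:1
            return sb.ToString();
        }

        public static byte[] Decode(string text)
        {
            var reverse = new Dictionary<string, byte>();
            for (var i = 0; i < 256; i++)
                reverse[EmojiTable[i]] = (byte)i;

            var bytes = new List<byte>();
            var elements = StringInfo.GetTextElementEnumerator(text);
            while (elements.MoveNext())                      // walk surrogate pairs as single symbols
                bytes.Add(reverse[(string)elements.Current]);
            return bytes.ToArray();
        }
    }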

That means that in terms of characters, Emoji Encoding is a net win. Consider the following equivalent information:

  • I5xy4dT9Qyjp7DKwuVI6y95EwlDeO/NBeiuc3GJ5Mjo= <—45 characters
  • ℹ⤴⚫✔⭕㊗◀☔➖✂♥⛵✖♍❤⛵✅✏ℹ⛲✂ <—33 characters

That is quite important when dealing with constrained textual formats, such as twitter, where the above will be rendered as:

There are other advantages. This data is actually a 256 bit key for use in encryption. And you can actually show it to a user and have a reasonably good chance that they will be able to tell it apart from something else. It relies on the ability of humans to recognize shapes, but it will be very hard for them to actually tell someone your key. There has been a lot of research around such things, and while it isn't a primary motivation for us, it is a very nice perk.

I mentioned that a key interest for us is the usage in licensing code. Here is an example of how a license email will now look:

I think that in addition to being pretty, it is also going to bring a smile to people's faces, so the Emoji Encoder is a win all around.

Low level Voron optimizations: Primitives & abstraction levels

time to read 5 min | 858 words

One of the things that I noticed with the recent spate of work we have been doing is that we are doing things that we have already tried, and failed at. But suddenly we are far more successful. What is the difference?

Case in point: transaction merging and early lock release. Those links both go to our initial implementations, written in 2013. That is four years ago. Yes, today I can tell you that transaction merging was able to give us a two orders of magnitude improvement, and early lock release gave us a 45% boost in performance. But looking at the timeline, we rolled back early lock release in early 2014.

The complexity of the feature is certainly non trivial, but the major point that led to its removal in 2014 was that it wasn't worth it. That is, it didn't pay enough to be worth the complexity it brought. When we sat down to design Voron for RavenDB 4.0, one of the first things we sought to eliminate was transaction merging. We wanted Voron to be single threaded, by design.

And I still very much stand by those decisions. So how can I reconcile both statements? The core difference between them is where those are located, and what this means.

Transaction merging now is not done by Voron, instead, this is something that RavenDB does on top of Voron. But why?

When we had transaction merging in Voron, it meant that we had to submit transactional work to Voron in a format that it could understand. And Voron is a very low level library, so it doesn't really understand much. This gave us a very small “vocabulary” to work with. More than that, it also meant that we had to deal with such features as explicit concurrency (at the Voron level, on top of the concurrency primitives exposed by RavenDB). Let us take the simplest example: we have two threads that want to write to the same document.

That means that we have to build the buffer we want to write in memory, then submit it to Voron with the right concurrency setting at the Voron level. This is after we have already checked the concurrency semantics at the RavenDB level, and with double the cost to ensure that concurrency conflicts at all levels are properly handled. From the point of view of Voron, that meant merged transactions failing much more often (which kills performance) and much higher complexity overall when using it. Alongside that, we also had much higher memory usage, because we had to allocate buffers to hold the data we needed to write and then submit them to Voron, so the rate of allocations was much higher.

We still saw a performance improvement over not using it, but nothing as major as the two orders of magnitude that we see today. Another aspect of this is that when we built Voron, we built it to fit our existing architecture (which was built on top of Esent), so it reflects a lot of design decisions coming from there.

With RavenDB 4.0, we took a few giant steps back and decided to design the whole system as a single integrated piece. In fact, that meant that any attempt to do concurrency at the Voron level was abandoned. That meant that the moment you had a write transaction, you were safe from concurrency; you didn't have to worry about anyone modifying the data you were looking at. There was no need to allocate special buffers and hold them, because we are always writing directly to Voron, instead of buffering in memory.

This was a dramatic simplification of the API and its usage, and it meant that the code is much more approachable and easy to understand, work with and make performant. Of course, it also meant that we had a serial lock, which is where the transaction merger became such a huge deal. But the point here is that this kind of transaction merging isn't done at the Voron level, but at the RavenDB level, and instead of submitting primitive operations we can submit full fledged work items, including logic. So writing a document is now done by the request thread parsing the document, preparing a MergedPutCommand class and submitting it to the transaction merger.
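A rough sketch of the shape of that merger (illustrative types; WriteTransaction, StorageEnvironment and the batching limit here are assumptions, not RavenDB's actual classes):

    public abstract class MergedCommand
    {
        public readonly TaskCompletionSource<object> Completion =
            new TaskCompletionSource<object>(TaskCreationOptions.RunContinuationsAsynchronously);

        public abstract void Execute(WriteTransaction tx);   // full fledged logic, not primitives
    }

    public class TransactionMerger
    {
        private readonly BlockingCollection<MergedCommand> _queue = new BlockingCollection<MergedCommand>();
        private readonly StorageEnvironment _env;

        public TransactionMerger(StorageEnvironment env) { _env = env; }

        public Task Enqueue(MergedCommand cmd)                // called from request threads
        {
            _queue.Add(cmd);
            return cmd.Completion.Task;
        }

        public void Run()                                     // the single writer thread
        {
            while (true)
            {
                var batch = new List<MergedCommand> { _queue.Take() };
                while (batch.Count < 1024 && _queue.TryTake(out var more))
                    batch.Add(more);

                using (var tx = _env.BeginWriteTransaction())
                {
                    foreach (var cmd in batch)
                        cmd.Execute(tx);                      // e.g. a MergedPutCommand writing a document
                    tx.Commit();                              // one commit for the whole batch
                }

                foreach (var cmd in batch)
                    cmd.Completion.SetResult(null);
            }
        }
    }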

The transaction merger will then execute the command under a write transaction, and it will directly manipulate Voron. This means that we get both high concurrency and safety from concurrency issues at the same time. Early lock release plays into that as well; we had to modify Voron to allow it, but what we did was build low level primitives that can be used by higher levels, without making assumptions about their usage.

On the Voron side of things, we just have the notion of an async commit (with a list of requirements that happens to exactly fit what is going on in the transaction merging portion of RavenDB), and the actual transaction lock handoff / early lock release is handled at a higher layer, with a lot more information about the system.

Why you should avoid graceful error handling like the plague that it is

time to read 3 min | 536 words

A while ago I was reviewing a pull request by a team member and I realized that I’m looking at an attempt to negotiate graceful termination of a connection between two nodes. In particular, the code in question was invoked when one node was shutting down or had to tear down the connection for whatever reason.

That code was thrown out, and it made a very graceful arc all the way to the recycle bin.

But why? The underlying reason for this was to avoid needless error messages in the logs, which can trigger support calls and cost time & effort to figure out what is going on. That is an admirable goal, but at the same time, it is a false hope and a dangerous one at that.

Let us consider what it means that a node is shutting down. It means that it now needs to notify all its peers about this. It is no longer enough to just tear down all connections; it needs to talk to them, and that means that we have introduced network delays into the shutdown procedure. It also means that we now have to deal with error handling when we are trying to notify a peer that this node is shutting down, and that way lies madness.

On the other hand, we have the other node, which also needs to handle its peer getting up in the middle of the conversation and saying “I'm going away now” mid sentence. For that matter, since the shutdown signal (which is the common case for this to be triggered) can happen at any time, we now need to have thread safety on shutdown so we can send a legible message to the other side, and the other side must be ready to accept the shutdown message at any time. (A “Do you have any new documents for me?” request that expects a “There are N messages for you” reply now also needs to handle a “G'dbye world” notification.)

Doing this properly complicates the code at every level, and you still need to handle the rude shutdown scenario.

Furthermore, what is the other side supposed to do with the information that this node is shutting down the connection voluntarily? Is it supposed to not connect to it again? If so, what policy should it use to decide if the other side is down for valid reasons or actually unavailable?

Assuming that there is actually a reason why there is a TCP connection between the two nodes, any interruption in service, for whatever reason, is not a valid state.

And if we ensure that we are always ending the connection in the same rude manner, we also gain a very valuable feature: we make sure that the error handling portion of the code gets exercised on a regular basis, so if there are any issues there, they will be discovered easily.

As for the original issue of reducing support calls because of transient / resolved errors: that can be solved by not logging the error immediately, but waiting a bit to verify that the situation actually warrants writing to the operations log (writing to the info log should obviously happen regardless).

The struggle with Rust

time to read 5 min | 983 words

So I spent a few evenings with Rust, and I managed to do some pretty trivial stuff with it, but then I tried to do something non trivial (a low level trie that relies on low level memory manipulation). And after realizing that I just have to fight the language non stop, I am not going to continue forward with this.

Here is where I gave up:


I have a buffer that I want to mutate using pointers, so I allocated a buffer with the intent to use the first few bytes for a header, and then use the memory directly and efficiently. Unfortunately, I can't. I need to mutate the buffer in multiple places at the same time (both the trie header and the actual node), but Rust refuses to let me do that because then I'll have multiple mutable references, which is exactly what I want.

It just feels that there is so much ceremony involved in getting a Rust program to actually compile that there isn’t any time left to do anything else.  This post certainly resonated with me strongly.

That is about the language, and about what it requires. But the environment isn't really nice either. It starts with the basics: I want to allocate some memory.

Sure, that is easy to do, right?

  • alloc::heap::allocate is only available on unstable, and might change underneath you.
  • alloc::raw_vec::RawVec, which gives you raw memory directly, is unstable and likely to remain so, even though it is much safer to use than directly allocating memory.

We are talking about allocating memory, in a systems level language, and unless you are jumping through hoops, there is just no way to do that.

I'll admit that I'm also spoiled in terms of tooling (IDEs, debuggers, etc.), but the Rust environment is pretty much “grab a text editor, you'll have syntax highlighting and maybe something a bit more”, and that is it. I tried three or four different editors, and while intellij-rust, for example, was able to do some code analysis, it wasn't able to actually build anything (I think that I needed to install JDE or some such). VS Code could build and run (but not debug), and it marks every single warning, which, combined with Rust's eagerness to warn, made it very hard to write code. Consider when all your code looks like this:


No debugger beyond println (and yes, I know about GDB, that ain’t a good debugging experience) is another major issue.

I really want to like Rust, and it has some pretty cool ideas, but the problem is that it is just too hard to actually get something done in any reasonable timeframe.

What is really concerning is that any time I want to do anything really interesting, I need to either go and find a crate to do it (without any assurances of quality, maintainability, etc.) or use a nightly version, enable various feature flags or use unstable API versions. And you have to do that for anything beyond the most trivial stuff.

The very same trie code that I tried to write in Rust I wrote in one and a half evenings in C++ (including copious searches for the correct modern ways to do things), and it works, it is obvious and I don't have to fight the compiler all the time.

Granted, I've written in C++ before, and C# (my main language) is similar, but the differences are staggering. It isn't just the borrow checker, it is the sum of the language features that make it very hard to reason about what the code is doing. I mentioned before that the fact that generics are resolved on usage, which can happen quite a bit further down from the actual declaration, is very confusing. It might have been different if I had been coming from an ML background, but Rust is just too much work for too little gain.

One of my core requirements, the ability to write code and iterate over behavior quickly, is blocked because every time I try to compile, I get weird compilation errors, and then I need to implement workarounds to make the compiler happy, which introduces complexity into the code for very simple tasks.

Let us take a simple example, I want to cache the result of a DNS lookup. Here is the code:

We'll ignore the unstable API usage with lookup_host, or the fact that it literally took me over an hour to get 30 lines of code into a shape where I can actually demonstrate the issue.

There is a lot of stuff going on here. We have a cache of boxed strings in the hash map to maintain ownership over them, and we look them up and then add a cloned key to the cache because the owner of the host string is the caller, etc. But most importantly, this code is simple, expressive and wrong. It won't compile, because we have both an immutable borrow (on line 17) and a mutable one (on lines 25 & 26).

And yes, I'm aware of the entry API on the HashMap that is meant to deal with this situation. The problem is that all those details, and making the compiler happy in a very simple code path, add a lot of friction to the code. To the point where you don't get anything done other than fighting the compiler all the time. It's annoying, and it doesn't feel like I'm accomplishing anything.
