Ayende @ Rahien


I was wrong, reflecting on the .NET design choices

time to read 3 min | 480 words

I have been re-thinking some of my previous positions with regard to development, and it appears that I have been quite wrong in the past.

In particular, I’m talking about things like:

Note that those posts are part of a much larger discussion, and both are close to a decade old. They aren’t really relevant anymore, I think, but it still bugs me, and I wanted to outline my current thinking on the matter.

C# methods are non-virtual by default, while Java methods are virtual by default. That seems like a minor distinction, but it has huge implications. It means that proxying / mocking / runtime subclassing is a lot easier with Java than with C#. In fact, a lot of frameworks that were ported from Java rely on this heavily, and that made them much harder to use in C#. The most common one was NHibernate, and this was one of the chief frustrations that I kept running into.

However, given that I’m working on a database engine now, not on business software, I can see a whole different world of constraints. In particular, a virtual method call is significantly more expensive than a direct call, and that adds up quite quickly. One of the things that we routinely do is try to de-virtualize method calls using various tricks, and we are eagerly awaiting .NET Core 2.0 with de-virtualization support in the JIT (we have already started writing code to take advantage of it).
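To give a sense of what I mean, here is an illustrative sketch (not our actual code): marking an implementation as sealed and using the concrete type locally is one of the simplest ways to let the JIT turn a virtual call into a direct one.

    public abstract class Hasher
    {
        public abstract int Hash(int x);
    }

    // sealed: the JIT knows no further overrides can exist, so calls through
    // an XorHasher reference are candidates for de-virtualization (and inlining).
    public sealed class XorHasher : Hasher
    {
        public override int Hash(int x) => x ^ 0x2A;
    }

    public static class Demo
    {
        public static int HashAll(int[] values)
        {
            var hasher = new XorHasher(); // exact type is known right here
            int acc = 0;
            foreach (var v in values)
                acc += hasher.Hash(v);    // can become a direct call instead of a virtual one
            return acc;
        }
    }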

Another issue is that my approach to software design has changed significantly. Where I would previously use a lot of inheritance and explicit design patterns, I’m now far more inclined toward composition. I’m also drawing very clear boundaries between My Code and Client Code. In My Code, I don’t try to maintain encapsulation or hide state, whereas with stuff that is expected to be used externally, that is very much the case. But that gives a very different feel to the API and usage patterns that we handle.

This also relates to abstract classes vs. interfaces, and why you should care. As a consumer, unless you are busy doing some mocking or suchlike, you likely don’t care, but as a library author it matters a lot for the amount of flexibility you get.
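A quick illustration of what I mean (hypothetical types, not from any of our code): an abstract base class lets the library author add members later without breaking consumers, while an interface does not (at least before default interface members).

    public abstract class WriterBase
    {
        public abstract void Write(string value);

        // Added in a later version: existing subclasses keep compiling,
        // because the new member has a default implementation.
        public virtual void Write(string value, int indent) => Write(value);
    }

    public interface IWriter
    {
        void Write(string value);
        // Adding "void Write(string value, int indent);" here would break
        // every existing implementation of the interface.
    }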

I think that a lot of this has to do with my viewpoint, not just as an Open Source author, but as someone who runs a project where customers use us for years on end, and they really don’t want us to make any changes that would impact their code. That leads to a lot more emphasis on backward compatibility (source, binary & behavior), and if you mess it up, you get ricochets from people who pay you money because you just made their job harder.

A tricky bit of code

time to read 1 min | 111 words

I ran into the following bit of code while doing a code review on a pull request:

This was very strange, because the code appeared to compile properly, but it shouldn’t. I mean, look at it. The generic parameter is not constrained, and I don’t have any extension methods on Object that can apply here, so why would this compile?

The secret was in the base class:

Basically, the constraint was specified on the abstract method in the base class, and the override inherited it, which was really confusing to me until I figured it out.
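Both snippets were screenshots that didn’t survive here; a minimal reconstruction of the gist (not the actual code from the pull request) would be:

    using System;

    public abstract class ConverterBase
    {
        // The constraint is declared here, on the abstract method...
        public abstract void Write<T>(T value) where T : IDisposable;
    }

    public class Converter : ConverterBase
    {
        // ...and the override inherits it. We can't (and don't) restate the
        // constraint, yet calling Dispose() compiles, because T is still
        // known to be IDisposable.
        public override void Write<T>(T value)
        {
            value.Dispose();
        }
    }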

You can’t do the same with interfaces, although explicit interface implementation does allow it.

Sometimes it really IS not our fault

time to read 1 min | 150 words

So we got an emergency support call during the Passover holiday, and as you can imagine, it was a strange one. Our investigation of the error basically boiled down (cutting out a lot of effort in between) to: “This can’t be happening.”

I hate this kind of answer, because it usually means that we are missing something. Usually that turns out to be a strange error code, some race condition, or just something odd about the environment.

While we were working the problem, the customer came back with, “Oh, we found the issue. A memory unit went rogue, and the firmware wasn’t able to catch it.” When they updated the firmware, it apparently caught it immediately.

So I guess we can close this support incident. 🙂

How does LZ4 acceleration work?

time to read 3 min | 403 words

LZ4 has an interesting feature: acceleration. It allows you to trade compression ratio for compression speed. This is quite interesting for several scenarios. In particular, while a higher compression ratio is almost always good, you want to take the transfer speed into account as well. For example, if I’m interested in writing to the disk, and the disk write rate is 400 MB / sec, it isn’t worth it to use the default acceleration level (which can produce about 385 MB / sec), and I can dial the compression down so that its speed will not dominate my writes.

You can read more about it here.

We started playing with this feature today, and that brought the question, what does this actually do?

This is a really good question, but before I can answer it, we need to understand how compression works. Here is a very simple illustration of the concept.

We created a very stupid, size-constrained hash table that maps from the current 4 bytes to the previous position where we saw those 4 bytes. When we find a match, we check to see how long the match is, and then write it out as a back reference.

Note that if we don’t find a match, we update the last-seen position for this value and move one byte forward, to see if there is a match at that location. This way, we are scanning the past 64KB to see if there is a match. This is meant to be a very rough approximation of how compression works, so don’t get too hung up on the details.
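Here is a hedged sketch of that idea in C# (illustrative only, nothing like the real LZ4 code): hash the current 4 bytes, look up where we last saw them, and either emit a back reference or move one byte forward.

    using System;

    static class MatcherSketch
    {
        public static void Compress(byte[] input)
        {
            var lastSeen = new int[1 << 16];      // the "very stupid", size-constrained hash table
            Array.Fill(lastSeen, -1);

            int pos = 0;
            while (pos + 4 <= input.Length)
            {
                uint h = (uint)BitConverter.ToInt32(input, pos) * 2654435761u; // hash the current 4 bytes
                int bucket = (int)(h >> 16);
                int candidate = lastSeen[bucket];
                lastSeen[bucket] = pos;           // remember where we last saw this value

                if (candidate >= 0 && pos - candidate <= 64 * 1024 &&
                    BitConverter.ToInt32(input, candidate) == BitConverter.ToInt32(input, pos))
                {
                    int len = 4;                  // see how much of a match we have
                    while (pos + len < input.Length && input[candidate + len] == input[pos + len])
                        len++;
                    Console.WriteLine($"back reference: distance={pos - candidate}, length={len}");
                    pos += len;
                }
                else
                {
                    Console.WriteLine($"literal: 0x{input[pos]:X2}");
                    pos++;                        // no match: move one byte forward and try again
                }
            }
        }
    }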

The key point with acceleration is that it impacts what we do when we don’t find a match. Instead of moving by one byte, we are going to skip by more than that. The actual logic is in here, and if I’m reading this correctly, it will probe the data to compress in increasingly wider gaps until it finds a match that it can use to reduce the output size.

What acceleration does is tell it to jump in even wider increments as it searches for a match. This reduces the number of potential matches it finds, but also significantly reduces the amount of work that LZ4 needs to do comparing the data stream, which is how it both increases the speed and reduces the compression ratio.
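If I translate my reading of that logic into a sketch (simplified from the actual LZ4 source, so treat it as an approximation): every failed probe bumps a counter, the forward step is that counter shifted right by a fixed “skip trigger”, and acceleration just starts the counter higher, so the step widens sooner.

    using System;
    using System.Collections.Generic;

    static class AccelerationSketch
    {
        const int SkipTrigger = 6; // the constant LZ4 uses, per my reading of the source

        public static IEnumerable<int> Steps(int acceleration)
        {
            int searchMatchNb = acceleration << SkipTrigger;
            while (true)
                yield return searchMatchNb++ >> SkipTrigger; // grows as misses accumulate
        }
    }

    // With acceleration = 1, the step stays at 1 for the first 64 failed probes,
    // then becomes 2, then 3, and so on. With acceleration = 8, it starts at 8
    // and widens much faster, so we look at far fewer candidate positions.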

Single roundtrip authentication

time to read 5 min | 851 words

One of the things that we did in RavenDB 4.0 was support running on Linux, which meant giving up on Windows Authentication. To the nitpickers among you: I’m aware that we can do LDAP authentication, and we can do Windows Auth over HTTP/S on Linux. Let us just say that, given the expected results, all I can wish you is that you’ll have to debug / deploy / support such a system for real.

Giving up on Windows Authentication in general, even on Windows, is something that we did with great relief. There is nothing like getting an urgent call from a customer and trying to figure out why a certain user isn’t authenticated, while untangling the trust relationships between different domains, forests, DMZs and a whole host of other stuff that has nothing to do with us. I hate those kinds of cases.

In fact, I think that my new ill wish phrase will become “may you get a 2 AM call to support Kerberos authentication problems on a Linux server in the DMZ to a nested Windows domain”. But that might be going too far and damage my Karma.

That led us to API key authentication. For that, we want to be able to authenticate against the server and get a token, which we can then use in future requests. An additional benefit is that by building our own system we actually own the entire thing and can support it much better. A side issue here is that we need to support / maintain security critical code, which I’m not so happy about. But owning this also gives me the option of doing things in a more optimized fashion. And in this case, we want to handle authentication in as few network roundtrips as possible, ideally one.

That is a bit challenging, since on the first request we know nothing about the server. I actually implemented a couple of attempts to do so, but pretty much all of them turned out to be vulnerable to some degree once we did security analysis on them. You can look here for some of the details. So we gave that up in favor of mostly single roundtrip authentication.

The very first time the client starts, it will use a well-known endpoint to get the server’s public key. When we need to authenticate, we generate our own key pair, compute a hash of the secret key together with the client’s public key, and encrypt that hash using the server’s public key; then we send our own public key and the encrypted data to the server for authentication. In return, the server validates the client based on the hash of the secret and the client’s public key, generates an authentication token, encrypts it with the client’s public key and sends it back.
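To make the flow concrete, here is a rough sketch of the shape of that exchange. This is an illustration using the BCL’s RSA types on modern .NET, not RavenDB’s actual implementation, and the helper names are mine.

    using System;
    using System.Security.Cryptography;

    static class HandshakeSketch
    {
        public static void Main()
        {
            using var serverKey = RSA.Create(2048);              // server key pair; the public part is well known
            using var clientKey = RSA.Create(2048);              // generated fresh by the client
            byte[] secret = RandomNumberGenerator.GetBytes(24);  // the API key secret, >= 192 bits

            // --- client: hash the secret together with its own public key ---
            byte[] clientPub = clientKey.ExportRSAPublicKey();
            byte[] hash = SHA256.HashData(Concat(secret, clientPub));

            // encrypt the hash to the *server's* public key and send { clientPub, payload }
            byte[] payload = serverKey.Encrypt(hash, RSAEncryptionPadding.OaepSHA256);

            // --- server: decrypt, recompute the hash from its own copy of the secret, compare ---
            byte[] received = serverKey.Decrypt(payload, RSAEncryptionPadding.OaepSHA256);
            byte[] expected = SHA256.HashData(Concat(secret, clientPub));
            if (!CryptographicOperations.FixedTimeEquals(received, expected))
                throw new CryptographicException("authentication failed");

            // issue a short-lived token, encrypted to the client's public key,
            // so only the holder of the matching private key can read it
            using var clientPubOnly = RSA.Create();
            clientPubOnly.ImportRSAPublicKey(clientPub, out _);
            byte[] reply = clientPubOnly.Encrypt(RandomNumberGenerator.GetBytes(32), RSAEncryptionPadding.OaepSHA256);

            // --- client: decrypt the token and use it on subsequent requests ---
            byte[] token = clientKey.Decrypt(reply, RSAEncryptionPadding.OaepSHA256);
            Console.WriteLine($"got a {token.Length * 8} bit token");
        }

        static byte[] Concat(byte[] a, byte[] b)
        {
            var result = new byte[a.Length + b.Length];
            a.CopyTo(result, 0);
            b.CopyTo(result, a.Length);
            return result;
        }
    }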

Now, this is a bit complex, I’ll admit, and it is utterly unnecessary if the users are using HTTPS, as they should. However, there are actually quite a few deployments where HTTPS isn’t used, mostly inside organizations where they choose to deploy over HTTP to avoid the complexity of cert management / update / etc. Our goal is to prevent leakage of the secret key over the network even in the case that the admin didn’t set things up securely. Note that in this case (not using HTTPS), anyone listening on the network is going to be able to grab the authentication token (which is valid for about half an hour), but that is out of scope for this discussion.

We can assume that communication between client & server is not snoopable, given that they are using public keys to exchange the data. We also ensure that the size of the data is always the same, so there is no information leakage there. The return trip with the authentication token is also encrypted, using the client’s public key.

A bad actor can pretend to be the server and fool a client into sending the authentication request using the bad actor’s public key instead of the server’s. This is possible because we don’t have trust chains. The result is that the bad actor now has the hash of the secret together with the public key of the client (which is also known). So the bad guy can try to do some offline password guessing; we mitigate that by using secret keys that are a minimum of 192 bits, generated using a cryptographically secure RNG.

The bad actor can, of course, forward the data as-is to the server, which will authenticate it normally, but since the server uses the client’s public key both for the hash validation and for the encryption of the reply, the bad actor wouldn’t be able to get the token.

On the client side, after the first time that the client hits the server, it will be able to cache the server public key, and any future authentication token refresh can be done in a single roundtrip.

Command line usability

time to read 2 min | 312 words

I hate the new “dotnet test” command. I don’t think that anyone ever thought about looking at the output it generates for real projects.

For example, here is a section from the log output of our fast tests:

[screenshot: “dotnet test” log output]

There is so much crap here, including duplicate information and a whole bunch of mess, that it is very hard to find the relevant information. For example, there is a failing test here. How long will it take you to find it?

Another important aspect for us is the fact that this actually runs everything in the same process. If you have something that will crash the test process, you’ll never get to see what is going on. Here is what a crash due to a stack overflow looks like using “dotnet test”:

[screenshot: “dotnet test” output after a stack overflow crash]

As a result, we moved to dotnet xunit, which is a much better test runner.

[screenshot: dotnet xunit output]

We get color coding, including red for failing tests, so we don’t have to hunt for them.

More importantly, it will not hide crucial information from us just because it feels like it. If there is a crash, we can actually see what happened.

[screenshot: dotnet xunit output showing the crash details]

I know it sounds trivial, but “dotnet test” doesn’t have it.

Trying to live without ReSharper in Visual Studio 2017

time to read 5 min | 944 words

This is an experiment that is doomed to fail, but given that I just set up a new VS 2017 installation, I decided to see how it would feel to run it without ReSharper and how that impacts my workflow. Please note that this is very much not an unbiased review. I have been using ReSharper for my day-to-day coding for over a decade, and the workflow it enables is deeply rooted in how I work. I’m going to ignore any differences in key bindings, as irritating as those can be, in favor of just looking at the different features.

So far, I have spent a couple of days trying to work in VS 2017 without ReSharper. It has been quite frustrating, but I was able to sort of limp along. I most certainly felt the lack.

My hope was that I would be able to see the promised performance improvements without it, and then consider whether it is worth it. That wasn’t the case.

[screenshot: VS 2017 hanging, with ReSharper not installed]

As you can see, ReSharper is not installed, but I managed to get VS into a hang several times. It seems to happen with NuGet, when trying to use the Test Explorer, and a few times when I was trying to edit code while the solution was compiling.

Without any meaningful order, here are the things that I really felt the lack of.

  • Go to definition with automatic decompilation is something that I apparently use a lot more than I expected. It helps me figure out what I can expect from the method I’m looking at, even when it isn’t our code.
  • Refactor method ignoring whitespace lets me just write a statement and it becomes a method name. This is actually quite nice.
  • Quick docs in R# is very nice; that is, the ability to hit Ctrl+Q and get the docs for a method is something that I seem to be using a lot. This is important because I can quickly check the docs (most often, what conditions it has for returning, or specific arguments). The key here is that I don’t need to leave my current context. I can Ctrl+Q, peek at the docs, and then move on.

[screenshot: R# quick docs popup]

  • Extract variable isn’t there, and a lot of the refactorings that I’m used to either aren’t there or are hardly accessible.
  • IntelliSense is also a lot less intelligent. Being able to write a method call and just Ctrl+Space through all the parameters, because R# can fill them in from the context, is very useful.
  • Ctrl+N (symbol search) is also a LOT more useful. I’m familiar with Ctrl+; to search Solution Explorer, but the features don’t compare. In one case, I get live feedback, which means that I don’t have to remember nearly as much about the symbol that I’m looking for. In the other, I have to type it out and hit enter to see the results. There is also an issue with the presentation; Solution Explorer is a really poor model for it.
    [screenshots: symbol search in VS (Solution Explorer) vs. in R#]

    There is a lot of wasted space in the VS model versus the R# model.
  • Update: I have since learned about VS’s Ctrl+, feature, which seems much nicer, and it also does auto peeking, which I like.

In general, to be honest, R# feels smarter (remember, I’m biased and likely work to the strengths of R#). But another aspect here is that with R#, I rarely have to leave my current context; pretty much everything is available immediately from where I am.

Take something as simple as the search above. With R#, it shows up in the middle of the screen; with VS, it is all the way at the right, so I need to move my eyes to track it. The same is pretty much true for everything else. Reference search in R# shows up where I’m looking right now, while with VS it shows up in a window at the bottom. Refactoring options in VS show up in the top right, and it is easy to miss them completely; R# puts them right in front of you, along with what you are working on right now.

I’m going to install R# for VS 2017 shortly, and then I’ll be able to compare the speed, but I’m pretty sure that I’m not going to be very happy with that. Then again, once it is loaded, I haven’t noticed R# + 2015 being much worse than 2017 without ReSharper.

Note that I’m doing this during my usual work, on a solution with 55 projects and 820 KLOC.

Update

I have tried R# & VS 2017 for a couple of days now, and I can tell that aside from the project open times (which are absolutely atrocious with R#), I’m not seeing anything major performance-wise.

That said, project open times also apply when switching between branches, and that is a major PITA.

Of course, I’m guessing R# is really popular, because:

[screenshot: VS warning that an extension (ReSharper) is slowing it down]

I can guess someone was tired of hearing “Visual Studio is slow” when it was actually someone else’s code at fault, and wrote this to pin the blame on the relevant extension, so the bug report would go to the appropriate people.

Emoji Encoding: A new style for binary encoding for the web

time to read 4 min | 604 words

Computers think in binary, and you would have thought that sending binary data around would be pretty easy. But that turns out to be a completely non-trivial task. The problem is those pesky humans and the need to interface with them.

For example, if I need to send some binary data over email, I can either do that as an attachment, with a high probability of at least a few people never getting it, or I can encode it somehow. Typical choices are Base64 encoding for the low tech, and barcodes / QR codes and the like. For the fancy among us, we can try to go with Base85 and other such things. That is pretty standard, but it really has a lot of limitations. Base64 increases the size of the data by about a third, and it is case sensitive, so it is hard to get right if you need to actually look at it and not just copy/paste it. It is also limited to plain old ASCII, for compatibility reasons that don’t make a lot of sense in today’s world.

I have been thinking about this for a long time, because we need to send binary data (license information) in text, and we also need it to look good and be nicely formatted.

After a lot of thought and experimentation, I’m proud to announce a new form of encoding: the Emoji Encoder, available currently for .NET, but soon to be available for Ruby, Python, Go, Node.JS, Ember.js, React.JS and maybe jQuery.

The idea for this innovation came to me because of the following observations:

  • Emojis are becoming much more important in any textual conversation (to the point where people will say an emoji out loud). That means that we can rely on them for the long term, which is very important for storage technology.
  • Trying to read meaning from the emojis being sent is clearly impossible, as anyone taking a peek at a text conversation between two teenage girls can attest. (Although they appear to have hidden meanings; if she sent the red heel and not the blue heel emoji, that apparently means something.)
  • Because emojis are so relevant, they can be sent anywhere normal text would go, including email, social media, print, etc.
  • There are a lot of emojis, allowing us to overcome the bloat of Base64 and its friends by dedicating a single emoji to each byte, in a 1:1 mapping (a sketch of the idea follows this list).
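A minimal sketch of what such a mapping might look like (this is my illustration, not the actual Emoji Encoder code; a real alphabet would be hand-picked for distinctness and rendering support):

    using System;
    using System.Collections.Generic;
    using System.Linq;

    static class EmojiEncoderSketch
    {
        // Hypothetical alphabet: 256 consecutive code points from the
        // "Miscellaneous Symbols and Pictographs" block, one per byte value.
        static readonly string[] Alphabet =
            Enumerable.Range(0x1F400, 256).Select(char.ConvertFromUtf32).ToArray();

        public static string Encode(byte[] data) =>
            string.Concat(data.Select(b => Alphabet[b]));

        public static byte[] Decode(string text)
        {
            var bytes = new List<byte>();
            for (int i = 0; i < text.Length; )
            {
                // each emoji here is a surrogate pair, so walk the string by code point
                string glyph = char.ConvertFromUtf32(char.ConvertToUtf32(text, i));
                bytes.Add((byte)Array.IndexOf(Alphabet, glyph));
                i += glyph.Length;
            }
            return bytes.ToArray();
        }
    }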

That means that in terms of characters, Emoji Encoding is a net win. Consider the following equivalent information:

  • I5xy4dT9Qyjp7DKwuVI6y95EwlDeO/NBeiuc3GJ5Mjo= <—45 characters
  • ℹ⤴⚫✔⭕㊗◀☔➖✂♥⛵✖♍❤⛵✅✏ℹ⛲✂ <—33 characters

That is quite important when dealing with constrained textual formats, such as twitter, where the above will be rendered as:

There are other advantages. This data is actually a 256-bit key for use in encryption. And you can actually show it to a user and have a reasonably good chance that they will be able to tell it apart from something else. It relies on the ability of humans to recognize shapes, but it will be very hard for them to actually tell someone your key. There has been a lot of research around such things, and while it isn’t a primary motivation for us, it is a very nice perk.

I mentioned that a key interest for us is the usage in licensing code. Here is an example of how a license email will now look:

I think that in addition to being pretty, it is also going to bring a smile to people’s faces, so the Emoji Encoder is a win all around.

Externalizing the HttpClient internals for fun & profit

time to read 2 min | 368 words

In many respects, HttpClient is much better than using the old WebRequest API. It does a lot more for you and it is much easier to use in common scenarios.

In others, the API is extremely constraining. One such example is when you want to generate the request incrementally (maybe based on other things that are going on). Using WebRequest, this is trivial: you get the request stream and just start writing to it. But with HttpClient, the actual request stream is hidden several layers deep, too deep to be very useful.

Sure, you can use PushStreamContent to actually generate the data to write to the stream, but it doesn’t help if you need to be called with more information. For example, let us imagine the following interface:

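The screenshot of the interface didn’t make it into this text; a plausible reconstruction of its gist (the names here are my guesses, not the original code) is:

    using System.Threading.Tasks;

    public interface IFileUploader
    {
        void Init(string url);          // where we are going to POST
        Task Upload(string fileName);   // called once per file, possibly over a long stretch of time
        Task Done();                    // no more files; finish the request and get the response
    }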

It is a pretty silly one, but it should explain things. We first call Init, passing it the URL we want to POST to, and then we upload multiple files to the server. Using HttpClient, the usual way would be to gather all the file names during the Upload method, and then use PushStreamContent to push it all to the server in the Done method.

This is awkward if we have a lot of files, or if we want to generate and delete them after the upload. Luckily, we can cheat and get the same behavior as we can in WebRequest. Let us examine the code:
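The code itself was a screenshot; the gist of the trick, as I understand it, is a custom HttpContent that hands its stream to the outside instead of writing to it. A sketch (not the original code):

    using System;
    using System.IO;
    using System.Net;
    using System.Net.Http;
    using System.Threading.Tasks;

    public class ExposedStreamContent : HttpContent
    {
        private readonly TaskCompletionSource<Stream> _streamReady =
            new(TaskCreationOptions.RunContinuationsAsynchronously);
        private readonly TaskCompletionSource<object> _done =
            new(TaskCreationOptions.RunContinuationsAsynchronously);

        // The caller awaits this to get the raw request stream.
        public Task<Stream> StreamAvailable => _streamReady.Task;

        // The caller signals this when it has finished writing.
        public void Complete() => _done.TrySetResult(null);

        protected override Task SerializeToStreamAsync(Stream stream, TransportContext context)
        {
            // Expose the stream to the outside, and return a task that only completes
            // when the caller says it is done, so HttpClient keeps the request open.
            _streamReady.TrySetResult(stream);
            return _done.Task;
        }

        protected override bool TryComputeLength(out long length)
        {
            length = -1;   // unknown; the request will use chunked transfer encoding
            return false;
        }
    }

The usage side then looks roughly like: kick off PostAsync with this content, await StreamAvailable to get the stream, write each file to it as it becomes available, call Complete(), and finally await the response.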

The first thing we do is spin up a POST request to the server, but we are doing something strange: instead of generating the data when we are called in SerializeToStreamAsync, we expose the stream to the outside, and then return another task. Effectively, we are telling the HttpClient that we are busy now, and it shouldn’t bug us with details until we let it know.

Then we wait to get the stream, and we can start uploading each file in turn. At the end, we need to let the HttpClient know that we are done sending data to the server, at which point we just wait for the server response, and we are done.

When I gave up on pointers

time to read 2 min | 372 words

I started programming with that orange turtle (I think it was supposed to be green, but we had bad CRT screens) by drawing stuff on the screen. I think I was in fifth grade or so. I later graduated to VB (IIRC, that was VB3 or VB4), but my first formal programming education was in Pascal. And I was pretty good (for a high school kid who merely dabbled), but I just couldn’t figure out pointers. I mean, they made absolutely no sense whatsoever.

Take the example of an “infinite” size stack; that was the example we were given during class, and I just couldn’t follow it. Consider a stack like so:
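The original snippet was classroom Pascal shown as an image; the gist of it, translated into a C#-ish sketch, was roughly this:

    using System;

    const int STACK_SIZE = 3;

    int[] items = new int[STACK_SIZE];
    int top = 0;

    Push(1); Push(2); Push(3);
    Push(4);                     // silently dropped - the "infinite" stack is anything but
    Console.WriteLine(Pop());    // 3

    void Push(int value)
    {
        if (top >= STACK_SIZE)
            return;              // full? the value is silently ignored
        items[top++] = value;
    }

    int Pop()
    {
        return items[--top];
    }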

You might notice that this code is limited: if you are storing more than 3 items, the value will be silently ignored, which is probably not what you want. High school me would agree that this is bad, and therefore increase the STACK_SIZE variable to a ridiculously high value, such as 100. That would surely be big enough for everything, right?

I remember really struggling with the concept of dynamic memory management, not so much because of the API, but because I couldn’t make any sort of sense of what I was supposed to do there.

After high school, I took a high-end course in C++. That took about a year, and I highly recommend the course (even though I don’t think they run it anymore); it taught me a lot about basic stuff, such as how things actually work. We started with low-level C in DOS, and built on top of that all the way to MFC and ATL. And at some point, the instructor introduced dynamic memory management. And it was so blindingly obvious that I never actually realized I was learning the same concept that gave me so much grief in the past.

I have had that experience several times since then. I try to learn something, and I just bounce off it, hard. Then a while later, I do the same thing, or tackle something slightly different, and I get it. But I’m not sure how I go from “what the hell” to “oh, this is obvious”.
