Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by email or phone:

ayende@ayende.com

+972 52-548-6969




Answering the web developer task

time to read 1 min | 101 words

In my previous post, I talked about a task we give candidates who interview for the web developer position. They need to implement the following:

Given that I don’t like handing out tasks that I haven’t done myself, I took a few minutes to answer my own question. Here is how this can be implemented:

I believe that I mentioned that my JavaScript skills are from the last decade, if that, so I’m probably committing quite a few sins against JavaScript (if that is even possible), but this code ran the first time I tried it and gave the proper result.

Interview question for Web Developer

time to read 2 min | 281 words

One of the roles we are looking for right now is for a web developer. We are looking for someone who can do great things on the browser and write good, maintainable code. I’m not a web developer, and haven’t been one in a while, but it has been really interesting to see the interview process. In particular, I have great fondness for the following line of questions.

What does this code do?

I love this because it is simple, short and can reveal a lot about the mental model that the candidate has about how JavaScript works. If they hesitate too much in answering this, we typically just run this snippet in the browser and ask them to explain the results. This is important because it shows that they understand the execution model, how code is interleaved, etc.

The next stage is to ask them what the following snippet will do:

I actually have zero expectation that they will be able to answer this correctly. We let them think about this for a few seconds and then run it in the browser. Asking them to explain why it gave the output it gave is a lot more interesting.

The final piece of this line of questions is to have them implement the following:

And here we get to also investigate how they are thinking about code. This isn’t a trivial thing to implement, because you need to understand lambdas, how to coordinate several actions into a cohesive whole, etc.

We usually ask people to describe to us how they would handle something like this, not actually write the code. And hearing the thought process that the candidates go through as they solve this can be illuminating.

Reading OSS code to figure out what is actually going on

time to read 3 min | 440 words

I use Open Live Writer to post to this blog, the problem is that whenever I post a new post, it opens up the metadata api endpoint in the browser (services/metaweblogapi.ashx). I actually want to see the blog post that I just posted. I decided that this was annoying enough that I’m going to figure out how this is done and see if there is a way for my blog to give Open Live Writer the address of the newly created post.

I want this to be a focused operation, I don’t want to read through it all. So I’m going to see if I can figure out how this works with a minimum of effort. I know that OLW is opening the browser after the post is published, this is usually done with Process.Start, so I run the following query:


The very first result is promising, showing ExecuteFile. This sounds interesting, let’s see how this is used. No one seems to be calling this method, but reading through the ShellHelper file, I run into LaunchUrl(), which seems promising. Searching for this method got me to some interesting locations, including the ViewPage method, which seems to be exactly what I want.


This seems to indicate that the blog needs to support pages. I’m not sure what this is about, but I found this piece of code by searching for IsPage:


Not sure what pages are, but looking at the configuration for my blog, I see:


Continuing my blaze through the code, I can see we have:


My blog doesn’t implement this method, but OLW doesn’t probe for this. It seems that because I’m using the generic interface, it already pre-loaded the available options there. What this means is that this exploration ended up at a dead end. I figured out roughly what is going on, but actually getting all the details is probably too much of a hassle for me to debug through the OLW code and update my blog engine. I’m already used to just closing the newly opened tab and going to the new post directly. I’ll keep this in my todo tasks for when I actually get around to doing this.

The perf optimization that cost us

time to read 4 min | 609 words

There is a lock deep inside RavenDB that has been bugging me for a while. In essence, this is a lock that prevents a read transaction from opening during a particular section of the commit process of a write transaction. The section under lock does no I/O and only manipulates in-memory data structures. Existing read transactions aren’t impacted, it is only the opening of a read transaction that is blocked. It isn’t usually visible in production systems. The only scenario you might expect to see it is if you have a very high degree of writes at the same time as you have a lot of read requests. In that case, RavenDB’s lock contention would rise and you would be able to process only about 5,000 – 10,000 read requests per second at the same time as you can process 30,000 write requests per second.  In practice, because of the way we process requests, you’ll very rarely be able to hit that level of concurrency in anything but a benchmark. But it bugged me. It was one of those scenarios that stuck out like a sore thumb when we looked at the performance metrics. In any other case, we are orders of magnitude faster, but here we hit a speed bump.

So I fixed it. I changed the code so there would be no sharing of data between the write and read transactions, which eliminated the need for the lock. The details are fairly simple, I moved the structures being modified to a single class and then atomically replaced the pointer to this class. Easy, simple and much, much cheaper.
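
To make the shape of that change concrete, here is a minimal sketch (not RavenDB’s actual code; the class and field names are made up) of publishing an immutable snapshot with an atomic reference swap:

using System.Collections.Generic;
using System.Threading;

// Sketch only: the snapshot contents (PageTranslations) are a made-up stand-in
// for whatever in-memory structures the commit actually updates.
public sealed class CommitState
{
    private sealed class Snapshot
    {
        public readonly Dictionary<long, long> PageTranslations;
        public Snapshot(Dictionary<long, long> pages) => PageTranslations = pages;
    }

    private Snapshot _current = new Snapshot(new Dictionary<long, long>());

    // Read transactions: no lock, just grab whatever snapshot is currently published.
    public IReadOnlyDictionary<long, long> GetSnapshot() =>
        Volatile.Read(ref _current).PageTranslations;

    // Write transaction commit: build the new state on the side,
    // then publish it with a single atomic swap.
    public void Publish(Dictionary<long, long> updatedPages) =>
        Volatile.Write(ref _current, new Snapshot(updatedPages));
}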

This also had the side effect of slowing us down by a factor of 4. Here are the profiler results after the change:


That… is not a desirable property for a performance optimization.

Luckily, I was able to figure out from the profiler what was causing the slowdown:


As it turned out, there was another lock there. This one was used to avoid starving a particular portion of the code under a high write scenario. Until we removed the first lock, this worked beautifully. But when the first lock was removed, that actually sped up the rest of the system, which meant that we’d acquire the second lock a lot more often. That, in turn, resulted in a rare occurrence becoming very common, and the contention on the lock caused a major slowdown.

We were able to figure out that this lock was no longer needed, due to some other changes, so we just removed it entirely. That gave us over 10% better performance than the original code, and that is before even getting to the actual scenario we were trying to improve.


Now, you might have noticed that this benchmark I’m running is for pure writes, right? Let’s see what happens when we have more complex workloads, and are testing things without the profiler:

            Reads      Writes    2/3 Reads & 1/3 Writes
Original    385,510    85,688    91,984
Optimized   416,468    90,380    92,323

This table shows the requests per second for each scenario. We can see just under 10% improvement for reads and over 5% improvement for writes (which are excellent numbers). However, the mixed workload, where we expected to see the biggest performance boost, shows a difference that is pretty much within sampling error.

I’m not sure how I feel about that.

Playing with graphs and logic systems

time to read 3 min | 554 words

Recently I have been playing with graphs a bit, trying to understand them in more depth. Because I learn much better by doing, I thought that I would build a toy graph query engine to see how that works. I loaded the MovieLens small data set into a set of C# classes and started playing with them.

Here is what the source data looks like:
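
The original snippet isn’t reproduced here, but a reasonable guess at the classes (the names below are mine, not the post’s) would be something along these lines; the later sketches in this post assume them:

using System.Collections.Generic;

// A guess at the shape of the data: the MovieLens small data set boils
// down to users, movies and ratings.
public class User
{
    public int UserId;
    public List<Rating> Ratings = new List<Rating>();
}

public class Movie
{
    public int MovieId;
    public string Title;
    public List<Rating> Ratings = new List<Rating>();
}

public class Rating
{
    public User User;
    public Movie Movie;
    public double Value;   // 0.5 to 5.0 stars
}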

I’m not dealing with typical issues, such as how to fetch the data, optimizing indexes, etc. Instead, I want to focus solely on the problem of finding patterns in the graph.

Here is a simple example of a pattern in the graph:

(userA:User)-[:Rated]->(movie:Movie)<-[:Rated]-(userB:User)

The syntax is called Cypher, which is commonly used for graph queries.

What we are trying to find here is a set of triads: user A rated a movie that was also rated by user B. The result of this query is a list of tuples matching (userA, movie, userB).

This is really similar to the way I remember learning Prolog, so I thought about giving it a shot and solving the problem in this way.

The first thing to do is to break the query itself into independent steps:

(userA:User)-[:Rated]->(movie:Movie) AND (userB:User)-[:Rated]->(movie:Movie)

Note that in this case, the first and second queries are exactly the same, but now they are somewhat easier to reason about. We just need to do the match ups properly; here is how I would write the code:
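
The actual code isn’t shown here, so this is a rough sketch of that matching, using the hypothetical classes above and a LINQ join over the ratings:

using System.Collections.Generic;
using System.Linq;

// Sketch only: match (userA)-[:Rated]->(movie)<-[:Rated]-(userB) by
// joining the ratings with themselves on the movie.
static class TriadQuery
{
    public static IEnumerable<(User UserA, Movie Movie, User UserB)> Run(List<Rating> ratings)
    {
        return from a in ratings
               join b in ratings on a.Movie equals b.Movie   // cartesian product per movie
               where a.User != b.User                        // don't match a user with themselves
               select (UserA: a.User, Movie: a.Movie, UserB: b.User);
    }
}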

This query can take a while to run, because on the small data set (with just 100,004 recommendations and 671 users) there are over 6.2 million such connections. And yes, I used join intentionally, because it showcases the interesting problem of the cartesian product.

Now, these queries aren’t really interesting and they can be quite expensive. A better query would be to find the set of movies that were rated by both user 1 and user 306. This can be done as simply as changing the previous code’s starting location:
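
Again, the original snippet is missing, but a rough equivalent, starting the traversal from the two users instead of the full ratings list, could look like this:

using System.Collections.Generic;
using System.Linq;

// Sketch only: the set of movies rated by both users.
static class CommonMovies
{
    public static IEnumerable<Movie> Run(User userA, User userB)
    {
        var ratedByB = new HashSet<Movie>(userB.Ratings.Select(r => r.Movie));
        return userA.Ratings
                    .Select(r => r.Movie)
                    .Where(movie => ratedByB.Contains(movie))
                    .Distinct();
    }
}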

Again, this is a pretty simple scenario. A more complex one would be to find a list of movies a particular user has not rated that were rated by people who liked the same movies as this user. As a query, this will look roughly like this:

(userA:User)-[:Rated(Rating >= 4)]->(:Movie)<-[:Rated(Rating >= 4)]-(userB:User) AND (userB:User)-[:Rated(Rating >= 4)]->(notRatedByA:Movie) AND NOT (userA:User)-[:Rated]->(notRatedByA:Movie)

Note that this merely specifies the first part: find users who liked the same movies as userA. The second part is a bit more complex; we want to find movies rated by the second user and exclude movies rated by the first. Let’s break it into its component parts, shall we?

Here is the code for the first clause:  (userA:User)-[:Rated(Rating >= 4)]->(:Movie)<-[:Rated(Rating >= 4)]-(userB:User)
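
The snippet itself isn’t included, so here is one hedged way to sketch that clause with the hypothetical classes from above, producing the (userA, userB) pairs:

using System.Collections.Generic;
using System.Linq;

// Sketch only: (userA)-[:Rated(Rating >= 4)]->(:Movie)<-[:Rated(Rating >= 4)]-(userB)
// We already know userA, so we walk the movies they liked and collect the
// other users who also liked them.
static class LikedSameMovies
{
    public static IEnumerable<(User UserA, User UserB)> Run(User userA)
    {
        return userA.Ratings
                    .Where(r => r.Value >= 4)
                    .SelectMany(r => r.Movie.Ratings)
                    .Where(r => r.Value >= 4 && r.User != userA)
                    .Select(r => (UserA: userA, UserB: r.User))
                    .Distinct();
    }
}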

As you can see, the output of this code is a set of ( userA, userB ). Now, let’s go to the second one, shall we? We already have a match on userB in this case, so we can start evaluating that. Here is the next stage: (userB:User)-[:Rated(Rating >= 4)]->(notRatedByA:Movie)

Now we have the last stage, where we need to filter things out:
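
The remaining snippets aren’t shown either, so here is a single sketch covering both the second stage and the final filter: walk the movies that each matched userB liked, and drop anything userA has already rated:

using System.Collections.Generic;
using System.Linq;

// Sketch only, covering the second clause and the final NOT filter:
// (userB)-[:Rated(Rating >= 4)]->(notRatedByA:Movie)
// AND NOT (userA)-[:Rated]->(notRatedByA:Movie)
static class Recommendations
{
    public static IEnumerable<Movie> Run(User userA)
    {
        var alreadyRated = new HashSet<Movie>(userA.Ratings.Select(r => r.Movie));

        return LikedSameMovies.Run(userA)             // (userA, userB) pairs from the previous sketch
            .SelectMany(pair => pair.UserB.Ratings)
            .Where(r => r.Value >= 4)                 // movies userB liked
            .Select(r => r.Movie)
            .Where(movie => alreadyRated.Contains(movie) == false)
            .Distinct();
    }
}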

And now we have the final results.

For me, thinking about these kinds of queries as a “fill in the blanks” makes the most sense.

Pruning issues and the idle bin

time to read 4 min | 666 words

Part of the job of a product owner is to pay attention to the list of issues in the issue tracker. Not just to get a feeling for the cadence of the project, but to have an impact on its direction.

Paying attention to the issues doesn’t mean just tracking down what bugs are still open, mind. Consider the case of a product owner with the release due date looming over the horizon: you need to start looking at the list of remaining issues and take active steps to make sure that you are going to get done more or less on time.

The usual rules apply, choose any 2 of:

  • Speed
  • Quantity
  • Quality

In other words, your team can deliver more features in time, if you are willing to sacrifice quality. On the other hand, they can keep high quality and the same number of features, but the due date will have to move.

As an aside, it is possible to get all three of these aspects at once, but only for a very short amount of time (a few days to a week or two at most), and at a very high long term cost.

One of the things that I observed is that in some cases, a lot of complexity and work is in the last 2% of the work, where all the polish and the rough edge cases lurk. In some respects, this is actually a really good thing, because it gives the product owner the chance to remove features that won’t usually have an explicit impact on the users. A good example for this in RavenDB would be the amount of time and effort we put into the intellisense feature of RQL queries in the studio. That falls under the Nice To Have set of features. It is unlikely that we’ll get many upset users if the intellisense isn’t up to par with something like Visual Studio or ReSharper, so beyond getting some basic functionality right, we can defer improvements there if we don’t have the extra capacity to complete them by the expected date.

I’m sure that you can think of other examples in your own projects. Note that this requires you to understand what exactly your users value your software for. In the case of RavenDB, adding more query functionality and speeding up overall system performance ranks much higher than adding extra smarts to intellisense that is mostly used during exploration / demos.

On the other hand, the effect of pushing such features down the road accumulates over time. In other words, if you keep your priorities straight and are selective about which features go into the product, you will defer the small fry over and over. At some point, you’ll need to make a decision about them. You can either decide that they don’t make sense anymore or that they are never really going to be important enough to actually put in the “let’s get this done” queue.

Alternatively, you might want to put them in the idle bin. In other words, whenever you have an idle portion in your development, you can peek into the idle bin and get some tasks from there. That is also a good place for a new team member to start from. These are tasks that are minor and not that important, after all, so they can be used to learn the codebase. In fact, we have used this in the past as the tasks bin for interns. That is usually a really good fit, for the same reason that they are good tasks for a new team member, with the added benefit that they are usually well scoped, and if the intern messes up, you haven’t lost too much.

Regardless, the idle bin notion is important, because otherwise your future tasks queue is going to grow larger and larger, and it will be ever harder to figure out what tasks actually matter.

On interns and hiring people at the first stages of their career

time to read 4 min | 738 words

When looking for candidates, there is an ideal candidate: the ability to take one of the people already working for you, with all their domain knowledge and expertise, and clone them. Hopefully multiple times. If you do this right, you can probably stick the clones in a basement with a bunch of computers, slide pizza under the door every so often and get a lot of work done for the price of pizza.

While this (dystopian) scenario is quite nice in terms of overall effort, I do believe that there are some issues with it. Naturally the biggest hurdle is the medical bills for cloning people; there is also some noise about this being inhumane. The real issue, of course, is the lack of feasible technology to accelerate the growth of the clones. I’m sure this will be solved at some point. My time machine comes back from the shop on Monday (and isn’t that ironic), so I’ll be investigating this further at that point.

Setting the clone wars option aside, there is the need to get new hires, and there are several ways to go about that. You can try getting people with some or all of the skills that you require, or you can get someone who is a blank slate and train them internally. This post is about the latter option.

The question is really what you actually define as a blank slate. For example, hiring my 3 year old daughter as a software developer would be really nice. She is a blank slate, but given that we are currently teaching her to count to 20, I think that this might be premature.

To be perfectly honest, the amount of knowledge that is required to be an efficient developer is staggering. If I were to start the clock from scratch, I think that I would be sitting there twiddling my thumbs to this day, scared of all the things that I must understand to be effective. In some way, not knowing how much I didn’t know was really helpful. It allowed me to go out and learn without being overwhelmed. Look at just C# and compare the language from 1.0 to 7.3, for example. Each change made sense at the time and incrementally added to the language. Some of them were bigger than others (generics, linq), but they came in byte size chunks (typo intended). Trying to grok it all at once… much harder.

We actually hire fairly often directly from college, either immediately after completing the degree or even beforehand. We usually look for people who have gone beyond the rote learning for a good grade and are actually able to understand why things are happening, not just what API to call. Our most junior hire ever had just finished high school and had a few months free before going to the army, effectively being an intern in the company for a short while.

The approach we take for onboarding a new employee (with no practical experience) and an intern is quite different. For a full time employee, my priority is to get them well situated and familiar with how we work and the overall codebase. That means that the typical first assignments will be things that are on the sidelines. Things that are okay if they take a little longer, since they are used to get the new developer familiar with the landscape of the code. Examples include writing new clients, building internal applications using RavenDB, benchmarking work and building diagnostics and debug tools for production analysis.

For an intern, however, the situation is different. Given that I’m only going to have the intern for a few short months, spending 2 – 3 months training them to the expected level of a full time employee is going to be a waste. Instead, we try to give the intern experimental and research projects. Things that we wished we could have done if we had the time, but typically do not. Some of them are pretty complex, but the key “feature” in this regard is that they are possible to approach without having a deep understanding of RavenDB. For example, SQL Migration, one of the main features of RavenDB 4.1, was actually initially developed by an intern.

Living in the foundations, missing all the amenities

time to read 2 min | 377 words

We talked to a candidate recently with a CV that included topics such as Assembly, SQL and JavaScript. The list of skills was quite eclectic and we called the candidate to hear more about them.

The candidate completed a two year degree focused on the foundations of development, but it looked like whoever designed it was looking primarily to get a good foundation more than anything else. In other words, the end result is someone who can write SQL queries, but has never built a data driven application, who knows (about? I’m not really clear at what level that was) assembly, but has never written a real application. It doesn’t sound bad, I know, but it was like moving into a new house just after the contractor is done with the foundation. Sure, that is a really important part, but you don’t even have walls yet.

In 1999, I did a year long course that was focused on teaching me C and C++. I credit this course for much of my understanding of the basics of programming and how computers actually work. It has been an eye opening experience. I wouldn’t hire my 1999 self; as I recall, that guy (can I deny knowing him?) wrote the following masterpieces:

  • sparse_matrix<T> in C++ templates that used five (5!) levels of pointer indirection!
  • The original single page application. I wrote an entire BBS system using a single .VBS script that used three levels of recursive switch statements and included inline HTML, JS and VB code!

These are horrible things to inflict on an innocent computer, but that got me started in actually working on software and understanding things beyond the basics of syntax and action. I usually take the other side, that people are focused far too much on the high level stuff and do not pay attention to what is actually going on under the hood. This was an interesting reversal, because the candidate was the opposite. They had some knowledge about the basics, but nothing to build upon that yet.

And until you actually build upon the foundation, it is just a hole in the ground covered in some cement.

Working with legacy embedded types inside documents

time to read 2 min | 338 words

Databases hold data for long periods of time. Very often, they keep the data for longer than a single application generation. As such, one of the tasks that RavenDB has to take care of is the ability to process data from older generations of the application (or even from a completely different application).

For the most part, there isn’t much to it, to be honest. You process the JSON data and can either conform to whatever is in the database or use your platform’s tooling to rename it as needed. For example:
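
The example itself isn’t shown here, but with the usual .NET tooling (a Json.NET style attribute, for instance) the idea is a property-level rename along these lines; the class and property names are hypothetical:

using Newtonsoft.Json;

// Hypothetical example: the documents in the database still use the old
// property name "FullName", while the current class calls it "Name".
public class Customer
{
    public string Id { get; set; }

    [JsonProperty("FullName")]   // read/write the legacy JSON property name
    public string Name { get; set; }
}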

There are a few wrinkles still. You can use RavenDB with dynamic JSON objects, but for the most part, you’ll use entities in your application to represent the documents. That means that we need to store the type of the entities you use. At the top level, we have metadata elements such as:

  • Raven-Clr-Type
  • Raven-Java-Class
  • Raven-Python-Type
  • Etc…

This is something that you can control, using the Conventions.FindClrType event. If you change the class name or assembly, you can use that to tell RavenDB how to treat the old values. This requires no changes to your documents and only a single modification to your code.

A more complex scenario happens when you are using polymorphic behavior inside your documents. For example, let’s imagine that you have an Order document with an internal property called Payment, which can be any of the following types:

  • Legacy.CreditCardPayment
  • Legacy.WireTransferPayment
  • Legacy.PayPalPayment

How do you load such a document? If you try to just de-serialize it, you’ll get a deserialization error. The type information about the polymorphic property is encoded in the document and you’ll need these legacy types to successfully load the document.

Luckily, there is a simple solution. You can customize the JSON serializer like so:
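
The snippet isn’t included here, so this is a sketch of the general idea using plain Json.NET: plug a serialization binder into the serializer so that the legacy type names resolve to your current classes. The exact hook for registering this on the RavenDB client conventions may differ between versions, so treat this as an outline:

using Newtonsoft.Json;

// Sketch: wire a custom binder into the serializer settings.
// LegacyTypesBinder is sketched in the next snippet.
static JsonSerializer CreateSerializerWithLegacySupport()
{
    var settings = new JsonSerializerSettings
    {
        TypeNameHandling = TypeNameHandling.Auto,        // honor the embedded type info
        SerializationBinder = new LegacyTypesBinder()    // resolve the legacy type names
    };
    return JsonSerializer.Create(settings);
}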

And the implementation of the binder is straightforward from that point:
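
A minimal sketch of such a binder (the legacy type names come from the list above; the target payment classes are assumed to exist in your current model):

using System;
using Newtonsoft.Json.Serialization;

// Sketch: map the legacy type names stored in the documents to the
// current payment classes (CreditCardPayment etc. are assumed to exist).
public class LegacyTypesBinder : ISerializationBinder
{
    public Type BindToType(string assemblyName, string typeName)
    {
        switch (typeName)
        {
            case "Legacy.CreditCardPayment":
                return typeof(CreditCardPayment);
            case "Legacy.WireTransferPayment":
                return typeof(WireTransferPayment);
            case "Legacy.PayPalPayment":
                return typeof(PayPalPayment);
            default:
                return Type.GetType($"{typeName}, {assemblyName}", throwOnError: true);
        }
    }

    public void BindToName(Type serializedType, out string assemblyName, out string typeName)
    {
        // Keep writing the default names for new documents.
        assemblyName = serializedType.Assembly.FullName;
        typeName = serializedType.FullName;
    }
}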

In this manner, you can decide to keep the existing data as is or migrate it slowly over time.

Using GOTO in C#

time to read 2 min | 309 words

After talking about GOTO in C, I thought that I should point out some interesting use cases for using GOTO in C#. Naturally, since C# actually has proper mechanisms for resource cleanup (IDisposable and using), the situation is quite different.

Here is one usage of GOTO in RavenDB’s codebase:
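
The snippet isn’t reproduced here, so the following is a made-up example of the pattern being described, not RavenDB’s actual code: the hot path is laid out first and stays tiny, while the rare “list is full” handling is pushed behind a goto:

using System;

// Sketch of the pattern: keep the common path small and linear so the JIT
// will happily inline it; jump to the rare growth path only when needed.
public sealed class PendingWrites
{
    private int[] _items = new int[16];
    private int _count;

    public void Add(int item)
    {
        if (_count == _items.Length)
            goto Grow;              // rare: the list is full

        _items[_count++] = item;    // hot path, falls straight through
        return;

    Grow:
        Array.Resize(ref _items, _items.Length * 2);
        _items[_count++] = item;
    }
}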

This is used for micro optimization purposes. The idea is that we put the hot spots of this code first, and only jump to the rare parts of the code if the list is full. This keeps the size of the method very small, allows us to inline it in many cases and can substantially improve performance.

Here is another example, which is a bit crazier:
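
Again, the real method isn’t shown, so here is a hedged approximation of what a ReadNumber shaped like that could look like (not the actual RavenDB code): the size checks and gotos are arranged so that, once the call is inlined with a constant size, the JIT can fold everything down to a single read:

using System;
using System.Runtime.CompilerServices;

static class NumberReader
{
    // Sketch: jump table by hand. With a constant size the branches fold
    // away and only the matching read survives after inlining.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static long ReadNumber(ReadOnlySpan<byte> buffer, int size)
    {
        long value;
        if (size == 1) goto Size1;
        if (size == 2) goto Size2;
        if (size == 4) goto Size4;
        if (size == 8) goto Size8;
        goto Error;

    Size1:
        value = buffer[0];
        goto Done;
    Size2:
        value = BitConverter.ToUInt16(buffer);
        goto Done;
    Size4:
        value = BitConverter.ToUInt32(buffer);
        goto Done;
    Size8:
        value = BitConverter.ToInt64(buffer);
        goto Done;
    Error:
        ThrowInvalidSize(size);
        value = 0; // never reached, keeps definite assignment happy
    Done:
        return value;
    }

    // The throw is kept out of line so it doesn't bloat the inlined method.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void ThrowInvalidSize(int size) =>
        throw new ArgumentOutOfRangeException(nameof(size), size, "Unsupported number size");
}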

As you can see, this is a piece of code that is full of gotos, and there is quite a bit of jumping around. The answer to why we are doing this is again, performance. In particular, this method is located in a very important hot spot in our code, as you can imagine. Let’s consider a common usage of this:

var val = ReadNumber(buffer, 2);

What would be the result of this call? Well, we asked the JIT to inline the method, and it is small enough that it would comply. We are also passing a constant to the method, so the JIT can simplify it further by checking the conditions. Here is the end result in assembly:

Of course, this is the best (and pretty common for us) case where we know what the size would be. If we have to send a variable, we need to include the checks, but that is still very small.

In other words, we use GOTO to direct the actual machine code output as much as possible, explicitly trading readability to be more friendly toward the machine and gain performance.
