Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by email or phone:

ayende@ayende.com

+972 52-548-6969


Book recommendations in the test of time

time to read 8 min | 1409 words

Technical books are interesting. Some of them last for decades, some of them are valid only for a session. I had a few discussions recently about books at a conference, in particular, what books I would recommend. That got me to really think about the topic. There are a lot of books that were really valuable to me when I read them, but that wouldn’t really make sense to recommend or talk about today. Not because they are bad books, but because both the industry and the reader have changed.

Consider a book that I was really impressed with at the time: Patterns of Enterprise Application Architecture.

It is a great book, and quite interesting. But if you try to write anything based on its contents today, you are doing yourself and your employer a great disservice. This is a book that you’ll read, today, to understand how the existing libraries and frameworks are put together. Interesting, certainly, but much less relevant than when it came out over 15 years ago.

There are a lot of such cases. Books that are relevant for the time period in which they were written, or sometimes even to a particular time period in the career of their reader. An example of a book that I was quite taken with is Code Complete. For very much the same reasons as the PoEAA book, it is much less relevant now than when it came out. This is because the ideas expressed in these books have won; they are both ubiquitous and expected.

That is not to say that you’ll always find these practices followed, but they are the ground floor you’re expected to start from, not something that you need to aim at and strive for.

Because of this, I am actually struggling to think of good technical books that I believe would withstand the test of time. Anything that is too tech specific usually has an expiration date attached to it. And even if we are talking about concepts and ideas, I’m interested in things that will give me more than just information, that actually provide something more. A good book in this regard is something that will change how I’m doing things for a long time. A compiler book would teach me how to write parsers and work with ASTs, how to generate code and a lot of details of this nature. And that would be very valuable, but it would also usually be knowledge that is very specific to a task and place. It won’t be generally applicable.

Thinking back over all the technical books I have read, there are just a few that I can point to and say: “This book changed the way I write code and build systems”. These books are typically still relevant today, and I can happily recommend them to developers at every stage of their career.

The ones that really pop to mind are:

Release It!

I read it a few times, which is pretty rare for me with technical books, and I have a few copies floating around in the office so I can tell people, “Read this and you’ll get it”.

The ideas there about building robust production systems, what the challenges are and the things to watch out for, are invaluable. The patterns outlined in the book, from circuit breaker to explicit transparency, have been invaluable for the software I write.

I do have to point out that the tech in the book is often Java (circa 2010, I guess). So when the book discusses specific options, that is often not relevant, but the content and the ideas are fascinating and have made a major impact on how I write code and architect systems.

Working Effectively with Legacy Code

I remember reading this book and going “Ahhh” several times over. The book talks about how you can approach a legacy codebase and make changes to it, and presumably it is useful in this regard. I read it very early in my career, before I really had the chance to write enough code to call it legacy, and I have used the techniques in the book to avoid getting myself into too much trouble over time.

I should note that a lot of the things that are discussed, such as creating seams in the system so you can write tests for it, are actually very useful for many other things. One of the things that I have noticed is that I will routinely make use of such seams for debugging: to provide additional behavior and insight into what the system is doing, explicitly to find a particular issue.

For some reason, I haven’t seen a lot of usage of that, but I consider debug hooks to be a really important feature of good software, and I started doing that as a direct result of the kind of things that I read in Working Effectively with Legacy Code.
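
To make this concrete, here is a minimal sketch of a seam that doubles as a debug hook; the class and the scenario are invented for illustration, not taken from any particular codebase:

    using System;

    public class Indexer
    {
        // The seam: tests or debugging sessions can observe (or replace) the
        // behavior here without touching the production code path.
        public Action<string> OnDocumentIndexed = _ => { };

        public void Index(string documentId)
        {
            // ... actual indexing work ...
            OnDocumentIndexed(documentId);
        }
    }

    // During an investigation, hook it to see what the system is doing:
    // indexer.OnDocumentIndexed = id => Console.WriteLine($"Indexed: {id}");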

Operating System Concepts

I have to admit that I have a much older edition of this book, and I like that cover a lot more. But I think you will be more interested in hearing about the contents of this book.

This book, as well as Operating Systems: Design and Implementation, covers how operating systems actually work, the major components and how they are actually put together. The topic may seem pretty academic and not of much use for application developers, but I found it fascinating when I read it for the first time and I think it is very relevant today. Not so much the details, which are quite often different between operating systems and operating system versions, but the actual high level concepts.

It also helps in understanding what is actually going on when you are running code on a machine. Things like how threading is implemented, how the OS decides what runs, how memory works and how you can make use of that, etc.

I wouldn’t have been able to write RavenDB if I didn’t have a good grasp of all these details, and the 4.0 release has been quite explicit about building software so the operating system can help us, instead of having to fight it. In order to do that, we needed to understand how the OS works, what it expects applications to do and how to actually make the best use of that.


You might note that there are a lot of books that aren’t here. Nothing about source control or writing tests, the Pragmatic Programmer or Design Patterns. To head things off at the pass, it isn’t that these are not important, but at this point, talking to an experienced developer, I just assume that that kind of knowledge is already ingrained.



Federico had the following recommendation: Modern C++ Design is one of those books that literally breaks your understanding of how code is built and interpreted. I remember taking this book back in early 2002, when I saw it standing on the counter of the computer science department library. I somehow convinced the secretary to give it to me under the promise of returning it, because there was a professor waiting for it. I read it completely in a week. End result: either this guy was absolutely crazy or I didn’t understand a thing (later I discovered it was not the former). So I did what anyone responsible enough would do; start all over again, and not return the book. I had to read it 3 times in the space of a month to barely grasp the concepts, and got fined because of the late return a month later :D

I would love to hear about the books that you found fundamental to your career.

Book recommendation: Serious Cryptography

time to read 2 min | 291 words

I haven’t done a technical book recommendation for a while, but I think that this is a really great book to break that streak.

Serious Cryptography talks about cryptography, obviously, but it does it in such a way that it is understandable. I think that it is unique in the sense that most of the other cryptography books and materials that I have read either started from so many baseline assumptions or were so math heavy that they were not approachable. The other type of cryptography book, like The Code Book, is more in the vein of popular science. They give you background, but nothing actionable.

What I really liked about Serious Cryptography (henceforth, the book) is that it is a serious discussion of cryptography without delving too deeply into math (but with clear explanations and details on it) and that it is practical. Oh, it isn’t an API guideline and it isn’t something that you can just pick up and learn cryptography from, but it does an amazing job of laying out the field and explaining all sorts of concepts and ideas that are generally just assumed.

I read it in two days, because it was fascinating reading and because it is relevant to what I’m actually doing. Some of the most fun parts are the “how things fail” sections, where the author discusses various failures that happened in the real world, what caused them and what actions were taken as a result.

If you have any interest in security whatsoever, I highly recommend this book.

And if you have good technical books to recommend, especially in a similar vein, I would love to hear about them.

PR Review: The simple stuff will trip you

time to read 2 min | 285 words

In a recent PR, I ran into this code, which is used in query generation to decide if we need to quote a particular alias. The code itself is pretty straightforward and easy to follow:

[code screenshot: the original alias check, calling ToUpper and scanning an array of known aliases]

It also has two distinct issues. First, there is the allocation because of the ToUpper call, and second, we are doing an O(N) search on the alias array every single time.

I asked for a change: to use a HashSet with the OrdinalIgnoreCase comparer.

Here is the change I got back:

[code screenshot: the updated code, now using a HashSet of aliases]

This is exactly what I asked for, and it is very subtly wrong. We are now saving an allocation, which is great, but the problem is with the Contains method.

[code screenshot: the Contains call in question]

This looks okay, but this is not HashSet<T>.Contains; instead, this is the Enumerable.Contains extension method, which iterates over the set and compares each value.

The fix is also simple:

[code screenshot: the fixed code]

And now we don’t have O(N) anymore.
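
To make the difference concrete, here is a minimal sketch of the two shapes; the alias list and names are invented for illustration and this is not the actual RavenDB code:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    static class AliasQuoting
    {
        static readonly HashSet<string> Aliases = new HashSet<string>(
            new[] { "AS", "SELECT", "WHERE" });

        // Passing a comparer forces overload resolution to the Enumerable.Contains
        // extension method, which scans the whole set: O(N) per call.
        public static bool NeedsQuotingSlow(string alias) =>
            Aliases.Contains(alias, StringComparer.OrdinalIgnoreCase);

        // Baking the comparer into the HashSet itself keeps the real
        // HashSet<T>.Contains in play: an O(1) lookup.
        static readonly HashSet<string> AliasesFast = new HashSet<string>(
            new[] { "AS", "SELECT", "WHERE" }, StringComparer.OrdinalIgnoreCase);

        public static bool NeedsQuoting(string alias) =>
            AliasesFast.Contains(alias);
    }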

Although I’ll admit that for such a small size, it probably doesn’t matter.

PR Review: Encapsulation stops at the assembly boundary

time to read 2 min | 247 words

The following set of issues all fall into code that is used within the scope of a single assembly, and that is important. I’m writing this blog post before I’ve had the chance to talk to the dev in question, so I’m guessing about intent.

[code screenshot: a change exposing a dictionary as IReadOnlyDictionary]

This change is likely motivated by the fact that callers are not expected to make a modification to the resulting dictionary.

That said, this is used between different components in the same assembly, and is never exposed outside. That means that we have a much higher level of trust between the components, and reading IReadOnlyDictionary means that we need to spend more cycles trying to figure out who you are trying to protect against.

Equally important, in this case the Dictionary methods can be called without any virtual call overhead, while IReadOnlyDictionary needs interface dispatch to work.
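
Here is a hypothetical illustration of that tradeoff (this is not the code from the PR; the names are invented):

    using System.Collections.Generic;

    internal class IndexState
    {
        private readonly Dictionary<string, long> _etags = new Dictionary<string, long>();

        // Communicates "don't modify this" to callers, but every access on the
        // returned value goes through interface dispatch.
        public IReadOnlyDictionary<string, long> EtagsReadOnly => _etags;

        // Inside a single assembly, where the components trust each other,
        // returning the concrete type keeps the calls direct and non-virtual.
        public Dictionary<string, long> Etags => _etags;
    }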

[code screenshot: a null check added on a method argument]

This is a case that is a bit more subtle. existingData is a parameter that is passed to a method. The problem is that in this case, no one is ever going to send null, and sending null is actually an error.

In this case, if we did get a null, I would rather the code immediately crash with “what just happened?” than limp along with bad data.
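
A hypothetical sketch of the pattern in question (the names are invented, not taken from the PR):

    using System.Collections.Generic;

    internal class DataMerger
    {
        // Defensive version: if a buggy caller passes null, we silently limp
        // along with incomplete data and hide the real problem.
        public void ApplyDefensive(Dictionary<string, long> existingData)
        {
            if (existingData == null)
                return;
            // ... apply the data ...
        }

        // Preferred here: just use the argument. A null from a caller fails
        // immediately at the point of the bug instead of corrupting state later.
        public void Apply(Dictionary<string, long> existingData)
        {
            foreach (var kvp in existingData)
            {
                // ... apply the data ...
            }
        }
    }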

PR Review: Beware the things you can’t see

time to read 1 min | 110 words

I had to reject the following change in a recent PR. In this context, flags and conflicted.Flags are the same, and that wasn’t the problem. Can you spot the issue?

[code screenshot: the rejected change]

The problem is that the second version does an allocation. It does this silently, and you need to know about this issue to know that it happens. There is a good discussion of this in this StackOverflow question.

It looks like this has been fixed in the JIT for CoreCLR and will be part of the 2.1 release when it is out.
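
One well-known allocation that fits this description is Enum.HasFlag, which boxed its operands until it became a JIT intrinsic; my guess at the shape of such a change would look something like this (the enum and names are invented, not the actual code):

    using System;

    [Flags]
    enum ItemFlags { None = 0, Resolved = 1, FromReplication = 2 }

    static class FlagChecks
    {
        // Explicit bitwise check: no allocation.
        public static bool IsResolvedOld(ItemFlags flags) =>
            (flags & ItemFlags.Resolved) == ItemFlags.Resolved;

        // Enum.HasFlag reads nicer, but on runtimes without the JIT intrinsic
        // it boxes both enum values, allocating silently on every call.
        public static bool IsResolvedNew(ItemFlags flags) =>
            flags.HasFlag(ItemFlags.Resolved);
    }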

Keeping track of long running branches

time to read 2 min | 387 words

I talked about the Merge Games somewhat in jest, but more seriously, there is a lot to worry about once you have long running branches. In our case, it isn’t so much that we have a lot of long running branches as that we have a ton of changes happening in multiple branches in parallel, and it can sometimes take a few weeks until the work is done and we can merge it all.

This puts a lot of pressure on the code review part of the process. One of the things that I really like about GitHub is the PR / review process, and it works great when you have small commits / PRs. The problem is that when you are talking about a large scope of work, you are left with few options for proper review.

One option is to get a PR with dozens of commits and have to slog through each of them to understand what is going on. Another is to get a PR with a single commit that contains a lot of changes, which means that you have to grasp the whole change in one shot. Either option is really hard, and can lead the reviewer to skim through the code. That isn’t something that we want; instead, we really want to pay as much attention to the code as we did while writing it.

My process for handling this is to lean heavily on GitHub. What I do is create a PR very early in the process, sometimes immediately after the first commit in that branch. That gives me the ability to review things incrementally. Instead of having to deal with it all at once, I can review the changes as they come in. Whenever one of the developers pushes their commits, I’ll get a notification and be able to go over the details and comment on the spot.

That shortens the feedback cycle and removes a lot of the complexity from the review process. It also means that we can more easily notice that one developer is doing something that is also being done by another team, so we can integrate the work earlier in the process.

Let the Merge Games begin!

time to read 2 min | 275 words

We currently have four different teams that are working on large modifications to RavenDB.

Large modifications means that they are working on a feature for a relatively long time and very frequently need to make changes to large swaths of the code base. Oh, and the common theme for all of them is that they are all big enough that you cannot just merge back into the main branch. There are often a lot of failing tests, or even an uncompilable state, during a long refactoring session.

The good thing is that we are pretty good about making sure that we merge from the main branch on a regular basis. The bad thing is that once we start merging those big changes, the other large refactorings are going to have to deal with a lot of changes happening very quickly.

Hence, the merge games. The faster one of those teams is able to hit the “can we put this in the main branch” point, the less work it is going to be for them. On the other hand, the slower you are in getting to that point, the more conflicts you are likely to run into and have to resolve.

I don’t think it would be a best seller book series and I doubt that I’ll get a movie deal from it, but in a certain select group of people, I think that this will be an amazingly fun game (as long as you aren’t the one left holding the shitty end of the merge conflicts).

PR Review: Is your error handling required?

time to read 1 min | 129 words

While reviewing a PR, I ran into what seemed like a strange thing. Take a look at this change:

[code screenshot: error handling that wraps a failure in a specialized exception]

This came with its own exception class, and left me pretty confused. Why would I want to have something like that?

Here we have some error handling code that doesn’t seem to add any additional value. Everything in the error here can be deduced from the details of the exception that would be thrown if we did nothing.

The fact that we throw a specialized exception might be meaningful, but looking at the code, this isn’t actually used for anything.

Like all code, error handling needs to justify itself, and this one doesn’t pass the bar.
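
A hypothetical reconstruction of the kind of code being rejected (the exception type and the operation are invented, not taken from the PR):

    using System;
    using System.IO;

    public class IndexCleaner
    {
        public void DeleteIndexFile(string indexPath)
        {
            try
            {
                File.Delete(indexPath);
            }
            catch (Exception e)
            {
                // Everything in this message (the path, the underlying error) is
                // already in the original exception, and nothing in the codebase
                // catches IndexDeletionException specifically, so the wrapper
                // adds no value.
                throw new IndexDeletionException($"Could not delete index file: {indexPath}", e);
            }
        }
    }

    public class IndexDeletionException : Exception
    {
        public IndexDeletionException(string message, Exception inner)
            : base(message, inner) { }
    }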

Reviewing Resin: Part VI – Analyzing I/O and being unfair

time to read 4 min | 752 words

Looking back at this series, I have the strong feeling that I’m being unfair to Resin; I’m judging it using the same criteria I would use to judge our own production, highly optimized code. The projects have very different goals, maturity and environments. That said, I think that a lot of the comments I have on the project are at the implementation level. That is, they can be fixed (except maybe the analyzer / tokenizer pipeline) by simply optimizing one method at a time. Even the architectural change with analyzing the text isn’t very big. What is important is that the code is quite clear, easy to follow and has a well defined structure. That means that it is actually possible to make these changes as the project matures.

And now that this is out of the way, let me cover some of the things that I would have done differently in the codebase. A lot of them are I/O related. In particular, the usage of all those different files and the way this is done is decidedly non optimal: opening and closing the files constantly, reading and seeking all over the place, etc. The actual design seems to be based around an LSM, even if this isn’t stated explicitly. That has pretty good semantics for writes already, but reads currently are probably leaning very heavily on the file system cache, and that won’t work as the data grows beyond a certain size.

When running on a Unix system, you also need to consider the fact that there is a limit to the number of open files you can have, so a smaller number of files is generally preferred. I would go with merging all those files into a single large one, similar to the compound file format that Lucene uses.

Once that is done, I would also memory map the entire file and use direct memory accesses to handle all I/O. This has several very important advantages. First, I’m being a lot more explicit about using the file system cache, and that would allow us to avoid a lot of system calls. Second, the data is already mostly structured as arrays, so it would be very natural to do so. This also avoids the need to manually buffer things in our own memory, which is always nice.
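
A minimal sketch of the memory-mapped approach in C# (the file name, layout and offsets here are made up for illustration):

    using System;
    using System.IO;
    using System.IO.MemoryMappedFiles;

    static class CompoundFileReader
    {
        public static void ReadHeader(string path)
        {
            using var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open);
            using var accessor = mmf.CreateViewAccessor();

            // Reads go straight through the OS page cache; no explicit Read / Seek
            // calls and no buffering of our own.
            long termCount = accessor.ReadInt64(0);
            int firstPostingOffset = accessor.ReadInt32(8);

            Console.WriteLine($"{termCount} terms, first posting at {firstPostingOffset}");
        }
    }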

Next, there is the need to consider consistency checks. Resin as it stands now (I’m not sure if this is an explicit design decision) takes the position that it is not its job to ensure file consistency. Lucene makes some attempt to ensure consistency, and usually fails at that horribly at the most inconvenient moments. Adding a hash to the file would make it possible to verify that the data is okay, but it means having to read the entire file when you open it, which is probably too expensive.

The other aspect that needs attention is the data structures used. In particular, LcrsTrie is a good idea to save space and might work well for in-memory usage, but it isn’t a good choice for a persistent data structure. B+Tree or SST are the common choices, and need to be evaluated for the job.

As part of this, and quite importantly, I would recommend taking a look at the full I/O story. That means:

  • How do you write to disk?
  • How do you update data?
  • Do you have write amplification (merges)?
  • Are you aiming for consistency / ACID?
  • Can you explain how your data is persisted, using algorithms / approaches that are well known and trusted?

The latter is good both for external users (who can then reason about your system) and for yourself. If you are using an LSM, you know that you have a set of problems (compactions, write amplification) and solutions (auto optimizing over time, etc.) that are well known, and you can make use of that. If you are using B+Trees, the problems and the solution space are different, but there is even more information about them.

If you are aiming for consistency, are you using a WAL or append-only files? What are your consistency guarantees, etc.?

Those are all questions that need answers, and they have an impact on the design of the project as a whole.

And with this, this series is over. I have to say that I didn’t think that I would have so much to write about. It is a very interesting project.
