Ayende @ Rahien

Oren Eini aka Ayende Rahien CEO of Hibernating Rhinos LTD, which develops RavenDB, a NoSQL Open Source Document Database.

You can reach me by:

oren@ravendb.net

+972 52-548-6969

Posts: 6,916 | Comments: 49,398

filter by tags archive
time to read 2 min | 339 words

imageI used the term “Big Red Sales Button” in a previous post, and got a question about it. You can see an illustration of that on the right.

The Big Red Sales Button (BRSB from now on) is a metaphor used to discuss how sales can impact an organization. It is common for the sales team to run into new customer requirements. Some of them are presented as absolute requirements (they usually aren’t).

I have found that the typical response of the sales person at this point is to reply “of course we can do that”, go back to the office and hit the BRSB and notify the dev team that they have $tooShortTimeFrame to implement said feature.

In one very memorable case, I remember going over contract details and trying to figure out what we need to do there. Right there, in a minimum seven figures contract, there was a clause that explained what the core functionality of the system and the set of features that were required for it to be accepted.

Most of it was pretty normal business application, nothing too strange. But section 5.1.3.c was interesting. In it, in dense legalese, there was a requirement to solve the traveling salesman problem. To be fair, it wasn’t actually that problem, it was a scheduling problem and I used the traveling salesman as the name for it because it is easier than trying to explain NP complete issues to layman.

I’ll spoil the ending of this post and reveal that I did not solve an NP complete problem. I cheated like hell and actually solved the issue they had (if you limit the solution space, you drastically simplify the cost of a solution).

Sometimes, the BRSB is used for a good purpose. If you have something that can help close a major sale, and it isn’t outrageous to implement it, for example. But in many cases, it is open for abuse.

time to read 2 min | 330 words

I spent the last couple of days in the O’Reilly Architecture Conference and HIMSS (Healthcare Information and Management Systems Society) Conference. During that time, I had the chance of listening to quite a few technical marketing spiels.

Some of them were technically very impressive, but missed the target by a planet or two. I came up with a really nice analogy for how such presentations do a great disservice for their purpose.

Consider the following:

This non-steroidal drug has been clinically tested and FDA approved will cease the production of prostaglandins and has a significant antiplatelet effect. It’s available in tablet and syrup forms and is suitable for IVs. May cause diarrhea and/or vomiting.

This is factual (at least as much as I could make it), I assume that if you are a medical professional you might be able to work out possible uses for this drug. But the most important thing that is missing from this description? What does this do?

This is Ibuprofen and you take it to ease your headache (among many other uses). It can also protect help you avoid blood clots.

I intentionally chose this example, because it is a very obvious one (and I just came back hearing way too much medical stuff). You begin by telling me how this will ease the pain. In many ways, I consider technical marketing to be composed of the following steps:

  • Whatever this product can actually ease the pain.
  • Whatever this customer actually experience the pain.

For example, if you are promising to have a faster than light bullet-train to Mars,  that is going to cast some… doubt on your claims. On the other hand, it doesn’t matter to me if you can cut down my commute time in half if I can get to work while not leaving my house.

If the customer experienced the pain and believe that you can actually help there, you are most of the way there. All that is left is just negotiating, barrier removal, etc.

time to read 3 min | 512 words

imageI’m going to feel like an old man for this post, but if you were born post 1995, it is likely that you have no idea what I’m talking about in this post, crazy as this sounds to me.

Before there was a phone in every pocket, there were land lines. It is like today’s phone, but much larger, you could only do voice calls and if you wanted to screen your calls you needed to buy another appliance. If you’ll watch the first few sessions of Friends, you’ll see how important a detail that can be. If you were out of the house or office and needed to place a call, you could use something called a public phone booth or a pay phone.

Sadly, the easiest way I can convey what this was is to invoke the Tardis. A small booth in which you had a public access phone. Because phone calls used to cost a lot, these phone had a way to drop some coins or tokens into the phone to pay for the phone call.

As a child, I didn’t have a wallet and still needed to occasionally make calls. Being stuck without cash at hand wasn’t such a strange thing so there was another way to perform the call. You could reverse the charge, instead of the person placing the call paying for it, you could call collect. In that case, the person answering the call would be paying for it. Naturally, since money is involved, you need the other party to accept the charge before actually charging them.

At some point in time, you called a special number and told the operator what number you wanted to do a collect call. The operator would ring this number and ask for permission to connect the call and charge the receiver. I think that the rate for a collect call was significantly higher than the normal call, so you wouldn’t normally do that.

As part of the system automation, the phone company replaced the manual operator collect call with an automated system. You would record a short message, which would be played to the other party. If they wanted to accept the call (and the charge), the could press 1 on the phone, or disconnect to avoid the charge.

As a kid, I quickly learned that instead of telling the other party who is calling and why (so they would accept the call), I could just tell them what my phone number is. In this way, they would write down the number, refuse the call and then call me back. That would avoid the collect toll charge.

I remember that at some point the phone company made the length of the collect hello message really short, but I got around that by speaking really fast (or sometimes by making two separate calls). I remember having to practice saying the phone number a few times to get it done in the right time.

time to read 2 min | 396 words

I just had a discussion with a colleague about a fix of non trivial code. The question was what comments should go into the code to explain what was going on.  If you care to know, this related to the prefetching strategy that is used by RavenDB to reduce the amount of I/O that is required (especially on slow disks). The details don’t actually matter. The problem is that there are multiple relatively complex issues there, from managing I/O to thread safety in the critical code path (using dirty reads intentionally), etc.

The problem with doing this is that the code is complex but it is a fairly straightforward progress from the kind of code we usually write in performance sensitive sections. The fear was by over commenting the code, we’ll get ourselves into a situation where we’ll be making the code too malleable to change. This is the kind of code that sits in the perf critical section, you change it after fasting for a day or two (with strong encouragement on meditation about little vs. big endian and why half endian is so rare).

In other words, in practice. You change it when you have reason, and you back up that change with a battery of performance tests. Anything from the usual benchmarks to running production loads on various machines to poring over system traces.

Given the amount of effort that is expected from any changes to this code, I consider it to be a good idea for people who read it to understand that there is a hurdle there that must be jumped before it should be modified. Thus, we decided to skip some of the comments on the reasoning behind the overall design. Here is the most important comment in this code, this is there to explain a particular choice of value and the reasoning that must be applied when it is changed.

What about the whole complexity of the prefetching in general? That isn’t document in code, because reading code comments scattered throughout will make it very hard to grok. This is detailed in the architecture guide that go over these details.

For myself, I find it really awesome to go over a codebase and figure out what reasoning lie behind the code. But when I have people working on my projects? It is better to give them a hand than a riddle.

time to read 1 min | 136 words

This is a screen shot from a CV I just read:

image

The CV itself is kind of boring. Just graduated, did so and so course and was excellent in foo and bar.

We see dozens of CVs like that on a regular basis. But the portfolio link was very nice. It linked to a Google Drive folder with a bunch of games that the candidate made, in various languages.

I didn’t actually went and read all the code, but I really skimmed through a bunch of projects there. I actually like the portfolio a lot better than a github link. A portfolio is explicitly about showing a potential employer what you can do. A github link can be used for many things.

time to read 3 min | 505 words

When a candidate sends a CV and includes a GitHub profiler, that almost always guarantees that I’ll give that profile a look. The most interesting thing from my perspective in a GitHub profile is that it allows me to look at the candidate’s work. There aren’t that many candidates with GitHub profile links, and not having a link isn’t something that will cause me to rule out a candidate. But I thought it would be interesting to share some of my finding from such trawling of repositories.

Here is an example of something that I don’t like:

image

In fact, in most code bases, I’ll skim very quickly to find the data access code. SQL Injection is a pet peeve of mine, and seeing how a candidate’s code handle user’s input is an easy way to get a first impression. It isn’t always indicative of “this person has no skills and is careless”, mind. But I found that it is a good place to start. Especially because mostly I’ll see sample projects and half finished stuff. So seeing how they treat this particular issue (which is easily found and should be familiar to most developers) is a good quick check. Then again, here is the same candidate, with another repository:

image

This is using Hibernate, by the way. And that kind of hurt my feelings, to be fair.

On the other hand, a different candidate:

image

That is a much better, and show that they pay attention to other functional requirements.

In general, I consider the presence of a GItHub link in a CV as an invitation to evaluate the candidate’s work and will do so with the goal of understanding their approach, the quality of their code and their skills. As such, if you include a GitHub link in a CV, I would recommend consider this to be your public face and a criterion for evaluation.

This is an advantage. It means that the GitHub link mere existence make you pop out of the crowd. On the other hand, it also means that your code is under scrutiny.

I’m advising here for people starting out, without much background. As such, having a straightforward way to be evaluated on your skills is a plus. I would suggest making it easier. For example, a clear README is nice, especially if you explain what you were trying to do. “Playing around with Angular to see how it feels” is a great thing to have, because it gives context to the person reading your code. Especially for web applications and client side work, having a visible demo that I can quickly look at is great.

On the other hand, having well known bad practices (such as SQL Injection, plain text passwords, etc) in the code is a big negative.

time to read 1 min | 108 words

I just got a CV from a candidate looking for a junior position. I looked at the CV (and oh my God, did this guy have a lot of acronyms in there). I noted that he has a GitHub account in the CV, so naturally I checked it.

There is a single repository there, which I’ll present to you in all its glory:

image

This is actually a negative. If he didn’t have a GitHub account, I wouldn’t have minded. But including one that is in this shape is not a good idea.

time to read 2 min | 275 words

imageMy daughter is 3​¼ years old now. About the time that she was born, I decided that I needed to make a small change in my language. Whenever I felt the urge to curse, I would say a food’s name. For example, after being puked on, my reaction would be some variant of: “PASTA”, “PASTA BOLOGNESE” or other pasta’s favorites.

As time went by, I got better and better at expressing emotion through increasingly disturbing food references. My current favorite is: “Pasta Bolognese with pickled carrots in a bun with anchovies and raw eggs”.

A couple of days ago, I took my daughter and a friend to an ice cream shop. As expected of an ice cream shop in the middle of (very hot) summer, the place was packed. My daughter was quite excited to go there and expressed her emotions by standing up and shouting at the top of her lungs (she is three, with a voice that carry like a foghorn): “PASTA! PASTA BOLOGNESE” over and over again.

This is a crowded shop, full of small kids and parents. I got some looks for the little girl holding up a full ice cream cone and shouting about pasta, but it was infinitely preferable to the alternative.

An unforeseen side effect, however, is that because I can, I’m very free with pasta based profanities. This had led to what is effectively a competition, with her trying to cause me to go overboard with that.

And now I must go back to work, before the gluten police arrival.

time to read 2 min | 273 words

imageI just got back from watching the Incredibles 2. The previous movie was a favorite a mine from first view, and it is one of the few movies that I can actually bear to watch multiple times. I was hoping for a sequel almost from the moment I finished the first movie, and it took over a decade to get it.

I actually sat down with my 3 years old daughter to watch the first movie before I went to see the second one. I’m not sure of how much she got from it, although she is very fond of trains and really loved the train scene (and then kept asking where is the train). It is unusual for me to actually “prepare” to see a movie, by the way. But it did mean that I had the plot sharp in my head and that I could directly compare the two movies.

First, in terms of the plot. It was funny, especially since I got a kid now and could appreciate a lot more of the not so subtle digs at parenthood.

Second, in terms of visuals, wow, it improved by a lot. The original movie held up really good in terms of visuals in the past 12 years, but the new one is visibly better in this term.

Also, at this point I nearly got a heart attack because a talking book (again, my daughter) starting neighing at me at the middle of the night just as I got into the house.

Highly recommended.

time to read 2 min | 372 words

hI started programming with that orange turtle ( I think it was supposed to be green, but we had bad CRT screens ) by drawing stuff on the screen. I think that I was in fifth grade or so. I later graduated to VB (IIRC, that was VB3 or VB4), but my first formal programming education was done in Pascal. And I was pretty good (for a high school kid who merely dabbled), but I just couldn’t figure out pointers. I mean, they made absolutely no sense whatsoever.

Take the example of an “infinite” size stack, that was the example that we were given during class, and I just couldn’t follow it. Take the stack example, like so:

You might notice that this code is limited, if you are storing more than 3 items, the value will be silently ignored, which is probably not what you want. High school me would agree that this is bad, and therefor increase the STACK_SIZE variable to a ridiculously high size, such as 100). That would surely be big enough for everything, right?

I remember really struggling with the concept of dynamic memory management, not so much as because of the API, but because I  couldn’t make any sort of sense about what I was supposed to do there.

After high school, I went and high end course in C++. That took about a year, and I highly recommend the course (even though I don’t think they run it), it taught me a lot about basic stuff such as how things actually work. We started with low level C in DOS, and build on top of that all the way to MFC and ATL. And at some point, the instructor introduced dynamic memory management. And it was so blindingly obvious that I never actually realized that I’m learning the same concept that gave me so much grief in the past.

I had that experience several times since then. I try to learn something, and I just bounce, hard. At a while later, I do the same thing, or fight a slightly  different, and get it. Bug I’m no sure how I go from “what the hell” to “oh, this is obvious”.

FUTURE POSTS

  1. Researching a disk based hash table - 3 hours from now

There are posts all the way to Nov 14, 2019

RECENT SERIES

  1. re (24):
    12 Nov 2019 - Document-Level Optimistic Concurrency in MongoDB
  2. Voron’s Roaring Set (2):
    11 Nov 2019 - Part II–Implementation
  3. Searching through text (3):
    17 Oct 2019 - Part III, Managing posting lists
  4. Design exercise (6):
    01 Aug 2019 - Complex data aggregation with RavenDB
  5. Reviewing mimalloc (2):
    22 Jul 2019 - Part II
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats