Developer prerequisites

time to read 6 min | 1093 words

I mentioned that we are currently hiring for a junior dev position and we have been absolutely swamped with candidates. Leaving aside the divorce lawyer that tried to apply to the position and the several accountants (I don’t really get it either) we typically get people with very little experience.

In fact, this position is explicitly open to people with no experience whatsoever. Given that most junior positions require a minimum of two years, I think that got us a lot of candidates.

The fact that we don’t require prior experience doesn’t meant that we don’t have prerequisites, of course. We are a database company and the fundamentals are important to us. A typical task in RavenDB involves a lot of taxes, from ACID compliance, distributed computing, strict performance requirements, visibility into the actions of the database, readability of the code, etc.

I talked before about the cost of a bad hire, and in the nearly a decade that passed since I wrote that post, I hasn’t changed my mind. I would rather end up with no one than hire someone that isn’t a match for our needs.

Our interview process is composed of a phone call, a few coding questions and then an in person interview. At this point, given that I have been doing that for over a decade, I think that I interviewed well over 5,000 people. A job interview stresses some people out a lot. Yesterday I had to listen to a candidate speak so fast that I could barely understand the words and I had to stop a candidate and tell them that they are currently in the 95% percentile of people I spoke to, so they wouldn’t freeze because of a flubbed question.

I twitted(anonymously) about the ups and down of the process and seem to have created quite a lot of noise. A typical phone call for a potential candidate takes about 15 – 30 minutes and is mostly there to serve as an explicit filter. If they don’t meet the minimum requirements that we have, there is no point in wasting either of our time.

One of the questions that I ask is: Build a phone book application that stores the data in memory and outputs the records in lexical order. This can stump some people, so we have an extra question to help. Instead of trying to output the data in lexical order, how would you ensure that you don’t have a duplicate phone number in such a system? Scanning through the entire list of records each time is obviously not the way to go. If they still can’t think of a way to do that the next hint is to think about O(1) and what data structure would fit this requirement. On the Twitter thread, quite a few people were up in arms about that.

Building a phone book is the kind of task that I remember doing in high school programming class as a teenager. Admittedly, that was in Pascal, but I just checked six different computer science degrees and for all of them, data structures was a compulsory course. Moreover, things like “what is the complexity of this operation” are things that we do multiple times a day here. We are building a database here, so operations on data is literally our bread and butter. But even for “normal” operations, that is crucial. A common example, we need to display some information to the user about their database. The information actually come from two sources internally. One is the database list which contains various metadata and one is the active database instance, which can give us the running stats such as the number of requests for this database in the past minute.

Let’s take a look at this code:

The complexity of this code is O(N^2). In other words, for ten databases, it would cost us a hundred. But for 250 databases it would cost 62,500 and for 500 it would be 250,000. Almost the same code, but without the needless cost:

This is neither theoretical nor rare, and it is part of every programming curriculum that you’ll find. Further more, I’m not even asking about the theoretical properties of various algorithms, or asking a candidate to compute the complexity of a particular piece of code. I’m presenting a fairly simple and common task (make sure that a piece of data in unique) and asking how to perform that efficiently.

From a lot of the reactions, it seems that plenty of people believe that data structures aren’t part of the fundamentals and shouldn’t be something that is deeply embedded in the mindset of developers. To me, that is like saying that a carpenter shouldn’t be aware of the difference between a nail or a screw.

Rob has an interesting thread on the topic, which I wanted to address specifically:

The first and most obvious problem is "what language?". JavaScript, Ruby, and Python have abstracted the idea of an array (and hash) so there are things you can do with them that might qualify as O(1) but then again, might not. I think what Oren was going for was ...
— Rob Conery (@robconery) February 25, 2021

It does not matter what language, actually. In JavaScript, you’ll not use a “new Hashtable()”, you’ll use an object, sure, but that is a meaningless detail. In fact, arrays implemented as hashes in JavaScript maintain most of their important properties, actually. O(1) access time by index being the key. If you want to go deeper, than in practice, the JS runtime will usually use an actual array if it detects that this is how you use the variable.

And I’m sorry, that is actually really important for many reasons. The underlying structure of our data has huge impact on what you can do with that data and how you can operate on it. That is why you have ArrayBuffer and friends in JavaScript now, because it matters, a lot.

And to the people whose 30 years experience never included ever needing to know those details, I have two things to say:

Either your code is full of O(N^2) traps (which is sadly common) or you know that, because lists vs. hash is not something that you can really get away with.
In this company, implementing basic data structures is literally part of day to day tasks. Over the years, we needed to get customize versions of arrays, lists, dictionaries, trees and many more. This isn’t pie in the sky stuff, that is Monday morning.

Tweet Share Share 7 comments

Tags:

Comments

26 Feb 2021
09:23 AM

Rémi BOURGAREL

I understand absolutely your point about the minimum you are asking for your applicant, and this is your company so ... . I improved a lot of code by switching loops to dictionary/hashtable usage but I didn't know about the big O notation, I just did it because it seems that it would do less work and it did. Which I can say is sad because once you can put a name/label on something then it's easier to find and fix it. I learned about the big O notation when I worked for my latest interview. I understand this is a phone call, but maybe you can send before the interview a link to an explanation about the big O. Or maybe ask "do you know about the big O notation ?" if they say no, then instead of saying "O(1)" maybe say "find in a single operation", you might be missing good candidate.

26 Feb 2021
10:24 AM

Olivier

I had a look at the job offer and the tweet and...

Well on the Twitter side, it seems over-reaction and not-so-civil discourse are the usual expression means over there... Yes I think that based only on the context you gave I'd have deemed the question kind of vague, but then I'd have supposed it was not so during the real interview or asked if it was instead of immediately treating you evil... Still I kind of agree with the fact that one can know a hash based structure would be better without being fully aware of its O(n) category. I think the suggestion to propose an implementation language makes sense not because it does matter, but rather so that the question is less scary and the interviewee is not stuck with too abstract a question. Then if you're looking for people able to abstract away their knowledge, it makes sense to stick to the abstract CS level. A functional language company may ask about monads, functors or even how to efficiently implement immutable data structures I guess.

But then if (real) knowledge of O(N) notation is important to you (and I'm not questioning the fact it is), maybe the LinkedIn offer should say that you require more than just "Basic knowledge of .NET". I've done a little bit of technical interviews myself with much lower requirements (you know, big software services company eating 10s of new junior devs for breakfast) and very few of my interviewees could have answered anything regarding O notation or found a correct solution to the problem either btw... To sum this up, I really think the offer should state you need people aware of (and loving) data structures and optimization... It's not clear as well that the product is a database, but then I think it'd be the minimal duty for any applicant to google "Hibernating Rhinos" if they're too young not to know it already.

Anyway, hiring is tough and all in all, you handle your interviews how you feel is the best for your company...

PS : completely off topic, but I always thought (for at least the 15y I know of your works) that you were from Brasil. I could not say really why I had this in mind. Maybe because of your name and alias - which btw I don't know which one is what - that sound Portuguese to my ears) :D

26 Feb 2021
21:15 PM

Donnie

I saw the tweet in near real-time. I'll confess that I read "has O(1) properties" as "is guaranteed O(1)" - which is not what was asked. That threw me off.

27 Feb 2021
07:58 AM

Rafal

I symphatize entirely with Oren's attitude. The current job market is one big green pasture for developers, and there are lots of wannabe programmers who will usually land a nice job much above their qualifications and much above salary they really deserve. And do they have any problems with, for example, the impostor syndrome - not at all! Ok, i understand 'fake it till you make it', but i've met some cases of 'fake it until the end of the world but never, under any circumstances, admit any shortcomings'. A developer who doesnt know the characteristics of basic algorithms and data structures is not a developer imho (not even a junior), and if you add unwillingness even to learn these things then you get a perfect recipe for a 'negative total worth' employee. The painful thing is that very often hiring decisions are made by someone else in the company and without any proper technical questioning of the candidate, and they get hired just because they are nice, can spin some 'awesome' stories on the interview and have some prior experience with big name company. And then you can't get rid of them, you must work with someone who can't contribute at the right level and is mostly useless.

28 Feb 2021
07:00 AM

Oren Eini

Remi,

The question wasn't about big O(1). It was about finding whatever a phone number was unique in a phonebook application. I gave the O(1) as a hint when the candidate flubbed the question.

Those sort of operations are incredibly common, I don't expect a candidate to recite to me the textbook definition of computing complexities, but to know the basic properties of one of the mostly used data structures? That is absolutely expected.

28 Feb 2021
07:04 AM

Oren Eini

Olivier,

"Ayende Rahien" is taken from a book (the Wheel of Time), but it does appear to have some Latin context. Completely accidental on my part.

As for "basic knowledge", knowing the properties of Dictionary and List is basic, in my opinion. Maybe not as attractive as the latest react update, but for backend work, that is pretty much the bread and butter.

28 Feb 2021
07:07 AM

Oren Eini

Rafal,

I agree. Junior developer usually means someone that has either completed a CS degree or has equivalent experience / knowledge.

I did some tutoring for high school kids learning to program. Among other things, we are talking age ~15 or so.

They were aware of the use cases of map vs. list and I was working with them on threading and locks work.

I wouldn't hire them for this position, since they lack too much knowledge, but they can go probably go through the initial phone screening process.

Comment preview

Comments have been closed on this topic.

Markdown turns plain text formatting into fancy HTML formatting.

Phrase Emphasis

*italic*   **bold**
_italic_   __bold__

Links

Inline:

An [example](http://url.com/ "Title")

Reference-style labels (titles are optional):

An [example][id]. Then, anywhere
else in the doc, define the link:
  [id]: http://example.com/  "Title"

Images

Inline (titles are optional):

![alt text](/path/img.jpg "Title")

Reference-style:

![alt text][id]
[id]: /url/to/img.jpg "Title"

Headers

Setext-style:

Header 1
========
Header 2
--------

atx-style (closing #'s are optional):

# Header 1 #
## Header 2 ##
###### Header 6

Lists

Ordered, without paragraphs:

1.  Foo
2.  Bar

Unordered, with paragraphs:

*   A list item.
    With multiple paragraphs.
*   Bar

You can nest them:

*   Abacus
    * answer
*   Bubbles
    1.  bunk
    2.  bupkis
        * BELITTLER
    3. burper
*   Cunning

Blockquotes

> Email-style angle brackets
> are used for blockquotes.
> > And, they can be nested.
> #### Headers in blockquotes
> 
> * You can quote a list.
> * Etc.

Horizontal Rules

Three or more dashes or asterisks:

---
* * *
- - - -

Manual Line Breaks

End a line with two or more spaces:

Roses are red,   
Violets are blue.

Fenced Code Blocks

Code blocks delimited by 3 or more backticks or tildas:

```
This is a preformatted
code block
```

Header IDs

Set the id of headings with {#<id>} at end of heading line:

## My Heading {#myheading}

Tables

Fruit    |Color
---------|----------
Apples   |Red
Pears	 |Green
Bananas  |Yellow

Definition Lists

Term 1
: Definition 1
Term 2
: Definition 2

Footnotes

Body text with a footnote [^1]
[^1]: Footnote text here

Abbreviations

MDD <- will have title
*[MDD]: MarkdownDeep

Oren Eini

Oren Eini

CEO of RavenDB