Ayende @ Rahien

It's a girl

Hiring Questions–The phone book–responding to commentary

Wow, this post got a lot more attention than I thought it would. Most of it was along the same lines, so I’ll answer it here.

Anyone suggesting, SQLite, Excel, Access, Esent, Embedded RavenDB, Munin, Embedded FireBird, MS SQL CE, DB4O or anything like it – that isn’t the purpose of the question. I am not trying to figure out if the candidate knows about embedded databases.

Performance isn’t much of a concern, we expect up to several thousands entries per phonebook, but not beyond that, and speed should be in the human response time range (hundreds of milliseconds).

I rejected JSON / XML file formats because I wanted to make the task harder than just using the built-in Linq API and serializing to a file. 

Out of this question, I want actual code that I can try out, not just some high level design. I estimate that if you know what you are doing, this should take less than half an hour. At high school, I think it took about two hours, and that was in unmanaged land.

Some people questioned what is the purpose of this question, under what scenarios is it valid, etc.

Put simply, it is valid because it tests a wide range of topics in a candidate abilities. I don’t feel that I need to go into world building to setup a scenario for an interview question.

Mike McG put it beautifully:

He spells out the requirements for a basic database engine with indexing. It's a multifaceted problem that can expose a lot about a candidate in their solution. Are concerns separated logically? How is performance addressed against disk I/O? How is code correctness validated (e.g. testing)? To submit a project that just wraps another database is disingenuous. He's looking for people that can solve problems head-on, not just pass the buck.

Exactly. More to the point, it forces a candidate to actually do a fairly complex task that still can be done in a short amount of time. It shows me how they think, whatever they have any idea how computers actually work. If you can’t complete this task, you don’t understand basic file IO. That means that you might be a great front end developer, but I need someone who can do more than that.

Comments

Tim Murphy
10/03/2011 11:52 AM by
Tim Murphy

Is this the first time you have used this interview question?

Did you try to solve it yourself? Dud you get your current staff to solve it? And I think most importantly dud you time the results.

My point is I think all developers suck at estimating how long a task will take. Doing the task yourself, better still your staff, would help you learn how much time to give an interviewee and what information might be missing from the question

Daniel Lidström
10/03/2011 02:50 PM by
Daniel Lidström

I gave an interview question (homework actually) once. It was tailored for a specific person who claimed to be an excellent programmer. The task was made to show if he had good design skills (a few patterns, a class hierarchy). Writing or demonstrating code should really be a requirement during all interviews, in my opinion too.

Matthew Shapiro
10/03/2011 02:54 PM by
Matthew Shapiro

The most fun job interview I've ever had is at my current place, where they gave me a client export and told me to design a system to parse the export file (it wasn't a simple export either).

I didn't have to write out code, but I did have to explain the data structures I would use and the interface I used.

It was refreshing to have a good coding interview instead of the usual barrage of questions.

Karep
10/03/2011 06:13 PM by
Karep

I'd love to see you implement it in half an hour. I'm not saying you can't do it. I just would like to watch how you do it

Rafal
10/03/2011 07:48 PM by
Rafal

Good programmer don't have to be quick at typing. Quite often they can solve the problem without coding at all.

tobi
10/03/2011 07:51 PM by
tobi

This is possible in half an hour without any problems but not at the level of quality that you would like to show off to your future employer.

For example, my solution would have been to use protocol buffers for the binary format. The format needs to have versioning built-in so that new fields can be added. That makes protocol buffers well suited. It is unclear how fast you can pull that off if you have to have automated tests and so on.

Fred
10/03/2011 08:21 PM by
Fred

A programmer that is good at designing something fairly low-level like an API for some database with some arbitrary requirements like "can't use JSON or XML" isn't necessarily going to be a programmer that's good at designing the components of the business logic layer or the user interface. I see little value in testing somebody's ability to design a b-tree index from scratch. I'd rather test a programmer's resourcefulness in implementing something that already exists--like LINQ to XML or whatever.

In real-life, nobody is going to task a single programmer with developing a database from scratch. That's something that's going take very detailed specs that an architect or a team is going decide on before it even gets to a developer.

Sure, this could be a test in one's ability to solve problems, but part of solving problems should be to determine if the problem has already been solved by somebody else. When you have a team of developers, the last thing you want is for everybody to individually reinvent the wheel without considering if somebody already designed a wheel.

Daniel
10/03/2011 08:41 PM by
Daniel

Wow. Writing working code for a small database engine with support for indexing in half an hour. How to decide which aspects to focus on:

  • Code organisation
  • IO performance
  • Versioning
  • Threading
  • File formats

etc. etc. Just look at the number of questions it generated.

That is a really hard assignment in my opinion, not only a fairly complex one. If I was assigned this task during an interview I fear I would fail big time - not knowing what to focus on and what to ignore. Luckily this forum seems to be filled with excellent super programmers that can work for ayende.

Xing Yang
10/03/2011 09:58 PM by
Xing Yang

In half an hour it's probably enough to come up with a working program, or an elegant design, or a solid testing suite, but not all of them I'm afraid.

Rafal
10/04/2011 06:08 AM by
Rafal

Daniel, did you see any requirement to write a database sever? You don't need threading, versioning, file formats, io performance or b-trees. What you need is a flat file with text entries and an index - OrderedDictionary should do. Half an hour is enough to have a working prototype.

Ayende Rahien
10/04/2011 06:13 AM by
Ayende Rahien

Fred, In our space, that is actually something that we do, you do realize that? I would be very surprised if someone would write me a B-Tree in that time frame, too. What I am doing here is check how they deal with a fairly straightforward task, but one that requires you to handle a lot of stuff. API design, several methods, maybe a few classes. It is big enough to look at your style, small enough to be completed quickly.

Ayende Rahien
10/04/2011 06:15 AM by
Ayende Rahien

Daniel, In a limited amount of time, you are going to need to focus on something fairly narrow to get the specified requirements done. If you are doing threading here, you failed. If you are doing something that requires you to think about IO performance, you failed.

File formats/Versioning isn't a concerned for something that you write in half an hour to an hour. No one expects to have everything done in that time frame with those.

Code organization is something that I really care about here, though

Daniel
10/04/2011 11:23 AM by
Daniel

I still think the question is too wide and unspecific. Obviously, you have to narrow it down somehow in order to solve it in such a short time frame. My first reaction, along with a couple of others, was to use an embedded database such as SQLite. In my opinion this is a good solution since it fullfills the specification and it allows us to keep things very simple. However, when clarified, this was not the purpose of the question. I hope you notice that something went wrong here already, indicating the diffuseness of the question.

When rethinking the problem in this new light I thought about just deserializing objects from disk at startup and then serializing the new state during application shutdown - doing all the other work using an in-memory data structure. However, i felt that this might not either be the purpose of the question. Especially, when reading the quote from Mike McG:

"He spells out the requirements for a basic database engine with indexing. It's a multifaceted problem that can expose a lot about a candidate in their solution. Are concerns separated logically? How is performance addressed against disk I/O? How is code correctness validated (e.g. testing)? To submit a project that just wraps another database is disingenuous. He's looking for people that can solve problems head-on, not just pass the buck."

Here the words database engine and indexing catches my eye.

The quotation also, clearly states, that disk I/O performance is interesting even though it is contradicted by the statement (information that was not a part of the original question):

"Performance isn’t much of a concern, we expect up to several thousands entries per phonebook, but not beyond that, and speed should be in the human response time range (hundreds of milliseconds)."

Another thing that makes me uncertain how to narrow it down is the claim that it should be "a reusable library". Interesting to say the least.

It is indeed a hard task to invent and express good questions; sometimes harder than actually finding out the answer.

Jay R. Wren
10/05/2011 03:18 PM by
Jay R. Wren

I would have NEVER guessed that you were going for file IO and indexing as things for the interview candidate to bring up in this.

You need to set more constraints on this problem to get to there.

No XML or JSON is cute, but I'd still read the whole file into an object graph and use LINQ. Probably Binary Serializer given the no XML/JSON. And for 10k entries it would be fast on iphone and androids too.

Suggestion for an added constraint: You are on a memory constrained device and you want the in memory use of phone book entries to never been more than 500bytes.

Karim
10/10/2011 12:15 PM by
Karim

Ayende, The more I read your recruitment-related posts and comments, the more I think that you still have A LOT (!!!) yet to learn on how to hire great software engineers. You're simply doing it wrong.

  1. You're asking your candidate to come up with his own storage format, and you're not concerned with the performance? What's the point in that? You're not even expressing any requirements towards that format: does it has to be human-readable, fast, standards-based (obviously not) or what? It makes little sense if any.

  2. You're asking somebody to come up with his own storage format, store the data in FILES and then state "If you are doing something that requires you to think about IO performance, you failed.". Well, if this is how you design databases... :) YET! you completely agree with Mike McG who mentioned IO performance/caching/indexes.

  3. If the only think that you care about is core organization, then indeed you're looking for front-end developers or architecture astronauts (http://www.joelonsoftware.com/articles/fog0000000018.html). As there is nothing special about writing a system that will hold all it's data in-memory or a system that would take O(n) to search through file's content (if you don't want to keep in memory and want a working code in 30 minutes) - even high-school students can do that - and I'm not surprised that you did that back then :)

Comments have been closed on this topic.