You know too much

time to read 2 min | 255 words

imageJust asked a candidate to do a task that requires an O(1) retrieval semantics.  Off the top of his head, he answered that he would us a Trie.  That is an exciting new development in phone interviews, since this data structure doesn’t come up all too often.

I asked him to explain why he didn’t chose a dictionary as the underlying data structure. His answer was… illuminating.

He explained (in detail, and correctly) how hash maps work, that they rely on probability to ensure their O(1) guarantees and that he didn’t feel that he could select such a data structure for O(1) performance before analyzing the distribution of the data and figuring out if there are too many expected collisions in the data set.

All pretty much text book correct. In fact, if one is writing their own hash map (don’t do that) that is a concern you need to take into account. But in the real world, this is usually a solved problem. There are certain malicious inputs that can get you into trouble and there are certain things you need to take into account if you are writing your own hash function. But for the vast majority of cases and scenarios, you don’t need to worry about this at all.

The candidate is a student in the last stages of his degree and it shows. He gave a wonderful answer, if this was a test. It was just so wonderfully wrong.