Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969

, @ Q c

Posts: 6,125 | Comments: 45,488

filter by tags archive

Interview challenges: Decompress that

time to read 1 min | 137 words

I’m always looking for additional challenges that I can ask people who interview at Hibernating Rhinos, and I run into an interesting challenge just now.

Implement a Decompress(Stream input, Stream output, byte[][] dictionary) routine for the following protocol:

Prefix byte – composed of the highest bit + 7 bits len

If the highest bit is not set, this is a uncompressed data, copy the next len from the input to the output.

If the highest bit is set, this is compressed data, the next byte will be the index in the dictionary, copy len bytes from that dictionary entry to the output.

After writing the code, the next question is going to be, what are the implication of this protocol? What is it going to be good for? What is it going to be bad for?


Comments

Jake

Interesting. Do you ask candidates to solve this whilst you are watching them, or do you allow them the freedom to solve the problems in their own comfortable environment with their preferred tools?

Jahmai Lay

@Jake I imagine it's something like this: http://y2u.be/rUY8HysBzsE

Oğuzhan Eren

http://en.wikipedia.org/wiki/Dictionary_coder ?

Pop Catalin

I think this kind of compression whould be good for Json and XML documents, and in a typical Json or XML document the members names are higly repeated and not too many (so 127 dictionary entries limit would be pretty good, also max size of 255 for an entry (256 if you modify the alghoritm to copy len+1 bytes) is prety good for this use case.

I think it gets a good tradeof between speed (very fast) and size (good for certain documents types)

Ayende Rahien

Jake, No, I usually let them either do that on a separate room with a development machine or they do it at home

Stefan

Resembles my WBXML work i did for telco industry >8 years ago ;)

Comment preview

Comments have been closed on this topic.

FUTURE POSTS

  1. The design of RavenDB 4.0: Physically segregating collections - 14 hours from now
  2. RavenDB 3.5 Whirlwind tour: I need to be free to explore my data - about one day from now
  3. RavenDB 3.5 whirl wind tour: I'll have the 3+1 goodies to go, please - 5 days from now
  4. The design of RavenDB 4.0: Voron has a one track mind - 6 days from now
  5. RavenDB 3.5 whirl wind tour: Digging deep into the internals - 7 days from now

And 12 more posts are pending...

There are posts all the way to May 30, 2016

RECENT SERIES

  1. The design of RavenDB 4.0 (14):
    03 May 2016 - Making Lucene reliable
  2. RavenDB 3.5 whirl wind tour (14):
    04 May 2016 - I’ll find who is taking my I/O bandwidth and they SHALL pay
  3. Tasks for the new comer (2):
    15 Apr 2016 - Quartz.NET with RavenDB
  4. Code through the looking glass (5):
    18 Mar 2016 - And a linear search to rule them
  5. Find the bug (8):
    29 Feb 2016 - When you can't rely on your own identity
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats