Ayende @ Rahien

It's a girl

Picture of the day, Rhino & Raven

We have an amateur photographer in the office, who likes to arrange the rhinos in the office (now exceeding 100, I think) in various ways.

I really like this picture.

Crunching the numbers

I ran the numbers, and I didn’t really believe them. So I put them in a chart, and I’m still not sure that I trust them.

Here you can see our income per month for the past year. Yes, July was pretty bad, with everyone on vacation and no one buying.

March 2014 was pretty good, with people buying RavenConf tickets and a lot of excitement around that. September and October were great. In fact, October was our best month ever.

And then we get to February 2015. In other words, this month. In other words, this month where we still have 10 more days to make sales.


I’m pretty happy. But I wish that I had hard numbers on why.

Exercise, learning and sharpening skills

The major portion of the RavenDB core team is co-located in our main office in Israel. That means that we get to do things like work together, bounce ideas off one another, etc.

But one of the things that I found stifling in other workplaces is a primary focus on the current work. I mean, the work is important, but one of the best ways to stagnate is to keep doing the same thing over and over again. Sure, you are doing great and can spin that yarn very well, but there is always that power loom that is going to show up any minute. So we try to do some skill sharpening.

There are a bunch of ways we do that. We try to have a lecture every week (given by one of our devs). The topics so far range from graph theory to distributed gossip algorithms to new frameworks and features that we deal with to the structure of the CPU and memory architecture.

We also have mini debuggatons. I’m not sure if the name fits, but basically, we show a particular problem, split into pairs and try to come up with a root cause and a solution. This post is being written while everyone else in the office is busy looking at WinDBG and hunting down a memory leak. In the meantime, I’m posting to the blog and fixing bugs.

Reading Habits

I’m reading a lot, and I thought that I would post a bit about my favorite subjects. I decided to summarize this year with great books that don’t really fall into standard categories, which I really enjoyed.

The AlterWorld – By a Russian author, with a great background there (how to identify a Russian was great), and the books are really good. The premise is that you can get stuck in an MMORPG, and it is beautifully done. Unlike a fantasy book, the notion of levels, gaining strength and power is really nice. Especially since the hero isn’t actually taking the direct path to that. There is also a lot of interaction with the real world, and in general, this is a fully featured universe that is really good. It looks like there are going to be 3 more books, which is absolutely wonderful from my point of view.

AlterWorld, The Clan, The Duty

Those books were good enough that I started playing RPGs again, just because it was so much fun reading the status messages in the books. If you know of other books in the same space, I would love to know about them.

NPCs tells the tale from the point of view of Non Player Characters, which is quite interesting and done in a very believable way.


Caverns & Creatures is a series of books (I lost count, there are a lot of short stories as well as full length books) that deals with the idea of people getting stuck in an RPG world. This one is mostly meant for humor’s sake, I think. And it does get to toilet level humor all too frequently, but it is entertaining.

Critical Failures

Waldo Rabbit tells the tale of a guy that really tries to be an evil overlord, but his idea of a scary beast is a… rabbit. It is really well written, and I’m looking forward to the next book.

The (sort of) Dark Mage (Wa... After The Rabbit (Waldo Rab...

Magic 2.0 talks about finding proof that the entire world is a computer simulation, and what happens when certain people find out about it. My guess is that this is written by a programmer, because the parts where they talk about software and programming weren’t made up out of whole cloth and didn’t piss me off at all. This is also a really good series, and I’m looking forward to reading the 3rd book. I especially liked that there isn’t some big Save The World theme going on, this is just life as you know it, if you are a bunch of pixels.

Off to Be the Wizard (Magic... Spell or High Water (Magic ... An Unwelcome Quest (Magic 2...

Velveteen is a “superhero” novel, but a very different one than the usual one. I’m not really sure how to categorize it, but it was a really great read.

Velveteen vs. The Junior Su... Velveteen vs. The Multivers...

Daniel Black is a single book series so far, with a second book, Black Coven, set to follow Fimbulwinter. It is a great book, with a very well written background and story. What is more, the hero doesn’t rely on brute force or the author to rescue him when he stupidly gets into trouble. He thinks and plans, and that is quite great to read. I’m eagerly waiting for the next book.

Fimbulwinter (Daniel Black #1)

The holidays, plans & what is next

So, we are done with the holidays here. The last month was basically very little work, because a lot of our people were out for the holidays.

Internally, we are gearing up to finish the website for RavenDB 3.0, while another part of the team is focused on stability and performance. We just hired another new guy, and he is going to be working pretty much on distribution from now on. I’ll report more on that in a few weeks.

Looking at the blog, I’ve mostly been talking about RavenDB, and I want to do a small shift and talk about other topics, so I’m declaring the next two weeks to be RavenDB-free weeks. I’m going to continue to blog regularly, of course, but I’m going to be talking about other topics for a change.

Don’t worry, we haven’t stopped working on RavenDB, it is just that it is pretty boring to hear about things like test clusters, or how we work on issues from the people trying out the RC builds.

It is a good day, celebrate it

It is a good day, so I decided to share some joy.

For today only, we offer a 21% discount on all our products. You can get that using coupon code: bzeiglglay

This applies to RavenDB (Standard, Enterprise and ISV), RavenDB Professional & Production Support, NHibernate Profiler and Entity Framework Profiler.

This offer will be valid for 24 hours only.

Support Triage

We got a call to the office.

We have a huge problem with our system, you need to come and help us right away. This is a critical system and we need immediate response.

That is kind of annoying, of course, but it is all part of the service. So, in order to log the appropriate items into our system, we asked:

What is your order id? And what is your support contract number?

And the answer was:

Oh, that was handled by another department, I’m not sure.

So we asked them to figure that out and send it to us, and waited. The call came at noon; at 7 PM, I sent them an email.

The reply I got back was:

We’ll try to find the order details tomorrow.

I guess it isn’t such a huge, critical and immediate problem any longer…

The downsides of going big

One of the things that I found out is that as Hibernating Rhinos grows (and we currently have over a dozen people working full time), I’m seeing two very interesting changes in my own behavior.

The actual velocity is increasing by leaps & bounds. We can do a lot more now, and we can do that faster and with a greater degree of parallelism.

My personal development work is shrinking, as I am doing a lot more business type things. One aspect of that is that I do a lot of reading of contracts, legalese, and other stuff that takes time from actual development work.

I try to compensate by running the tests while I’m reading contracts, and then I found the following in a contract I’m reviewing:


Nice to know where they stand. And I empathize.

FAIL can impress, too

I mentioned that a quick way to make me think that a candidate is a bad idea is to send us a UI project. This is usually a very strong indication that the candidate doesn’t really have any idea what they are doing. They have been doing WinForms projects, so they write the code for the task at hand in the button1_Click event handler. Or inside the Page_Load code in an ASP.NET WebForms application if they are “web developers”.

On the other hand, here is a strong counter example. We had a candidate send in a WinForms project, which, as I said, is usually a bad sign. But then I actually looked at his code:


And here is a single method:


This code is several levels of too complex for the task. It can be simplified to a great degree quite easily.

But the key point from this, and the reason that this candidate has an interview later this week, is that this demonstrates a bunch of things:

  • Understanding of separation of concerns.
  • Code that actually does what it is supposed to do.
  • Proper integration between UI & backend code (for example, we are working with large files, so we have progress bars and off-the-UI-thread work).
  • The UI doesn’t look like it was put together by a hiccupping monkey.

I can work with this. There are things that need to be improved for what we do, but there appears to be a SOLID foundation here.

Fail, fail, fail

Sometimes, reading a candidate’s answer is just something that I know is going to piss me off.

We have a question that goes something like this (the actual question is much more detailed):

We have a 15TB CSV file that contains a web log; the entries are sorted by date (since this is how they were entered). Find all the log entries within a given date range. You may not read more than 32 MB.

A candidate replied with an answer that had the following code:

    string line = string.Empty;
    StreamReader file;

    try
    {
        file = new StreamReader(filename);
    }
    catch (FileNotFoundException ex)
    {
        Console.WriteLine("The file is not found.");
        Console.ReadLine();
        return;
    }

    while ((line = file.ReadLine()) != null)
    {
        var values = line.Split(',');
        DateTime date = Convert.ToDateTime(values[0]);
        if (date.Date >= startDate && date.Date <= endDate)
            output.Add(line);

        // Results size in MB
        double size = (GetObjectSize(output) / 1024f) / 1024f;
        if (size >= 32)
        {
            Console.WriteLine("Results size exceeded 32MB, the search will stop.");
            break;
        }
    }

My reply was:

The data file is 15TB in size, if the data is beyond the first 32MB, it won't be found.

The candidate then fixed his code. It now includes:

    var lines = File.ReadLines(filename);

Yep, this is on a 15TB file.

Now I’m going to have to lie down for a bit, I am not feeling so good.
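
Since the entries are sorted by date, one way to stay inside the memory limit is a binary search over byte offsets, followed by a forward scan of just the matching range. The following is a minimal sketch of that idea, not the expected answer to the actual (more detailed) question. The names (`LogRangeSearch`, `Find`) are made up for illustration, and it assumes the date is the first column in a format that `DateTime.Parse` understands, `\n` line endings, no byte order mark and no blank lines.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text;

// Hypothetical helper, for illustration only.
public static class LogRangeSearch
{
    // Offset of the first line that starts at or after 'pos' (file length if none).
    static long NextLineStart(Stream s, long pos)
    {
        if (pos == 0) return 0;
        s.Seek(pos - 1, SeekOrigin.Begin);
        int b;
        while ((b = s.ReadByte()) != -1)
            if (b == '\n') return s.Position;
        return s.Length;
    }

    // Parses the first column of the line starting at 'lineStart'.
    static DateTime ReadDateAt(Stream s, long lineStart)
    {
        s.Seek(lineStart, SeekOrigin.Begin);
        var sb = new StringBuilder();
        int b;
        while ((b = s.ReadByte()) != -1 && b != ',' && b != '\n')
            sb.Append((char)b); // assumes the date column is plain ASCII
        return DateTime.Parse(sb.ToString(), CultureInfo.InvariantCulture);
    }

    public static IEnumerable<string> Find(string path, DateTime start, DateTime end)
    {
        using (var s = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            // Binary search for the smallest offset whose next line is >= start.
            long lo = 0, hi = s.Length;
            while (lo < hi)
            {
                long mid = lo + (hi - lo) / 2;
                long line = NextLineStart(s, mid);
                if (line >= s.Length || ReadDateAt(s, line) >= start)
                    hi = mid;
                else
                    lo = mid + 1;
            }

            // Stream forward from the first match until we pass the end date.
            s.Seek(NextLineStart(s, lo), SeekOrigin.Begin);
            using (var reader = new StreamReader(s))
            {
                string l;
                while ((l = reader.ReadLine()) != null)
                {
                    var date = DateTime.Parse(l.Split(',')[0], CultureInfo.InvariantCulture);
                    if (date > end) yield break;
                    yield return l;
                }
            }
        }
    }
}
```

The search phase only probes a few dozen lines even on a 15TB file (about log2 of the file size), and the scan streams the results instead of collecting them in memory.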

Quick FAILs in code questions

Sometimes it takes very little time to know that a candidate is going to be pretty horrible. As you can probably guess, the sort of questions we ask tend to be “find me this data in this sort of file”.

Probably the fastest indication is when they send me projects like this:



Now, it is possible that someone skilled will send us real projects like that, but the experience so far has been that this isn’t going to be the case. If you have someone sending a UI project, it usually indicates that they can’t think about it in any other way.

The code they send pretty much justifies this concern. Some code snippets from those projects:


Yup, this is the kind of error handling I want to see. Just for fun, if there hasn’t been an error, this function would return a comma separated string of values.

Which makes it just slightly worse than:


And then we have this:


I guess someone really likes O(N**2) on 15 TB files.

And then there is this:


I guess we have different definitions of what configurable means.

And then there was this person:


Yes, they did send me code inside a PDF file. That was the only way that they could find to send code around, I’m guessing.

The cost of working with strings

Following my last post, I decided that it might be better to actually show what the difference is between direct string manipulation and working at lower levels.

I generated a sample CSV file with 10 million lines and 6 columns. The file size was 658MB. I then wrote the simplest code that I could possibly think of:

    using System.Collections.Generic;
    using System.IO;

    public class TrivialCsvParser
    {
        private readonly string _path;

        public TrivialCsvParser(string path)
        {
            _path = path;
        }

        // Reads the file line by line and splits each line on commas.
        public IEnumerable<string[]> Parse()
        {
            using (var reader = new StreamReader(_path))
            {
                while (true)
                {
                    var line = reader.ReadLine();
                    if (line == null)
                        break;
                    var fields = line.Split(',');
                    yield return fields;
                }
            }
        }
    }

This ran in 8.65 seconds (with a no-op action) and kept the memory utilization at about 7MB.

The next thing to try was just reading through the file without doing any parsing. So I wrote this:

    using System.Collections.Generic;
    using System.IO;

    public class NoopParser
    {
        private readonly string _path;

        public NoopParser(string path)
        {
            _path = path;
        }

        // Reads the file in 1KB chunks and discards the data.
        public IEnumerable<object> Parse()
        {
            var buffer = new byte[1024];
            using (var stream = new FileStream(_path, FileMode.Open, FileAccess.Read))
            {
                while (true)
                {
                    var result = stream.Read(buffer, 0, buffer.Length);
                    if (result == 0)
                        break;
                    yield return null; // noop
                }
            }
        }
    }

Note that this isn’t actually doing anything. But this took 0.83 seconds, so we see a pretty big difference here. By the way, the amount of memory used isn’t noticeably different here. Both use about 7 MB. Probably because we aren’t actually holding on to any of the data in any meaningful way.

I ran the tests using a release build, and I ran them multiple times, so the file is probably all in the OS cache. So I/O cost is pretty minimal here. However, note that we aren’t doing a lot of stuff that is being done by the TrivialCsvParser. For example, doing line searches, splitting the string to fields, etc. But interestingly enough, just removing the split will reduce the cost from 8.65 seconds to 3.55 seconds.
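
To illustrate what working at a lower level might look like, here is a rough sketch that counts fields by scanning the raw bytes for commas and newlines, without ever creating a string per line or per field. It is not the parser behind the measurements above; `ByteLevelCsvScanner` and `CountFields` are hypothetical names, and like `Split(',')` it ignores quoted fields.

```csharp
using System.IO;

// Hypothetical example: scan bytes directly instead of materializing strings.
public static class ByteLevelCsvScanner
{
    public static long CountFields(Stream stream)
    {
        var buffer = new byte[64 * 1024];
        long fields = 0;
        int last = -1; // last byte seen, -1 if the stream was empty
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                byte b = buffer[i];
                // Every comma and every newline terminates exactly one field.
                if (b == (byte)',' || b == (byte)'\n')
                    fields++;
                last = b;
            }
        }
        // A final line without a trailing newline still ends a field.
        if (last != -1 && last != '\n')
            fields++;
        return fields;
    }
}
```

A real parser would hand out offsets or slices into the buffer rather than a count, but the point is the same: no per-line or per-field allocations.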

I’m twenty tomorrow, let us celebrate

Well, tomorrow I’ll be 0x20. Leaving aside the fact that I am just entering my twenties (finally, it feels like I was a 0xTeenager for over a decade), there is the tradition to uphold.

Therefore, we have a 32% discount until the end of the year*.

You can use coupon code: 0x20-twentysomething

This applies to:

* Limited to the first 0x32, and not applicable if you have to ask why you get a 32% discount.

Evil interview questions: Unique & Random C

Writing in C (and using only the C std lib as building blocks, which explicitly excludes C++ and all its stuff), generate 1 million unique random numbers.

For reference, here is the code in C#:

    var random = new Random();
    var set = new HashSet<int>();

    var sp = Stopwatch.StartNew();

    while (set.Count < 1000 * 1000)
    {
        set.Add(random.Next(0, int.MaxValue));
    }

    Console.WriteLine(sp.ElapsedMilliseconds);

It is a brute force approach, I’ll admit, but it completes in about 150ms on my end. The solution must run in under 10 seconds.

This question just looks stupid; it can actually tell you quite a bit about the developer.
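
Without a hash set at hand, one approach that needs nothing more than an array, a sort and a linear pass is sketched below, still in C# so it can be compared with the reference code. It is only one possible answer under my own assumptions, not the expected one; `UniqueRandom` and `Generate` are hypothetical names. In C, the same steps would map to a `malloc`'d array, `rand` and `qsort`.

```csharp
using System;

// Hypothetical example: unique random numbers using only an array and a sort.
public static class UniqueRandom
{
    public static int[] Generate(int count, Random random)
    {
        var values = new int[count];
        int filled = 0; // number of distinct values at the front of the array
        while (filled < count)
        {
            // Top up the unused tail with fresh candidates.
            for (int i = filled; i < count; i++)
                values[i] = random.Next(0, int.MaxValue);

            // Sort, then compact in place so only distinct values remain.
            Array.Sort(values);
            filled = 1;
            for (int i = 1; i < count; i++)
                if (values[i] != values[filled - 1])
                    values[filled++] = values[i];
        }

        // Sorting destroyed the random order; a Fisher-Yates shuffle restores it.
        for (int i = count - 1; i > 0; i--)
        {
            int j = random.Next(i + 1);
            int tmp = values[i];
            values[i] = values[j];
            values[j] = tmp;
        }
        return values;
    }
}
```

With a million values drawn from a range of about two billion, only a few hundred duplicates are expected in the first pass, so the loop usually finishes after a second round.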

JSON Packing, Text Based Formats and other stuff that come to mind at 5 AM

This post was written at 5:30AM. I ran into this while doing research for another post, and I couldn’t really let it go.

XML as a text-based format is really wasteful in space. But that wasn’t what really made it lose its shine. That was when it became so complex that it stopped being human readable. For example, I give you:

    <?xml version="1.0" encoding="UTF-8" ?>
     <SOAP-ENV:Envelope
      xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"
      xmlns:xsd="http://www.w3.org/1999/XMLSchema"
      xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
       <SOAP-ENV:Body>
           <ns1:getEmployeeDetailsResponse
            xmlns:ns1="urn:MySoapServices"
            SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
               <return xsi:type="ns1:EmployeeContactDetail">
                   <employeeName xsi:type="xsd:string">Bill Posters</employeeName>
                   <phoneNumber xsi:type="xsd:string">+1-212-7370194</phoneNumber>
                   <tempPhoneNumber
                    xmlns:ns2="http://schemas.xmlsoap.org/soap/encoding/"
                    xsi:type="ns2:Array"
                    ns2:arrayType="ns1:TemporaryPhoneNumber[3]">
                       <item xsi:type="ns1:TemporaryPhoneNumber">
                           <startDate xsi:type="xsd:int">37060</startDate>
                           <endDate xsi:type="xsd:int">37064</endDate>
                           <phoneNumber xsi:type="xsd:string">+1-515-2887505</phoneNumber>
                       </item>
                       <item xsi:type="ns1:TemporaryPhoneNumber">
                           <startDate xsi:type="xsd:int">37074</startDate>
                           <endDate xsi:type="xsd:int">37078</endDate>
                           <phoneNumber xsi:type="xsd:string">+1-516-2890033</phoneNumber>
                       </item>
                       <item xsi:type="ns1:TemporaryPhoneNumber">
                           <startDate xsi:type="xsd:int">37088</startDate>
                           <endDate xsi:type="xsd:int">37092</endDate>
                           <phoneNumber xsi:type="xsd:string">+1-212-7376609</phoneNumber>
                       </item>
                   </tempPhoneNumber>
               </return>
           </ns1:getEmployeeDetailsResponse>
       </SOAP-ENV:Body>
     </SOAP-ENV:Envelope>

After XML was thrown out of the company of respectable folks, we had JSON show up and entertain us. It is smaller and more concise than XML, and so far has resisted the efforts to make it into some sort of an uber-complex enterprisey tool.

But today I ran into quite a few efforts to do strange things to JSON. I am talking about things like JSON DB (a compressed JSON format, not an actual JSON database), JSONH, json.hpack, and friends. All of those attempt to reduce the size of JSON documents.

Let us take an example. The following is a JSON document representing one of the RavenDB builds:

    {
      "BuildName": "RavenDB Unstable v2.5",
      "IsUnstable": true,
      "Version": "2509-Unstable",
      "PublishedAt": "2013-02-26T12:06:12.0000000",
      "DownloadsIds": [],
      "Changes": [
        {
          "Commiter": {
            "Email": "david@davidwalker.org",
            "Name": "David Walker"
          },
          "Version": "17c661cb158d5e3c528fe2c02a3346305f0234a3",
          "Href": "/app/rest/changes/id:21039",
          "TeamCityId": 21039,
          "Username": "david walker",
          "Comment": "Do not save Has-Api-Key header to metadata\n",
          "Date": "2013-02-20T23:22:43.0000000",
          "Files": [
            "Raven.Abstractions/Extensions/MetadataExtensions.cs"
          ]
        },
        {
          "Commiter": {
            "Email": "david@davidwalker.org",
            "Name": "David Walker"
          },
          "Version": "5ffb4d61ad9102696948f6678bbecac88e1dc039",
          "Href": "/app/rest/changes/id:21040",
          "TeamCityId": 21040,
          "Username": "david walker",
          "Comment": "Do not save IIS Application Request Routing headers to metadata\n",
          "Date": "2013-02-20T23:23:59.0000000",
          "Files": [
            "Raven.Abstractions/Extensions/MetadataExtensions.cs"
          ]
        },
        {
          "Commiter": {
            "Email": "ayende@ayende.com",
            "Name": "Ayende Rahien"
          },
          "Version": "5919521286735f50f963824a12bf121cd1df4367",
          "Href": "/app/rest/changes/id:21035",
          "TeamCityId": 21035,
          "Username": "ayende rahien",
          "Comment": "Better disposal\n",
          "Date": "2013-02-26T10:16:45.0000000",
          "Files": [
            "Raven.Client.WinRT/MissingFromWinRT/ThreadSleep.cs"
          ]
        },
        {
          "Commiter": {
            "Email": "ayende@ayende.com",
            "Name": "Ayende Rahien"
          },
          "Version": "c93264e2a94e2aa326e7308ab3909aa4077bc3bb",
          "Href": "/app/rest/changes/id:21036",
          "TeamCityId": 21036,
          "Username": "ayende rahien",
          "Comment": "Will ensure that the value is always positive or zero (never negative).\nWhen using numeric calc, will div by 1,024 to get more concentration into buckets.\n",
          "Date": "2013-02-26T10:17:23.0000000",
          "Files": [
            "Raven.Database/Indexing/IndexingUtil.cs"
          ]
        },
        {
          "Commiter": {
            "Email": "ayende@ayende.com",
            "Name": "Ayende Rahien"
          },
          "Version": "7bf51345d39c3993fed5a82eacad6e74b9201601",
          "Href": "/app/rest/changes/id:21037",
          "TeamCityId": 21037,
          "Username": "ayende rahien",
          "Comment": "Fixing a bug where we wouldn't decrement reduce stats for an index when multiple values from the same bucket are removed\n",
          "Date": "2013-02-26T10:53:01.0000000",
          "Files": [
            "Raven.Database/Indexing/MapReduceIndex.cs",
            "Raven.Database/Storage/Esent/StorageActions/MappedResults.cs",
            "Raven.Database/Storage/IMappedResultsStorageAction.cs",
            "Raven.Database/Storage/Managed/MappedResultsStorageAction.cs",
            "Raven.Tests/Issues/RavenDB_784.cs",
            "Raven.Tests/Storage/MappedResults.cs",
            "Raven.Tests/Views/ViewStorage.cs"
          ]
        },
        {
          "Commiter": {
            "Email": "ayende@ayende.com",
            "Name": "Ayende Rahien"
          },
          "Version": "ff2c5b43eba2a8a2206152658b5e76706e12945c",
          "Href": "/app/rest/changes/id:21038",
          "TeamCityId": 21038,
          "Username": "ayende rahien",
          "Comment": "No need for so many repeats\n",
          "Date": "2013-02-26T11:27:49.0000000",
          "Files": [
            "Raven.Tests/Bugs/MultiOutputReduce.cs"
          ]
        },
        {
          "Commiter": {
            "Email": "ayende@ayende.com",
            "Name": "Ayende Rahien"
          },
          "Version": "0620c74e51839972554fab3fa9898d7633cfea6e",
          "Href": "/app/rest/changes/id:21041",
          "TeamCityId": 21041,
          "Username": "ayende rahien",
          "Comment": "Merge branch 'master' of https://github.com/cloudbirdnet/ravendb into 2.1\n",
          "Date": "2013-02-26T11:41:39.0000000",
          "Files": [
            "Raven.Abstractions/Extensions/MetadataExtensions.cs"
          ]
        }
      ],
      "ResolvedIssues": [],
      "Contributors": [
        {
          "FullName": "Ayende Rahien",
          "Email": "ayende@ayende.com",
          "EmailHash": "730a9f9186e14b8da5a4e453aca2adfe"
        },
        {
          "FullName": "David Walker",
          "Email": "david@davidwalker.org",
          "EmailHash": "4e5293ab04bc1a4fdd62bd06e2f32871"
        }
      ],
      "BuildTypeId": "bt8",
      "Href": "/app/rest/builds/id:588",
      "ProjectName": "RavenDB",
      "TeamCityId": 588,
      "ProjectId": "project3",
      "Number": 2509
    }

This document is 4.52KB in size. Running this through JSONH gives us the following:

    [
        14,
        "BuildName",
        "IsUnstable",
        "Version",
        "PublishedAt",
        "DownloadsIds",
        "Changes",
        "ResolvedIssues",
        "Contributors",
        "BuildTypeId",
        "Href",
        "ProjectName",
        "TeamCityId",
        "ProjectId",
        "Number",
        "RavenDB Unstable v2.5",
        true,
        "2509-Unstable",
        "2013-02-26T12:06:12.0000000",
        [
        ],
        [
            {
                "Commiter": {
                    "Email": "david@davidwalker.org",
                    "Name": "David Walker"
                },
                "Version": "17c661cb158d5e3c528fe2c02a3346305f0234a3",
                "Href": "/app/rest/changes/id:21039",
                "TeamCityId": 21039,
                "Username": "david walker",
                "Comment": "Do not save Has-Api-Key header to metadata\n",
                "Date": "2013-02-20T23:22:43.0000000",
                "Files": [
                    "Raven.Abstractions/Extensions/MetadataExtensions.cs"
                ]
            },
            {
                "Commiter": {
                    "Email": "david@davidwalker.org",
                    "Name": "David Walker"
                },
                "Version": "5ffb4d61ad9102696948f6678bbecac88e1dc039",
                "Href": "/app/rest/changes/id:21040",
                "TeamCityId": 21040,
                "Username": "david walker",
                "Comment": "Do not save IIS Application Request Routing headers to metadata\n",
                "Date": "2013-02-20T23:23:59.0000000",
                "Files": [
                    "Raven.Abstractions/Extensions/MetadataExtensions.cs"
                ]
            },
            {
                "Commiter": {
                    "Email": "ayende@ayende.com",
                    "Name": "Ayende Rahien"
                },
                "Version": "5919521286735f50f963824a12bf121cd1df4367",
                "Href": "/app/rest/changes/id:21035",
                "TeamCityId": 21035,
                "Username": "ayende rahien",
                "Comment": "Better disposal\n",
                "Date": "2013-02-26T10:16:45.0000000",
                "Files": [
                    "Raven.Client.WinRT/MissingFromWinRT/ThreadSleep.cs"
                ]
            },
            {
                "Commiter": {
                    "Email": "ayende@ayende.com",
                    "Name": "Ayende Rahien"
                },
                "Version": "c93264e2a94e2aa326e7308ab3909aa4077bc3bb",
                "Href": "/app/rest/changes/id:21036",
                "TeamCityId": "...bug where we wouldn't decrement reduce stats for an index when multiple values from the same bucket are removed\n",
                "Date": "2013-02-26T10:53:01.0000000",
                "Files": [
                    "Raven.Database/Indexing/MapReduceIndex.cs",
                    "Raven.Database/Storage/Esent/StorageActions/MappedResults.cs",
                    "Raven.Database/Storage/IMappedResultsStorageAction.cs",
                    "Raven.Database/Storage/Managed/MappedResultsStorageAction.cs",
                    "Raven.Tests/Issues/RavenDB_784.cs",
                    "Raven.Tests/Storage/MappedResults.cs",
                    "Raven.Tests/Views/ViewStorage.cs"
                ]
            },
            {
                "Commiter": {
                    "Email": "ayende@ayende.com",
                    "Name": "Ayende Rahien"
                },
                "Version": "ff2c5b43eba2a8a2206152658b5e76706e12945c",
                "Href": "/app/rest/changes/id:21038",
                "TeamCityId": 21038,
                "Username": "ayende rahien",
                "Comment": "No need for so many repeats\n",
                "Date": "2013-02-26T11:27:49.0000000",
                "Files": [
                    "Raven.Tests/Bugs/MultiOutputReduce.cs"
                ]
            },
            {
                "Commiter": {
                    "Email": "ayende@ayende.com",
                    "Name": "Ayende Rahien"
                },
                "Version": "0620c74e51839972554fab3fa9898d7633cfea6e",
                "Href": "/app/rest/changes/id:21041",
                "TeamCityId": 21041,
                "Username": "ayende rahien",
                "Comment": "Merge branch 'master' of https://github.com/cloudbirdnet/ravendb into 2.1\n",
                "Date": "2013-02-26T11:41:39.0000000",
                "Files": [
                    "Raven.Abstractions/Extensions/MetadataExtensions.cs"
                ]
            }
        ],
        [
        ],
        [
            {
                "FullName": "Ayende Rahien",
                "Email": "ayende@ayende.com",
                "EmailHash": "730a9f9186e14b8da5a4e453aca2adfe"
            },
            {
                "FullName": "David Walker",
                "Email": "david@davidwalker.org",
                "EmailHash": "4e5293ab04bc1a4fdd62bd06e2f32871"
            }
        ],
        "bt8",
        "/app/rest/builds/id:588",
        "RavenDB",
        588,
        "project3",
        2509
    ]

It reduced the document size to 2.93KB! Awesome, nearly half of the size was gone. Except: This is actually generating an utterly unreadable mess. I mean, can you look at this and figure out what the hell is going on?

I thought not. At this point, we might as well use a binary format. I happen to have a zip tool at my disposal, so I checked what would happen if I threw this through that. The end result was a file that was 1.42KB. And I had no more loss of readability than I have with the JSONH stuff.

To be frank, I just don’t get efforts like this. JSON is a text-based, human-readable format. If you lose the human readable portion of the format, you might as well drop directly to binary. It is likely to be more efficient and you don’t lose anything by it.

And if you want to compress your data, it is probably better to use something like a compression tool. HTTP Compression, for example, is practically free, since all servers and clients should be able to consume it now. And any tool that you use should be able to inspect through it. And it is likely to generate much better results on your JSON documents than if you try a clever format like this.
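
As a rough illustration of the compression route, the sketch below gzips a JSON string with `GZipStream` from the .NET base class library and reads it back unchanged. `JsonCompression`, `Gzip` and `Gunzip` are names made up for this example, and the savings you actually get depend on the document.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper: plain gzip over the JSON text, no custom format needed.
public static class JsonCompression
{
    public static byte[] Gzip(string json)
    {
        var raw = Encoding.UTF8.GetBytes(json);
        using (var output = new MemoryStream())
        {
            // Disposing the GZipStream flushes the final compressed block.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
                gzip.Write(raw, 0, raw.Length);
            return output.ToArray(); // ToArray still works on a closed MemoryStream
        }
    }

    public static string Gunzip(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
            return reader.ReadToEnd();
    }
}
```

The document that comes out the other side is the original JSON, so nothing about its readability changes for whoever (or whatever) consumes it.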

Stories from the interview room, part II

So, I just finished interviewing a candidate. His CV states that he has been working professionally for about 6 years or so. The initial interview went pretty well, and the candidate was able to talk well about his past experience. I tend to do a generic “who are you?” section, then give them a couple of questions to solve in front of Visual Studio, an architecture question and then a set of technical questions that test how much the candidate knows.

Mostly, I am looking to get an impression about the candidate, since that is all I usually have a chance to do in the span of the interview. The following is a section from the code exercise that this candidate has completed:

for (int i = 0; i < sortedArrLst.Count; i++) {
    if (sortedArrLst[i].Contains(escapeSrt[0])) {
        if (sortedArrLst[i].IndexOf(escapeSrt[0]) == 0) {
            sortedArrLst[i] = sortedArrLst[i].Remove(0, escapeSrt[0].Length+1);
            escapeStrDic.Add(sortedArrLst[i], escapeSrt[0]);
        }
    }
    if (sortedArrLst[i].Contains(escapeSrt[1])) {
        if (sortedArrLst[i].IndexOf(escapeSrt[1]) == 0) {
            sortedArrLst[i] = sortedArrLst[i].Remove(0, escapeSrt[1].Length+1);
            escapeStrDic.Add(sortedArrLst[i], escapeSrt[1]);
        }
    }
}

Thank you, failure to use loops will get you disqualified from working with us.
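For contrast, here is a rough sketch of the same logic with an inner loop over the prefixes. The variable names come from the candidate's snippet; the collection types and the sample data are my assumptions, since the full exercise is not shown here.

```csharp
using System;
using System.Collections.Generic;

// Assumed types and made-up sample data; only the names come from the snippet.
var sortedArrLst = new List<string> { "The Beatles", "A Perfect Circle", "Tool" };
string[] escapeSrt = { "The", "A" };
var escapeStrDic = new Dictionary<string, string>();

for (int i = 0; i < sortedArrLst.Count; i++)
{
    // Loop over the prefixes instead of copy-pasting one block per prefix.
    foreach (string prefix in escapeSrt)
    {
        // StartsWith replaces the Contains + IndexOf == 0 pair.
        // Ordinal comparison is a choice made for this sketch.
        if (sortedArrLst[i].StartsWith(prefix, StringComparison.Ordinal))
        {
            // Length + 1 also drops the character after the prefix, as in the
            // original (assumed here to be a separating space).
            sortedArrLst[i] = sortedArrLst[i].Remove(0, prefix.Length + 1);
            escapeStrDic.Add(sortedArrLst[i], prefix);
        }
    }
}

Console.WriteLine(string.Join(", ", sortedArrLst));
```

Adding a third prefix now means adding one entry to `escapeSrt`, not another copy of the block. Like the original, this still throws if an item is exactly equal to a prefix or if two items collapse to the same key.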

Then there were the gems such as “mutex is a kind of state machine” and “binary search trees are about recursion” or the “I’ll use perfmon to solve a high CPU usage problem in production”.

Then again, the next candidate after that was quite good. Only 4 – 6 to go now.

Stories from the interview room

It is that time again, we are looking for more developers. And this time I ended up so pissed after an interview I had to call a sick colleague just to vent.

One candidate I ruled out early during the interview process. I had a sinking sensation in the pit of my stomach as I spoke with the candidate, and I couldn’t get a single technical description of what the candidate is actually doing now. A lot of broad descriptions, and a lot of sweeping statements, but no real technical details. But the candidate did know jQuery Mobile back and forth, it appears.

The decision was made final when I asked the candidate what web framework they were using. I asked whether they were using ASP.NET WebForms, ASP.NET MVC or ASP.NET Web API. Note that from my perspective, that list is in ascending order of worth, and if you are using something not on it, that is a plus (it means you aren’t just using whatever is available, which is nice). So I was quite excited when the candidate said (confusedly) “none of them”. Then it took me putting on my investigative hat and asking a lot of questions about how they are actually doing things before it finally came out that they were doing ASPX.

Not knowing the name of the environment in which you have been working for the past several years… I am not sure what to call it.

At least the candidate was able to let me know how they were using that in great detail: “We do stuff in Page_Init, then we have a method that loads the data from the database and puts it in the ViewState or the Session, then we bind it to a grid, and most of the code is in the grid event handlers.” I am sure that the candidate is a great web developer, but I would rather that this particular candidate be great at another location.

The second candidate actually passed our phone screen and was invited to an interview. We have a fairly basic interview process. Some background information for both sides, then a few questions that you need to solve in Visual Studio (and yes, you have full MSDN & Google access) and then a technical portion of the interview that includes a system design question and a more detailed set of technical knowledge questions.

Now, I am sure that you have heard about interesting interview questions like sorting a 100 GB file on a 32-bit machine with 512MB of memory. I admit that something like that would be challenging and interesting. I would probably quite enjoy seeing how people deal with that.

That is not the type of question that we ask. I asked for a “sort these strings” program and a “calculate this tax” program. I gave the candidate about an hour and a half, alone in a room with VS and an internet connection. I am not even asking you to implement your own sort, just to customize the comparison function and run the standard .NET sort. And do some basic math. The candidate was unable to finish either problem in the time allotted. Now, to be fair to the candidate, the way he solved the first problem was correct and much better than many other attempts that I have seen. Note that what caused the issue was using IComparable<string> instead of IComparable<Item>. It is subtle and something that I would expect a newbie to catch. But this candidate came with a full 6 years of experience.
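To give a sense of the scale of what is being asked, here is a minimal sketch of “customize the comparison function and run the standard .NET sort”. The input list and the ordering rule are invented for illustration; they are not the actual interview exercise.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical input; the real exercise data is not given in this post.
var names = new List<string> { "pear", "Fig", "banana", "kiwi" };

// Hypothetical rule: shorter strings first, ties broken case-insensitively.
Comparison<string> byLengthThenText = (x, y) =>
{
    int byLength = x.Length.CompareTo(y.Length);
    return byLength != 0
        ? byLength
        : string.Compare(x, y, StringComparison.OrdinalIgnoreCase);
};

// The standard List<T>.Sort does the sorting; only the comparison is custom.
names.Sort(byLengthThenText);
Console.WriteLine(string.Join(", ", names));
```

The whole thing is one comparison delegate and one call to `List<T>.Sort`, which is why running out of time on it is telling.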

Okay, I said to myself, it is natural to be nervous when doing interviews, let us see what the candidate knows beyond that. The CV mentioned work with multi-threading. So I began with some questions about that. But it appears that “I only know Thread and BackgroundWorker”. But the absolute clincher was when I asked the candidate about what would cause a high CPU problem and how to diagnose that: “Well, I think that there are special tools that will tell you which process is using the CPU…”

Special tools? Well, I guess Task Manager can be called special, but I am not really sure that I would call it that. And if your experience in troubleshooting stuff never got to the point where you actually look at Task Manager to see what is going on… it probably means that you don’t have meaningful experience in actually troubleshooting stuff.

On the death of Google Reader, blogging & content

On 30 June, I had just about 30K subscribers to this blog. With the death of Google Reader, I dropped down to less than 10% of that.

This sucks, but it also means that this is a much smaller audience, which means that it is easier to interact with. In particular, I would like to know what sort of blog posts you, as a reader, like.

  • Features, like “see how I can do this cool thing in RavenDB”?
  • Reviews for applications, like “cringe at how horrible the code is”?
  • Challenges, like “can you figure out what is wrong with this code”?
  • Mystery codebase, like “let us read a codebase in a language I don’t know and try to figure it out”?
  • Architecture, like “let us see how we should resolve this problem”?

Or, you know, something else. I would appreciate your feedback.

Toys for geeks

I just got myself a UFO Mini Helicopter, it looks like this:

Mini Helicopter UFO Aircraft With Remote Control

This is the first helicopter that I got, and for a $30 toy, it is an awesome amount of fun. The only complaint that I have is that it has only about 5 minutes of battery life.

I am really bad at flying it, too.

As mentioned, this is the very first helicopter that I bought, and I think that I would like to have a better one for the next time. Any recommendations from you guys?

  • I would like a better battery life. 30 minutes – 1 hour would be what I want.
  • Should be pretty resistant to crashes. I know that I am going to crash it a lot.

Any recommendations?