Making code faster: Pulling out the profiler


After doing all I can without reaching for the profiler, and managing to get a 45x performance gain, let us see what the profiler actually tells us. We’ll use the single-threaded version, since that is easier.

Here it is:


We can see that dictionary operations take a lot of time, which is to be expected. What is very surprising is that the DateTime calls are extremely expensive in this case.

The relevant code for those is here. You can see that it is pretty nice, but there are a bunch of things there that are likely costing us. The exception inside the method prevents inlining, and there is error handling that we don’t need, since we can safely assume in this exercise that the data is valid.

So I changed ParseTime to do this directly, like so:

And that saved us 11%, just from this tiny change.
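To give a rough idea of what parsing "directly" can mean, here is a minimal sketch. It assumes a fixed-width `yyyy-MM-ddTHH:mm:ss` timestamp starting at a known offset in the line. The format, the `TimeParsing` class, the `TwoDigits` helper and the method signature are all assumptions made for illustration, and this is not necessarily the exact code used in this series.

```csharp
using System;

public static class TimeParsing
{
    // Hypothetical helper: reads a fixed-width "yyyy-MM-ddTHH:mm:ss" timestamp
    // that starts at index `pos`. There is no validation of the characters;
    // non-digit input yields meaningless numbers, and out-of-range values
    // make the DateTime constructor throw.
    public static DateTime ParseTime(string s, int pos)
    {
        int year = (s[pos] - '0') * 1000
                 + (s[pos + 1] - '0') * 100
                 + (s[pos + 2] - '0') * 10
                 + (s[pos + 3] - '0');
        int month = TwoDigits(s, pos + 5);   // skip '-'
        int day = TwoDigits(s, pos + 8);     // skip '-'
        int hour = TwoDigits(s, pos + 11);   // skip 'T'
        int minute = TwoDigits(s, pos + 14); // skip ':'
        int second = TwoDigits(s, pos + 17); // skip ':'
        return new DateTime(year, month, day, hour, minute, second);
    }

    // Converts two ASCII digits at `pos` into an int.
    private static int TwoDigits(string s, int pos)
    {
        return (s[pos] - '0') * 10 + (s[pos + 1] - '0');
    }
}
```

The method is small, has no try/catch and no explicit throw, which should make it a better inlining candidate than a general-purpose parser. The trade-off is that it only makes sense when the input is known to be well formed.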

Here are our current costs:


Note that we reduced the cost of parsing significantly (at the cost of error handling, though), but there is still a lot of work being done here. It turns out that we were actually measuring the time to write to the summary file as well (that is what all those FormatHelpers calls are), so that dirties the results somewhat, but never mind.

The next place that we need to look at is the Dictionary. It is expensive: even though the usage of FastRecord means that we only need a single call per line, that isn’t so much fun. Note that it is using the GenericEqualityComparer. Can we do better?

Trying to create my own equality comparer for longs doesn't really help.
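For reference, a hand-written comparer for long keys might look roughly like the sketch below. The `LongEqualityComparer` name is hypothetical, and the hash simply folds the upper 32 bits into the lower ones, which is an assumption about what a reasonable implementation would do rather than the code that was actually tried here.

```csharp
using System.Collections.Generic;

// Hypothetical comparer, for illustration only.
public sealed class LongEqualityComparer : IEqualityComparer<long>
{
    public static readonly LongEqualityComparer Instance = new LongEqualityComparer();

    public bool Equals(long x, long y)
    {
        return x == y;
    }

    public int GetHashCode(long value)
    {
        // Fold the upper 32 bits into the lower 32 bits.
        return unchecked((int)value ^ (int)(value >> 32));
    }
}
```

It would be passed to the dictionary constructor, for example `new Dictionary<long, long>(LongEqualityComparer.Instance)`. A likely reason it makes little difference is that the default comparer for long already does essentially the same comparison and hashing, and the dictionary still calls the comparer through the interface either way.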


So we’ll go back to the parallel version with the ParseTime optimization, and we are now running at 628 ms. At this rate, I don’t think that there is a lot more room for improvement, so unless someone suggests something, we are done.

More posts in "Making code faster" series:

  1. (24 Nov 2016) Micro optimizations and parallel work
  2. (23 Nov 2016) Specialization make it faster still
  3. (22 Nov 2016) That pesky dictionary
  4. (21 Nov 2016) Streamlining the output
  5. (18 Nov 2016) Pulling out the profiler
  6. (17 Nov 2016) I like my performance unsafely
  7. (16 Nov 2016) Going down the I/O chute
  8. (15 Nov 2016) Starting from scratch
  9. (14 Nov 2016) The obvious costs
  10. (11 Nov 2016) The interview question