Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,546
|
Comments: 51,161
Privacy Policy · Terms
filter by tags archive
time to read 3 min | 536 words

There are times when you write clean, easily to understand code, and there are times when you see 50% of your performance goes into DateTime parsing, at which point you’ll need to throw nice code out the window, put on some protective gear and seek out that performance hit that you need so much.

Note to the readers: This isn’t something that I recommend you’ll do unless you have considered it carefully, you have gathered evidence in the form of actual profiler results that show that this is justified, and you covered it with good enough tests. The only reason that I was actually able to do anything is that I know so much about the situation. The dates are strictly formatted, the values are stored as UTF8 and there are no cultures to consider.

With that said, it means that we are back to C’s style number parsing and processing:

Note that the code is pretty strange, we do upfront validation into the string, then parse all those numbers, then plug this in together.

The tests we run are:

Note that I’ve actually realized that I’ve been forcing the standard CLR parsing to go through conversion from byte array to string on each call. This is actually what we need to do in RavenDB to support this scenario, but I decided to test it out without the allocations as well.

All the timings here are in nanoseconds.

image

Note that the StdDev for those test is around 70 ns. And this usually takes about 2,400 ns to run.

Without allocations, things are better, but not by much. StdDev goes does to 50 ns, and the performance is around 2,340 ns, so there is a small gain from not doing allocations.

Here are the final results of the three methods:

image

Note that my method is about as fast as the StdDev on the alternative. With an average of 90 ns or so, and StdDev of 4 ns. Surprisingly, LegacyJit on X64 was the faster of them all, coming in at almost 60% of the LegacyJit on X86, and 20% faster than RyuJit on X64. Not sure why, and dumping the assembly at this point is quibbling, honestly. Our perf cost just went down from 2400 ns to 90 ns. In other words, we are now going to be able to do the same work at 3.66% of the cost. Figuring out how to push it further down to 2.95% seems to insult the 96% perf that we gained.

And anyway, that does leave us with some spare performance on the table if this ever become a hotspot again*.

* Actually, the guys on the performance teams are going to read this post, and I’m sure they wouldn’t be able to resist improving it further Smile.

time to read 2 min | 344 words

This is a small part from a larger benchmark that we run:

 

The index in question is using a DateTime field, and as you can see, quite a lot of time is spent in translating that. 50% of our time, in fact. That is… not so nice.

The question now is why we do it? Well, let us look at the code:

image

Here we can see several things, first, there is the small issue with us allocating the string to check if it is a date, but that isn’t where the money is. That is located in the TryParseExact.

This method is actually quite impressive. Given a pattern, it parses the pattern, then it parse the provided string. And if we weren’t calling it hundreds of thousands of times, I’m sure that it wouldn’t be an issue.  But we are, so we are left with writing our own routine to do this in a hard coded manner.

I built the following benchmark to test this out:

image

As you can see, this is pretty much identical to our code, and should tell us how good we are. Here are the benchmark results:

Method

Platform

Jit

Toolchain

Runtime

Median

StdDev

ParseDateTime

X64

RyuJit

Host

Host

2,458.2915 ns

102.7071 ns

ParseDateTime

X86

Host

Clr

Clr

2,506.7353 ns

142.7946 ns

ParseDateTime

X86

LegacyJit

Host

Host

2,443.4806 ns

51.4903 ns

In my next post, I’ll show what I came up with that can beat this.

FUTURE POSTS

  1. Partial writes, IO_Uring and safety - about one day from now
  2. Configuration values & Escape hatches - 5 days from now
  3. What happens when a sparse file allocation fails? - 7 days from now
  4. NTFS has an emergency stash of disk space - 9 days from now
  5. Challenge: Giving file system developer ulcer - 12 days from now

And 4 more posts are pending...

There are posts all the way to Feb 17, 2025

RECENT SERIES

  1. Challenge (77):
    20 Jan 2025 - What does this code do?
  2. Answer (13):
    22 Jan 2025 - What does this code do?
  3. Production post-mortem (2):
    17 Jan 2025 - Inspecting ourselves to death
  4. Performance discovery (2):
    10 Jan 2025 - IOPS vs. IOPS
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats
}