Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:


+972 52-548-6969

, @ Q c

Posts: 6,026 | Comments: 44,842

filter by tags archive

Geo Location & Spatial Searches with RavenDB–Part IV-Searching

time to read 3 min | 451 words

Now we have all of the data loaded in, we need to be able to search on it. In order to do that, we define the following index:


It is a very simple one, mapping the start and end of each range for each location.

The next step is actually doing the search, and this is where we run into some issues. The problem was with the data:


Let us take the first range and translate that to IP addresses in the format that you are probably more used to:

Start: End:

Yep, it is little endian vs. big endian here to bite us once more.

It took me a while to figure it out, I’ll admit. In other words, we have to reverse the IP address before we can search on it properly. Thankfully, that is easily done, and we have the following masterpiece:


The data source that we have only support IPv4, so that is what we allow. We reverse the IP, then do a range search based on this.

Now we can use it like this:

var location = session.GetLocationByIp(IPAddress.Parse(""));

Which tells us that this is a Mountain View, CA, USA address.

More importantly for our purposes, it tells us that this is located at: 37.4192, -122.0574 We will use that shortly to do spatial searches for RavenDB events near you, which I’ll discuss in my next post.

Oh, and just for fun. You might remember that in previous posts I mentioned that MaxMind (the source for this geo location information) had to create its own binary format because relational databases took several second to process each query?

The query above completed in 53 milliseconds on my machine, without any tuning on our part. And that is before we introduce caching into the mix.



Has anybody tried to load it into relational database, add some indexes, and perform such query (apart from MaxMind, just to verify)? I somehow find it hard to belive that it takes several seconds...

Ayende Rahien

AG, I assume that MaxMind did.


@Ayende of course he did, what I meant is that somebody should verify that. Perhaps he is doing something wrong there. By the size of the data, it doesn't seem to me that it should take that long. Anyway that's a thing to try.


Index lookup in a relational database? It takes no more than few milliseconds. I don't have to do any coding to know that... Probably MaxMind guys were talking about database without any indexing that you get by importing the csv file in dumbest possible way. Only then it would take few seconds to do a table scan...


Materialized views are the genius of RavenDB.

Not sure why RavenDB tries to proliferate denormalization on the writing side. I think we should write normalized and read denormalized. That's the best of both worlds.

I think that denormalization on the writing side is generally completely misguided.

Ayende Rahien

Tobi, Very simple reason, it is HARD to track denormalization on reads.


So equivalent sql query run on my machine (nothing special really...) and ms sql (with the database imported but no optimisation at all) takes something between 60 and 170 ms. Tried for a few IPs. That is waaay less than several seconds, and has plenty of places it can be improved (with indexes).



that might be a reason. I just know that materialized views in SQL Server are a joy to use. I write normalized, I read with full performance. Joins become no-ops at runtime. Often it is possible to save on sorting and filtering as well. Complex queries become single range scans on some index.

Ayende Rahien

Tobi, I wrote a bunch of posts about materialized views a few years back, they are great, but in RDMBS they have severe limitations, and for the same reason you can't really do full denormalization during read.


AG, I agree it's only 2 mill rows, Microsoft Sql will do such a search in milliseconds, no doubt about that.

Max Mind is terrible wrong, but that is not up to discuss that :), but maybe they were talking about Excel LOL

Comment preview

Comments have been closed on this topic.


No future posts left, oh my!


  1. Technical observations from my wife (3):
    13 Nov 2015 - Production issues
  2. Production postmortem (13):
    13 Nov 2015 - The case of the “it is slow on that machine (only)”
  3. Speaking (5):
    09 Nov 2015 - Community talk in Kiev, Ukraine–What does it take to be a good developer
  4. Find the bug (5):
    11 Sep 2015 - The concurrent memory buster
  5. Buffer allocation strategies (3):
    09 Sep 2015 - Bad usage patterns
View all series


Main feed Feed Stats
Comments feed   Comments Feed Stats