Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.

Searching ain’t simple: solution


In my last post, I described the following problem:


And stated that the following trivial solution is the wrong approach to the problem:

select d.* from Designs d 
 join ArchitectsDesigns da on d.Id = da.DesignId
 join Architects a on da.ArchitectId = a.Id
where a.Name = @name

The most obvious reason is actually that we are thinking too linearly. I intentionally showed the problem statement in terms of UI, not in terms of a document specifying what should be done.

The reason for that is that in many cases, a spec document makes assumptions that the developer should not. When working on a system, I like to have drafts of the screens with rough ideas about what is supposed to happen, and not much more.

In this case, let us consider the problem from the point of view of the user. Searching by the architect's name makes sense to the user; that is usually how they think about it.

But does it make sense from the point of view of the system? We want to provide a good user experience, which means that we aren’t just going to provide the user with a text box to plug in some values. For one thing, they would have to put in the architect's full name exactly as it is stored in our system. That is going to be a tough call in many cases. Ask any architect what the first name of Gaudi is, and see what sort of response you’ll get.

Another problem is how to deal with misspellings, partial names, and other information. What if we actually have the architect id, and are used to typing that? I would much rather type 1831 than Mies Van Der Rohe, and most users who work with the application day in and day out would agree.

From the system perspective, we want to divide the problem into two separate issues: finding the architect and finding the appropriate designs. From a user experience perspective, that means that the text box is going to be an ajax suggest box, and the results will be loaded based on a valid id.

Using RavenDB and ASP.NET MVC, we would have the following solution. First, we need to define the search index:
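To make the idea concrete without leaning on RavenDB specifics, here is a minimal sketch in plain Python of what such an index does conceptually. It is not the RavenDB index definition; `ARCHITECTS`, `build_index`, and `search` are hypothetical names, and the sample ids and names are made up for illustration. Each architect is indexed under the text of its id and under every word of its name, so either kind of input finds the same document.

```python
from collections import defaultdict

# Hypothetical sample data; the ids and names are illustrative only.
ARCHITECTS = {
    1831: "Ludwig Mies van der Rohe",
    1852: "Antoni Gaudi",
    1867: "Frank Lloyd Wright",
}


def build_index(architects):
    """Map each lowercased name word, and the id as text, to architect ids."""
    index = defaultdict(set)
    for architect_id, name in architects.items():
        index[str(architect_id)].add(architect_id)
        for token in name.lower().split():
            index[token].add(architect_id)
    return index


def search(index, query):
    """Return the ids of architects that match every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    matches = [index.get(term, set()) for term in terms]
    return set.intersection(*matches)


INDEX = build_index(ARCHITECTS)
```

Here `search(INDEX, "1831")` and `search(INDEX, "mies rohe")` both resolve to the same architect, which is the property the suggest box relies on.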


This gives us the ability to search across both name and id easily, and it allows us to do full text searches as well. The next step is the actual querying for architect by name:


Looks complex, doesn’t it? Well, there is certainly a lot of code there, at least.

First, we look for a matching result in the index. If we find anything, we send just the name and the id of the matching documents to the user. That part is perfectly simple.

The interesting bits happen when we can’t find anything at all. In that case, we ask RavenDB to find us results that might be what the user is looking for. It does that by running a string distance algorithm over the data already in the database and providing us with a list of suggestions about what the user might have meant.

We take it one step further. If there is just one suggestion, we assume that this is what the user meant, and just return the results for that value. If there is more than one, we send an empty result set to the client along with a list of alternatives that it can suggest to the user.
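The flow described above can be sketched in plain Python. This is an illustration under stated assumptions, not the controller code from the post: `difflib.get_close_matches` from the standard library stands in for RavenDB's suggestion feature, and the function names, the 0.7 cutoff, and the sample data are all hypothetical.

```python
import difflib

# Hypothetical sample data; ids and names are illustrative only.
ARCHITECTS = {
    1831: "Mies van der Rohe",
    1852: "Antoni Gaudi",
    1889: "Hannes Meyer",
    1934: "Richard Meier",
}


def find_matches(term):
    """Return (id, name) pairs where the term equals the id or a word of the name."""
    term = term.lower()
    return [
        (architect_id, name)
        for architect_id, name in sorted(ARCHITECTS.items())
        if term == str(architect_id) or term in name.lower().split()
    ]


def suggest(term):
    """Stand-in for database suggestions: indexed words close to the term."""
    words = sorted({w for name in ARCHITECTS.values() for w in name.lower().split()})
    return difflib.get_close_matches(term.lower(), words, n=5, cutoff=0.7)


def architects_by_name(term):
    """Try matches first, then retry a lone suggestion, else offer alternatives."""
    results = find_matches(term)
    if results:
        return {"results": results, "suggestions": []}
    suggestions = suggest(term)
    if len(suggestions) == 1:
        # Exactly one candidate: assume it is what the user meant.
        return {"results": find_matches(suggestions[0]), "suggestions": []}
    # Zero or several candidates: no results, let the UI offer alternatives.
    return {"results": [], "suggestions": suggestions}
```

With this data, a misspelling such as "gaudy" has only one close candidate, so the results for that candidate come back directly. A term like "meer" is close to both "meier" and "meyer", so it returns no results plus the two alternatives.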

From here, the actual task of getting the designs for this architect becomes as simple as:
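As a plain Python sketch (not the RavenDB query from the original post), and assuming each design document stores the ids of its architects, that lookup might look like the following. `DESIGNS`, its field names, and the ids are hypothetical.

```python
# Hypothetical documents: each design stores the ids of its architects.
DESIGNS = [
    {"id": "designs/1", "name": "Barcelona Pavilion", "architect_ids": [1831]},
    {"id": "designs/2", "name": "Seagram Building", "architect_ids": [1831, 1906]},
    {"id": "designs/3", "name": "Casa Batllo", "architect_ids": [1852]},
]


def designs_by_architect(architect_id):
    """Return the designs whose architect list contains the given id."""
    return [d for d in DESIGNS if architect_id in d["architect_ids"]]
```

Because the suggest box has already resolved the user's input to a valid id, this step needs no fuzzy matching at all.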


And it turns out that when you think about it right, searching is simple.



When you say "It does that by running a string distance algorithm", do you mean you already have this capability implemented in RavenDB? Is it something like this one (what we ended up using in-house):


His implementation discussed here: http://www.codegrunt.co.uk/2010/11/02/C-Sharp-Norvig-Spelling-Corrector.html

Ayende Rahien

Peter, That is done inside RavenDB, yes.

Jason Meckley

"And it turns out that when you think about it right, searching is simple." A large portion of this is because RavenDB does all the heavy lifting for us. All we need to do is map the results from RavenDB to our view model. Zero Friction FTW!


I think there is a typo in the second code sample. I'm not sure what "Results = new NameAndId[0]" is.

Please forgive me if I am being dense, but why is it necessary to specify both the query type and the result type in Session.Query<ArchitectsSearch.Result, ArchitectsSearch>? It seems that ArchitectsSearch.Result could be inferred since ArchitectsSearch is declared to be AbstractIndexCreationTask<Architect, ArchitectsSearch.Result>. Is it possible for Session.Query to be called as Session.Query<SomethingThatIsNot_ArchitectsSearch.Result, ArchitectsSearch>?

Ayende Rahien

Chris, Results = new NameAndId[0] -- Create a new empty array of NameAndId

The reason that we have two generic params is that it is NOT possible for us to infer the first parameter from the second, and yes, there are reasons why you would have different values there.


I see, you are using the Lucene SpellChecker. Can I assume it is using statistics from the indexed RavenDB documents to determine the order of suggested terms?

Ayende Rahien

Peter, Yes, that is part of that.


I'd refactor that code. I don't like the return at the end of ArchitectsByName. If there are results, return immediately. Guard condition.


I don't like the name of the action when it can be searched by id as well as name... nit-picking though.

I've written something similar for my project. This stuff took way too long to do with a relational database; it really is frictionless.


Sorry for the side question but... these screen caps aren't from Visual Studio IDE, are they?


@Paulo, that looks like Sublime Text 2 - http://www.sublimetext.com/2


@Bill, thank you! I'll take a look at it!

Martin Doms

Is that q.Suggest() method provided by RavenDB, or is that an extension method that the application developer would implement? Is it the "string distance algorithm" you mentioned?


I may be missing the whole idea of the post, but what should be the solution if the label was "diagrams for architects"?


Two typo corrections:

  1. Change "descried" to "described"
  2. Change "archiect" to "architect" (in the balsamiq GIF)
