The world’s smallest indexing code
I’m writing a chapter about indexing in RavenDB and I wanted to have the reader grasp the notion of indexing more easily. I came up with the following code that should explain what is going on:
I think that this does a good job of explaining how an index is actually used, at least to the point where even if you don’t understand indexing, you can sort of grasp what is going on behind the scenes.
Thoughts?
Comments
For me it displays the concept very well. Returning func took some time to resolve the WTF, but then it became clear? What about the usage example that demonstrates the dramatic difference in search time and advocates the Func usage? Also as the usage example will be like
the name like "IndexUsers" or "GetIndexOnUsers" will work better IMHO.
EQR, I'm actually talking about exactly these differences in the book, to explain why the index is so fast, even if it does more, but only if you use it multiple times.
An index usually allows you perform range searches and get results sorted, not only exact match searches. This is why I would prefer something like this
Jesus, So change the
Dictionary
toSortedDictionary
, same thing.@Ayende SortedDictionary does not allow range search, only exact match.
Jesus, Sorry,
SortedList
expose an interface that also allows easy binary search there.I understand the index creation, but I must say I am not sure how the return lambda statement generates a <string, List<User>> and not List<User>, though the nested return must be doing that, somehow.
peter, The lambda is actually a method that accepts a
string
and returns aList<User>
delegate TResult Func<TArg1, TResult>(TArg1 arg)
I was wondering if local function that uses
yield return
would be simpler (since lambda can't do that). Something like:But I think your version is actually better, especially since everyone knows how creating a
List
works, while not everyone will know the details ofyield return
.Svick, An inner method is a cool way to do that. And I'm not sure how many people understand inner lambdas. I like this better.
@Ayende. SortedList doesn't allow range search either. SortedDictionary and SortedList are sorted collections, but neither take advantage of its sorted nature to perform range search.
Jesus, You get directly access to the indexer in
SortedList
, which allow you to do that.@Ayende. No.
SortedList.Keys.IndexOf
returns -1 if the key is not present, so, How do you searchkey BETWEEN from AND to
iffrom
is not in the keys?Jesus, I meant that you can do that directly using the indexer.
@Ayende, but that would require a full scan, however the code I suggested doesn't.
Jesús, since the SortedList allows indexing directly into the list, you could write your own binary search for the appropriate starting key index instead of scanning the entire list.
HI, I have a question.
Lambda function returned use: users List<User> by reference as I saw. 1. does lambda code ensures it will not be GC collected after derefferenced outside?
Thx
@Alex, I have already wrote my own binary search over a sorted array. Did you see my code ? I don't need SortedList at all.
@Ayende, @Alex Davison.
SortedSet
would be a good fit. I allows you to perform ordered scan, exact match and range search, no need to implement your own binary search. Please see this code.Yuriy, Yes, the lambda ensure that the dictionary & list will not be collected.
This might be just a little silly: ;)
Alternatively if you want to avoid the reliance on the order of operation from the index++ interacting with the lookup code.
Obviously the point is to then understand how ToLookup works, which I'm guessing was your real point?
Comment preview