RavenDB indexing: exact()
When you search for some text in RavenDB, you’ll get a case-insensitive search by default. This means that when you run this query:
You’ll get users with any capitalization of “Oren”. You can ask RavenDB to do a case-sensitive search, like so:
In this case, you’ll find only exact matches, including casing. So far, that isn’t really surprising, right?
Under what conditions will you need to do searches like that? Well, it is usually when the data itself is case sensitive. User names on Unix are a good example of that, but you may also have Base64 data (where case matters), product keys, etc.
What is interesting is that case sensitivity is usually a property of the field itself.
Now, how does RavenDB handle this scenario? One option would be to index the data as is and compare it using a case-insensitive comparator. That usually ends up being quite expensive. It’s cheaper by far to normalize the text once and compare it using ordinals.
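To make the trade-off concrete, here is a minimal Python sketch of the two options. It is not RavenDB code: the names `normalize`, `scan_case_insensitive` and `lookup` are made up for illustration, and a plain dictionary stands in for the index.

```python
# Illustrative only: a toy model of the two strategies, not RavenDB internals.

names = ["Oren", "OREN", "oren", "Ayende"]


def scan_case_insensitive(terms, query):
    # Option 1: keep terms as-is and fold case on every comparison.
    # The folding cost is paid for each term, on each query.
    return [t for t in terms if t.casefold() == query.casefold()]


def normalize(term):
    # Option 2 helper: lowercase once (hypothetical stand-in for an analyzer).
    return term.lower()


# Option 2: normalize at indexing time, then use plain ordinal (exact) lookup.
index = {}
for doc_id, name in enumerate(names):
    index.setdefault(normalize(name), []).append(doc_id)


def lookup(query):
    # Only the query value is normalized; the stored terms are compared as-is.
    return index.get(normalize(query), [])
```

With the second option, the per-term work happens once during indexing, and a query is a single exact-match lookup on the normalized value.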
The exact() method tells us how the field is supposed to be treated. This is done at indexing time. If we want to be able to query in both a case-sensitive and a case-insensitive manner, we need to have two fields. Here is what this looks like:
We indexed the name field twice, marking it as case sensitive for the second index field.
Here is what actually happens behind the scenes because of this configuration:
The analyzer used determines the terms that are generated per index field. The first index field (Name) uses the default analyzer, LowerCaseKeywordAnalyzer, while the second index field (ExactName) uses the default exact analyzer, KeywordAnalyzer.
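As a rough illustration of how the two analyzers lead to different terms, here is a small Python simulation. It is a sketch under assumptions, not RavenDB's implementation: the functions `lowercase_keyword_analyzer` and `keyword_analyzer` only mimic the behavior described above (lowercase the whole value vs. keep it untouched), and the document ids and helper names are hypothetical.

```python
from collections import defaultdict


def lowercase_keyword_analyzer(value):
    # Mimics the described default: one term, normalized to lower case.
    return [value.lower()]


def keyword_analyzer(value):
    # Mimics the described exact behavior: one term, kept as-is.
    return [value]


# Both index fields are fed from the document's Name property.
ANALYZERS = {
    "Name": lowercase_keyword_analyzer,
    "ExactName": keyword_analyzer,
}


def build_index(docs):
    index = {field: defaultdict(set) for field in ANALYZERS}
    for doc_id, doc in docs.items():
        for field, analyze in ANALYZERS.items():
            for term in analyze(doc["Name"]):
                index[field][term].add(doc_id)
    return index


def query(index, field, value):
    # The query value goes through the same analyzer as the field it targets.
    matches = set()
    for term in ANALYZERS[field](value):
        matches |= index[field].get(term, set())
    return matches


docs = {"users/1": {"Name": "Oren"}, "users/2": {"Name": "oren"}}
index = build_index(docs)
```

In this toy model the Name field holds the single term `oren` for both documents, while ExactName holds two distinct terms, `Oren` and `oren`. So `query(index, "Name", "OREN")` matches both users, but `query(index, "ExactName", "oren")` matches only `users/2`.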
More posts in "RavenDB indexing" series:
- (20 Oct 2022) exact()
- (19 Apr 2013) An off the cuff stat
Comments
I'm probably not understanding it. But can't I now find the same record on the exact index with both Oren and oren?
Because that is not behaviour I would expect from exact searching.
Or does the term Oren have different metadata or something than the term oren in the exact index? So it can distinguish on the metadata that it shouldn't use the term oren when searching for exact(ExactName='oren')?
CHristiaanS,
With exact() you have terms that are case sensitive. Without it, we normalize to lower case. That's the idea.