RavenDB 3.5 Features: Data Exploration

time to read 3 min | 447 words

RavenDB is doing a pretty great job for being a production database, in fact, we have designed it upfront to only have features that make sense to have for robust production systems.

In particular, we don’t have any form of ad-hoc queries. A query always hits an index, so it is very fast. Even what we call dynamic queries in RavenDB are actually creating an index behind the scene. This is pretty awesome for normal production usage, but it does have some limitations when you want to explore the data. This can be because you are a developer trying to find a particular something, and you just want to quickly fire off random queries. You don’t care about the costs, and you don’t want to generate indexes. Or you can be an admin that needs to get a particular report from the system and you want to play around with the details until you get everything right.

In order to serve those needs, RavenDB 3.5 is going to have a really nice feature, explicit data exploration.

For example, let us say that I want to count the number of unique words in all of my posts, I can do it using the following:

image

Note that the actual query is pretty meaningless, and I’m writing this at 1AM with a baby nearby that make funny noises, so the Linq statement there works, but can probably be better.

The point here is that to demo what is going on. We write a simple Linq statement, and can run it against our database, and then gather the results back. It is like having LinqPad directly inside the RavenDB studio. In fact, that is the number one scenario that we envision for this feature, replacing LinqPad usage by having a native capability.

Now, some caveats. As you can see, you can select to limit the query duration as well as the number of documents it will operate on. That give us a quick way to explore the data without putting too much load on the server. You can even take the output here and throw it directly to Excel. “Sam, can you give the a breakdown of orders this year by month and country? Just email me the Excel spreadsheet”.

Note that this is intended as a user feature, it isn’t something that we provide an API for. It is there for admins or developers that are figuring things out, an admin feature, not something that you want to use on production.