Ayende @ Rahien

Oren Eini, aka Ayende Rahien, is the CEO of Hibernating Rhinos LTD, which develops RavenDB, a NoSQL Open Source Document Database.

time to read 4 min | 797 words

In my last post, I talked about how to store and query time series data in RavenDB. You can query over the time series data directly, as shown here:
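Something along these lines (a sketch in RQL; the document id and the date range are made up):

    from Nodes
    where id() = 'nodes/us-east-1'
    select timeseries(
        from Storage
        between '2020-01-01' and '2020-02-01'
    )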

You’ll note that we project a query over a time range for a particular document. We could also query over all documents that match a particular query, of course. One thing to note, however, is that time series queries are done on a per time series basis, and each time series belongs to a particular document.

In other words, if I want to ask a question about time series data across documents, I can’t just query for it, I need to do some prep work first. This is done to ensure that when you query, we’ll be able to give you the right results, fast.

As a reminder, we have a bunch of nodes that we record metrics for. The metrics so far are:

  • Storage – [Number of objects, Total size used, Total storage size]
  • Network – [Total bytes in, Total bytes out]

We record these metrics for each node at regular intervals. The query above can give us space utilization over time in a particular node, but there are other questions that we would like to ask. For example, given an upload request, we want to find the node with the most free space. Note that we record the total size used and the total storage available only as time series metrics. So how are we going to be able to query on them? The answer is that we’ll use indexes. In particular, a map/reduce index, like the following:
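Something along these lines (a sketch; the index and field names are my own, and the value positions follow the Storage metrics listed above):

    // map, over timeseries.Nodes.Storage
    from segment in timeseries.Nodes.Storage
    let last = segment.Entries[segment.Entries.Length - 1]
    select new
    {
        Node = segment.DocumentId,
        Timestamp = last.Timestamp,
        FreeSpace = last.Values[2] - last.Values[1] // total storage size - total size used
    }

    // reduce
    from result in results
    group result by result.Node into g
    let latest = g.OrderByDescending(x => x.Timestamp).First()
    select new
    {
        Node = g.Key,
        Timestamp = latest.Timestamp,
        FreeSpace = latest.FreeSpace
    }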

This deserves some explanation, I think. Usually in RavenDB, the source of an index is docs.[Collection], such as docs.Users. Here, however, we are using a time series index, so the source is timeseries.[Collection].[TimeSeries]. In this case, we operate over the Storage time series on the Nodes collection.

When we create an index over a time series, we are exposed to some internal structural details. Each timestamp in a time series isn’t stored independently; that would be incredibly wasteful. Instead, we store time series data together in segments. The details about how and why we do that don’t really matter, but what does matter is that when you create an index over a time series, you’ll be indexing the segment as a whole. You can see how the map accesses the Entries collection on the segment, gets the last entry (the most recent one) and outputs it.

The other thing worth noticing in the map portion of the index is that we operate on the values recorded at each timestamp. In this case, Values[2] is the total amount of storage available and Values[1] is the size used. The reduce portion of the index, on the other hand, is identical to any other map/reduce index in RavenDB.

What this index does, essentially, is tell us the most up-to-date free space that we have for each particular node. As for querying it, let’s see how that works, shall we?

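A query along these lines (the index name and the parameter are made up):

    from index 'Nodes/FreeSpace'
    where FreeSpace >= $sizeNeeded
    order by FreeSpace as double
    limit 1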

Here we are asking for the node with the least disk space that can still contain the data we want to write. This can reduce fragmentation in the system as a whole, by ensuring that we use a best-fit method.

Let’s look at a more complex example of indexing time series data, computing the total network usage for each node on a monthly basis. This is not trivial because we record network utilization on a regular basis, but need to aggregate that over whole months.

Here is the index definition:
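Again, a sketch rather than the original definition (same caveats as before; the value positions follow the Network metrics listed earlier):

    // map, over timeseries.Nodes.Network
    from segment in timeseries.Nodes.Network
    from g in segment.Entries.GroupBy(e => new { e.Timestamp.Year, e.Timestamp.Month })
    select new
    {
        Node = segment.DocumentId,
        Year = g.Key.Year,
        Month = g.Key.Month,
        Total = g.Sum(e => e.Values[0] + e.Values[1]) // bytes in + bytes out
    }

    // reduce
    from result in results
    group result by new { result.Node, result.Year, result.Month } into g
    select new
    {
        Node = g.Key.Node,
        Year = g.Key.Year,
        Month = g.Key.Month,
        Total = g.Sum(x => x.Total)
    }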

As you can see, the very first thing we do is to aggregate the entries based on their year and month. This is done because a single segment may contain data from multiple months. We then sum up the values for each month and compute the total in the reduce.

[Screenshot: the monthly network usage totals returned by querying the index]

The nice thing about this feature is that we are able to aggregate large amounts of data and benefit from the usual advantages of RavenDB’s map/reduce indexes. We have already massaged the data into the right shape, so queries on it are fast.

Time series indexes in RavenDB allow us to merge time series data from multiple documents. For example, I could aggregate the computation above across multiple nodes to get a total per customer, so I’ll know how much to charge them at the end of the month.

I would be happy to hear about any other scenarios you can think of for using time series in RavenDB, and in particular, what kind of queries you’ll want to run on the data.

time to read 4 min | 633 words

RavenDB 5.0 is coming soon, and the big news there is time series support. We have gotten to the point where we can actually show off what we can do, which makes me very happy. You can use the nightly builds to explore time series support in RavenDB 5.0. Client side packages for 5.0 are also available.

[Screenshot: the RavenDB 5.0 nightly builds]

I went ahead and created a new database and created some documents:

[Screenshot: a new database with a few documents in the Studio]

Time series are often used for monitoring, so I decided to go with the flow and see what kind of information we would want to store there. Here is how we can add some time series data to the documents:
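Something along these lines, using the client API (assuming an initialized DocumentStore in store; the document id, series name, values and tag are all made up for the example):

    using (var session = store.OpenSession())
    {
        // record two values (e.g. bytes in / bytes out) for a single timestamp
        session.TimeSeriesFor("nodes/us-east-1", "Network")
            .Append(DateTime.UtcNow, new[] { 2048.0, 1024.0 }, "eth0");
        session.SaveChanges();
    }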

I want to focus on this for a bit, because it is important. A time series entry in RavenDB has the following details:

  • The timestamp to associate with the values – in the code above, this is the current time (UTC).
  • The tag associated with the timestamp – in the code above, we record which devices and interfaces these measurements belong to.
  • The measurements themselves – RavenDB allows you to record multiple values for a single timestamp. We treat them as an array of values, and you can choose to put them in a single time series or to split them.

Let’s assume that we have quite a few measurements like this and that we want to look at the data. You can explore things in the Studio, like so:

[Screenshot: exploring the time series data in the Studio]

There is another tab in the Studio that gives you some high-level details about the time series of a particular document. We can dig deeper, too, and see the actual values:

[Screenshot: the individual time series entries and their values]

You can also query the data to see the patterns and not just the individual values:
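For example (a sketch; the exact group by syntax may differ slightly):

    from Nodes
    select timeseries(
        from Network
        between '2020-01-01' and '2020-01-08'
        group by '1 hour'
        select min(), max(), avg()
    )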

The output will look like this:

[Screenshot: the query results in the Studio]

And you can click on the eye to get more details in chart form. You can see a little bit of this here, but it is hard to do it justice with a small screen shot:

[Screenshot: the time series results in chart form]

Here is what the data you get back from this query looks like:
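Each bucket comes back with the range it covers and the aggregates you asked for, one value per measurement in the array. Roughly like so (a sketch of the shape, with made-up numbers):

    {
        "From": "2020-01-01T00:00:00.0000000Z",
        "To": "2020-01-01T01:00:00.0000000Z",
        "Min": [512.0, 256.0],
        "Max": [4096.0, 2048.0],
        "Avg": [1724.5, 912.3],
        "Count": [60, 60]
    }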

The ability to store and process time series data is very important for monitoring, IoT and healthcare systems. RavenDB is able to do quite well in these areas. For example, to aggregate over 11.7 million heartrate details over 6 years at a weekly resolution takes less than 50 ms.

We have tested time series containing over 150 million entries, and we can aggregate results over the entire data set in under three seconds. That is a nice number, but it doesn’t match what dedicated time series databases can do. It represents a rate of about 65 million rows / second. ScyllaDB recently published a benchmark in which they talk about a billion rows / sec. But they did that on 83 nodes, so they managed just 12 million / sec per node. Less than a fifth of RavenDB’s speed.

But that comparison is unfair, to be honest. While time series queries are really interesting, we don’t really expect users to query very large amounts of data using raw queries. That is what we have indexes for, after all. I’m going to talk about this in depth in my next post.

time to read 1 min | 100 words

On Tuesday, January 21, 2020 10:30 AM Eastern Time, I’ll be doing a webinar showcasing some of the unique features of RavenDB.

We talk a lot about new features and exciting stuff that we work on, but RavenDB has been around for a decade and some of the most impressive stuff that we have are still features that I built around 2009.

I’m going to give a guided tour of some of the features that don’t get much of the limelight but can be real workhorses in your application.

You can register for the webinar here.

time to read 5 min | 829 words

In RavenDB Cloud, we routinely monitor the usage of the RavenDB clusters that our customers run. We noticed something strange in one of them: the system utilization didn’t match the expected load, given the number of requests the cluster was handling. We talked to the customer to try to figure out what was going on, and we had the following findings (all details are masked, naturally, but the gist is the same).

  • The customer stores millions of documents in RavenDB.
  • The document sizes range from 20 KB to 10 MB.

Let’s say that the documents in question were BlogPosts which had a Comments array in them. We analyzed the document data and found the following details:

  • 99.89% of the documents had less than 100 comments in them.
  • 0.11% of the documents (a few thousands) had more than 100 comments in them.
  • 99.98% of the documents had less than 200 comments in them.
  • 0.02% of the documents had more than 200 comments in them!

Let’s look at this 0.02%, shall we?

[Chart: the distribution of comment counts for the largest documents]

The biggest document exceeded 10 MB and had over 24,000 comments in it.

So far, that seems like merely a suboptimal modeling decision, which can lead to some inefficiencies for those 0.02% of the cases. Not ideal, but no big deal. However, the customer also defined the following index:

from post in docs.Posts
from comment in post.Comments
select new { ... }

Take a minute to look at this index. Note the second from clause? This index is doing something that looks innocent: it indexes all the comments in the post, but it does so using a fanout. The index will contain as many index entries as the document has comments. We also need to store all of those comments in the index, in addition to the document itself.

Let’s consider the cost of this index as far as RavenDB is concerned. Here is the cost per indexing run for documents of different sizes, from 0 to 25 comments.

[Chart: indexing cost per run, growing linearly with the number of comments]

This looks like a nice linear cost, right? O(N) cost, as expected. But we only considered the cost of a single operation. Let’s say that we have a blog post that we add 25 comments to, one at a time. What would be the total amount of work we need to do? Here is what this looks like:

[Chart: total indexing work when adding 25 comments one at a time, growing quadratically]

Looks familiar? This is O(N^2) in all its glory.

Let’s look at the actual work done for different document sizes, shall we?

[Chart: total indexing operations by document size, log scale]

I had to use log scale here, because the numbers are so high.

Each new comment forces us to re-index all the comments that came before it, so the total work for N comments is 1 + 2 + … + N = N(N+1)/2 operations. The cost of indexing a document with 200 comments is 20,100 operations. For a thousand comments, the cost is 500,500 operations. It’s over 50 million operations for a document with 10,000 comments.

Given the fact that the popular documents are more likely to change, that means that we have quite a lot of load on the system, disproportionate to the number of actual requests we have, because we have to do so much work during indexing.

So now we know the cause of the higher-than-expected utilization. The question is, what can we do about it? There are two separate issues that we need to deal with here. The first, the actual data modeling, is something that I have talked about before: instead of putting all the comments in a single location, break them up based on size / date / etc. The book also has some advice on the matter. I consider this to be the less urgent issue.

The second problem is the cost of indexing, which is quite high because of the fanout. Here we need to understand why we have a fanout. The user may want to be able to run a query like so:
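For example (the index name and fields here are made up):

    from index 'Posts/Comments'
    where Author = 'users/1-A'
    select Text, CreatedAt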

This will give us the comments of a particular user, projecting them directly from where they are stored in the index. We can change the way we structure the index and then project the relevant results directly at query time. Let’s see the code:
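A sketch of the restructured index and query (names are made up; note that the index only holds the comment authors, not the comments themselves):

    // index: one entry per post, nothing stored
    from post in docs.Posts
    select new
    {
        CommentAuthors = post.Comments.Select(c => c.Author)
    }

    // query: filter using the index, project the comments from the document
    from index 'Posts/ByCommentAuthor' as p
    where p.CommentAuthors = 'users/1-A'
    select {
        Comments: p.Comments.filter(function (c) { return c.Author === 'users/1-A'; })
    }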

As you can see, instead of having a fanout, we’ll have a single index entry per document. We’ll still need to do more work for larger documents, but we reduced it significantly. More importantly, we don’t need to store the data. Instead, at query time, we use the where clause to find documents that match, then project just the comments that we want back to the user.

Simple, effective and won’t cause your database’s workload to increase significantly.

time to read 4 min | 798 words

RavenDB has two separate APIs that allow you to get push notifications from the database. The first one is the Subscriptions API, which allows you to define a query such as:
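Something like this (a sketch; the filter itself is made up):

    var subscriptionName = store.Subscriptions.Create<Order>(
        order => order.Company == "companies/1-A");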

And then subscribe to it like so:
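A minimal sketch of the processing loop:

    var worker = store.Subscriptions.GetSubscriptionWorker<Order>(subscriptionName);
    await worker.Run(batch =>
    {
        foreach (var item in batch.Items)
        {
            // item.Result is the matching Order document
            Console.WriteLine($"Processing {item.Id}");
        }
    });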

RavenDB will now push batches of orders that match your query to the client. This is done in a reliable manner: if the client fails for any reason, it can reconnect and resume from where it left off. If the server fails, the cluster will automatically reassign the work to another node and the client will pick up from where it left off. The subscription is also persistent, meaning that whenever you connect to it, you don’t start from the beginning.

After the subscription has caught up with all the documents that match the query, it isn’t over. Instead, the client will wait for new or updated documents, which the server will push immediately. The typical latency between a document change and the subscription processing it is about twice the ping time between the client and the server (so on the order of milliseconds). Only a single client at a time can have a particular subscription open, but multiple clients can contend for the subscription. One of them will win, and the others will wait for the subscription to become available (when the first client stops, fails, crashes, etc.).

This makes subscriptions highly suitable for business processing. They are reliable, you already have high availability on the server side, and you can easily add it on the client side. You can use complex queries and do quite a bit of work on the database side, before anything ever reaches your code. Subscriptions also allow you to run queries over revisions: instead of getting the current state of the document, you’ll be called with the (previous, current) tuple on any document change. That gives you even more power to work with.

On the other hand, subscriptions require RavenDB to manage quite a bit of (distributed) state, and as such they consume resources at the cluster level.

The Changes API, on the other hand, has a very different model. Let’s look at the code first, and then discuss it in detail:
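A sketch of typical usage (the document id is made up):

    // watch a single document
    var forDocument = store.Changes()
        .ForDocument("orders/1-A")
        .Subscribe(change => Console.WriteLine($"{change.Type} on {change.Id}"));

    // or watch a whole collection
    var forCollection = store.Changes()
        .ForDocumentsInCollection<Order>()
        .Subscribe(change => Console.WriteLine($"{change.Type} on {change.Id}"));

    // later, when we are no longer interested:
    forDocument.Dispose();
    forCollection.Dispose();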

As you can see, we can subscribe to changes on a document or a collection. We actually have quite a few events that we can respond to: a document change (by id, prefix or collection), an index (created / removed, indexing batch completed, etc.), an operation (created / status changed / completed), a counter (created / modified), etc.

Some things can be seen even from just this little bit of code. The Changes API is not persistent. That means that if you restart the client and reconnect, you won’t get anything that already happened. This is intended for ongoing usage, not for critical processing. You also cannot do any complex queries with changes; you have the filters that are available, and that is it. Another important distinction is that with the Subscriptions API, you are getting the document (and can also include additional ones), but with the Changes API, you’re getting the document id only.

The most common scenario for the Changes API is to implement this:

[Screenshot: a notification that the document has been modified by another user]

Whenever a user is editing a particular document, you’ll subscribe to the document, and if it changed behind the scenes, you can notify the user so they won’t continue to edit the document and get an optimistic concurrency error on save.

The Changes API is also used internally by RavenDB to implement a lot of features in the Studio and for tracking long running operations from the client. It is lightweight and requires very few resources from the server (and none from the cluster). On the other hand, it is meant to be a best-effort feature. If the Changes connection fails, the client will transparently reconnect to the server and re-subscribe to all the pending subscriptions. However, any changes that happened while the client was not connected are lost.

The two APIs are very similar on the surface, both of them allow you to get push notifications from RavenDB, but their usage scenarios and features are very different. The Changes API is literally that: it is meant to allow you to watch for changes, probably because you have a human sitting there and looking at things. It is meant to be an additional feature, not a guarantee. The Subscriptions API, on the other hand, is a reliable system that can ensure you won’t miss notifications that matter to you.

You can read more about subscriptions in RavenDB in the book; I dedicated a whole chapter to it.

time to read 4 min | 628 words

I posted about the @refresh feature in RavenDB, explaining why it is useful and how it can work. Now, I want to discuss a possible extension to this feature. It might be easier to show than to explain, so let’s take a look at the following document:
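Something along these lines (the shape and property names are just an illustration of the idea):

    {
        "Amount": 1500,
        "DueDate": "2020-02-01",
        "Status": "Pending",
        "@metadata": {
            "@collection": "LeasePayments",
            "@refresh": [
                { "At": "2020-02-04T00:00:00Z", "Script": "this.Amount += 50;" },
                { "At": "2020-03-01T00:00:00Z", "Script": "this.Status = 'PastDue';" }
            ]
        }
    }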

The idea is that in addition to the data inside the document, we also specify behaviors that will run at specified times. In this case, if the user is three days late in paying the rent, they’ll have a late fee tacked on. If enough time has passed, we’ll mark this payment as past due.

The basic idea is that in addition to just having a @refresh timer, you can also apply actions. And you may want to apply a set of actions, at different times. I think that the lease payment processing is a great example of the kind of use cases we envision for this feature. Note that when a payment is made, the code will need to clear the @refresh array, to avoid it being run on a completed payment.

In other words, you can apply operations to documents at a future time, automatically. This is an easy way to enhance your documents with behaviors and policies. You don’t need to set up your own code to execute them; you can simply let RavenDB handle it for you.

Some technical details:

  • RavenDB will take the time from the first item in the @refresh array. At the specified time, it will execute the script, passing it the document to be modified. The @refresh item that was executed will be removed from the array, and if there are additional items, the next one will be scheduled for execution.
  • Only the first element in the @refresh array is considered. So if the items aren’t sorted by date, the first one will be executed and the document persisted again. The next one (even if its time is earlier than the first one’s) will then already be due, so it will be run on the next tick.
  • Once all the items in the @refresh array have been processed, RavenDB will remove the @refresh metadata property.
  • Modifications to the document because of the execution of @refresh scripts are going to be handled as normal writes. It is just that they are executed by RavenDB directly. In other words, features such as optimistic concurrency, revisions and conflicts are all going to apply normally.
  • If any of the scripts cause an error to be raised, the following will happen:
    • RavenDB will not process any future scripts for this document.
    • The full error information will be saved into the document with the @error property on the failing script.
    • An alert will be raised for the operations team to investigate.
  • The scripts can do anything that a patch script can do. In other words, you can put(), load(), del() documents in here.
  • We’ll also provide a debugger experience for this in the Studio, naturally.
  • Amusingly enough, the script is able to modify the document, which obviously includes the @refresh metadata property. I’m sure you can imagine some interesting possibilities for this.

We also considered another option (look at the Script property):
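In other words, something like this (again, purely illustrative):

    "@refresh": [
        { "At": "2020-02-04T00:00:00Z", "Script": "configuration/scripts/late-fee" }
    ]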

The idea is that instead of specifying the script to run inline, we can reference a property on another document. The advantage is that we can apply changes globally much more easily: we can fix a bug in the script once. The disadvantage is that you may modify a script with new documents in mind, without accounting for the old documents that still reference it. I’m still in two minds about whether we should allow a script reference like this.

This is still an idea, but I would like to solicit your feedback on it, because I think that this can add quite a bit of power to RavenDB.

time to read 3 min | 532 words

Once you put a document inside RavenDB, that is pretty much it, as far as RavenDB is concerned. It will keep your data safe, allow you to query it, etc. But it doesn’t generally act upon it. There are a few exceptions, however.

RavenDB supports the @expires metadata attribute. This attribute allows you to specify a specific time at which RavenDB will automatically delete the document. This is very useful for expiring documents, the classic example being a password reset token, which should be valid for a period of time and then removed.

Here is what this looks like:

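Something along these lines (a sketch of the document metadata):

    {
        "Token": "...",
        "@metadata": {
            "@collection": "PasswordResets",
            "@expires": "2020-01-22T10:00:00.0000000Z"
        }
    }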

And you can configure the frequency at which we’ll check for expired documents in the Studio.

[Screenshot: configuring the frequency of the expired documents check in the Studio]

Expiring documents, however, isn’t all that RavenDB can do. RavenDB also has an additional feature, refreshing documents. You can mark a document to be refreshed by specifying the @refresh metadata attribute, like so:

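For example (a sketch):

    {
        "Command": "send-welcome-email",
        "@metadata": {
            "@collection": "Commands",
            "@refresh": "2020-01-22T10:00:00.0000000Z"
        }
    }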

It is easy to understand what @expires does: at a given time, it will delete the document, because it expired. But what does refresh do? Well, at the specified time, a document with the @refresh metadata attribute will be updated by RavenDB to remove that attribute from the document.

Yep, that is all. In other words, the document above would turn into:

[The same document, with the @refresh attribute removed from the metadata]

That is all. Surely this is the most useless feature ever. You set a property that will be removed at a future time, but the only thing that the property can say is when to remove itself. What kind of feature is this?

Well, this is a case where by itself, this would be a pretty useless feature. But the point of this feature is that this will cause the document to be updated. At that point, it is a normal update, which means that:

  • The document will be re-indexed.
  • The document will be sent over ETL.
  • The document will be sent to the relevant subscriptions.

The last point is the most important one. Here is an example of a typical subscription:
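Something along these lines (a sketch; I’m not being precise about the exact syntax for filtering on metadata here):

    from Commands as c
    where c.'@metadata'.'@refresh' = null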

As you can see, this is a pretty trivial subscription, but it filters out commands that are set to refresh. What does this mean? It means that if the @refresh attribute is set, we’ll ignore the document. But since RavenDB will automatically clear the attribute when the refresh timer is hit, we gain a powerful ability.

We now have the ability to process delayed commands. In other words, you can save a document with a refresh and have it processed by a subscription at a given time.

Expanding on this, you can do the same using ETL. So you have a document that will be sent over to the ETL destination at a given time. You can do the same for indexing as well.

And now this seemingly trivial / useless feature becomes a pivot for a whole new set of capabilities that you get with RavenDB.

time to read 3 min | 542 words

One of our developers recently got a new machine, and we were excited to see what kind of performance we can get out of it. It is an AMD Ryzen 9, 12 cores @ 3.79 GHz, with 32 GB of RAM. The disk used was a Samsung SSD 970 EVO Plus 500 GB.

This isn’t an official benchmark, to be fair; this is just us testing how fast the machine is. As such, this is a plain vanilla Windows 10 machine, with no effort made to perform any optimizations. Our typical benchmark involves loading all of Stack Overflow into RavenDB, so we’ll have enough data to work with. Here is what things looked like midway through:

[Screenshot: the Studio traffic view, showing the sustained write rate midway through the import]

As you can see, the write speed we are able to get is impressive.

We were able to insert all of Stack Overflow, a bit over 52 GB, in three and a half minutes, at a sustained rate of about 300 MB / sec.

Then we tested indexing.

  • Map/Reduce on users by registration month (source ~6 million users) – under a minute.
  • Full text search on users – two and a half minutes.
  • Simple index on questions by tag (over 18 million questions & answers) – 11.5 minutes.
  • Full text search on all questions and answers – 33 minutes.

Remember, these numbers are for indexing everything for the first time. It is worth noting that RavenDB dedicates a single thread per index, to avoid hammering the system with too much work. That means that these indexes were building concurrently with one another.

Here is the system utilization while this was going on:

[Screenshot: system utilization while the indexes were building]

Finally, we tested some other key scenarios (caching disabled in all of them):

  • Reading documents (small working set, representing recent questions) – 243,371 req / sec at 512 MB / sec.
  • Full random reads (data size exceeds memory, so we hit the disk) – 15,393.66 req / sec at 13.4 MB / sec.

These two are really interesting numbers. In the first, we generate queries for specific documents over and over (with no caching). That means that RavenDB is able to answer them from memory directly. The idea is to simulate a common scenario of a working set that fits entirely in memory.

The second one is different. The data size on disk is 52 GB and we have 32 GB available for us. We generate random queries here, for different documents each time. We ensure that the queries cannot be served directly from memory and that RavenDB will have to hit the disk. As you can see, even under this scenario, we are doing fairly well. As an aside, it helps that the disk is good. We tried running this on HDD once. The results were… not nice.

The final test we did was for writes, writing small documents to RavenDB. We got 118,000 writes / sec on a sustained basis, with about 32 MB / sec in data throughput. Note that we could do more by playing with the system configuration, but we are already at a high enough rate that it probably wouldn’t matter.

All in all, that is a pretty nice machine.

time to read 4 min | 698 words

A map/reduce index in RavenDB can be configured to output its value to a collection. This seems like a strange thing to want to do at first. We already got the results of the index, in the index. Why do we want to duplicate that by writing them to collections?

As it turns out, this is a pretty cool feature, because it enables us to do quite a lot. It means that we can apply anything that works on documents to the results of a map/reduce index. This list includes:

  • Map/Reduce – so you can create recursive / chained map/reduce operations.
  • ETL – so you can push aggregated data to another location, allowing distributed aggregation at scale easily.
  • Subscription / Changes – so you can get notified when an aggregated value has been changed.

The key thing about the list above is that none of the items require you to know the ids of the generated documents upfront. Indeed, RavenDB uses document ids like the following for such documents:

[Screenshot: an automatically generated id for a reduce output document]

Technically speaking, you can compute the id. RavenDB uses a predictable algorithm to generate such an id, but practically speaking, it can be hard to figure out exactly what the inputs for the id generation are. That means that certain document-related features are not available. In particular, you can’t easily:

  • Include such a document
  • Load it directly (you have to query)

So we need a better option to deal with it. The way RavenDB solves this issue is by allowing you to specify a pattern for the output collection, like so:

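Here is a sketch of what such an index can look like using the strongly typed API (the index and field names are made up, and the property names for the output collection and the pattern reflect my understanding of the API):

    public class Sales_ByCompanyByYear : AbstractIndexCreationTask<Order, Sales_ByCompanyByYear.Result>
    {
        public class Result
        {
            public string Company;
            public int Year;
            public decimal Total;
        }

        public Sales_ByCompanyByYear()
        {
            Map = orders =>
                from order in orders
                select new Result
                {
                    Company = order.Company,
                    Year = order.OrderedAt.Year,
                    Total = order.Lines.Sum(l => l.Quantity * l.PricePerUnit)
                };

            Reduce = results =>
                from result in results
                group result by new { result.Company, result.Year } into g
                select new Result
                {
                    Company = g.Key.Company,
                    Year = g.Key.Year,
                    Total = g.Sum(x => x.Total)
                };

            // write the aggregated results out as documents in this collection
            OutputReduceToCollection = "YearlySummary";

            // the pattern: how to name the output (reference) documents
            PatternForOutputReduceToCollectionReferences = result =>
                $"YearlySummary/{result.Company}/{result.Year}";
        }
    }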

As you can see, we have a map/reduce index that groups by company and year. We output the results to the YearlySummary collection.

The pattern specifies how we should name the output documents. Here is the result of this index:

[Screenshot: the YearlySummary documents generated by the index]

And here is what this document looks like:

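Something like this (a sketch; the ids are made up):

    {
        "ReduceOutputs": [
            "YearlySummary/0000000000000000001-A"
        ],
        "@metadata": {
            "@collection": "YearlySummary/References"
        }
    }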

Huh?

This is strange, you probably think. This is the document that should show the summary for companies/9-A in 1998, but there is no such data here. Instead, you’ll notice that the document belongs to a references collection and that it points to the actual document with the data. Why do we do things this way?

A map/reduce index is free to output multiple results for the same reduce key, so we need to handle multiple documents here. We also have to deal with multiple reduce outputs that end up with the same pattern. For example, if we reduce by day, but our pattern only specifies the month, we’ll have multiple reduce keys that end up with the same pattern.

In practice, because RavenDB has great support for following documents by id, this doesn’t matter. Here is how I can use this index in a query:
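Something along these lines (a sketch, using the pattern from my index sketch above; load() inside a projection follows document ids):

    from Companies as c
    where c.Address.City = 'London'
    select {
        Company: c.Name,
        Sales1998: load(load('YearlySummary/' + id(c) + '/1998').ReduceOutputs)
    }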

This single query allows us to ask a question about companies (those that reside in London, in this case), as well as get the sales total data for a particular year. Note that this doesn’t involve joins or anything expensive. We have the information at hand and can just use it.

You’ll notice that the pattern we specified uses both items that we reduce by. But that isn’t mandatory. We can also use this:

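In the sketch above, that would mean changing the pattern to use only the company:

    PatternForOutputReduceToCollectionReferences = result =>
        $"YearlySummary/{result.Company}";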

Here we only specify the company in the pattern. What would be the result?

[Screenshot: the reference document, now pointing to one output document per year]

Now we get the sales totals for the company on a per-year basis.

We can now run the following query:
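Again, a sketch:

    from Companies as c
    where c.Address.City = 'London'
    select {
        Company: c.Name,
        SalesByYear: load(load('YearlySummary/' + id(c)).ReduceOutputs)
    }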

And this will give us the following output:

[Screenshot: the query output, with per-year sales totals for each company]

As you can imagine, this opens up quite a few possibilities for advanced features. In particular, it means that you can make it even easier to show and process aggregate information and work with complex object models.

time to read 1 min | 130 words

We have just rolled out GCP support for RavenDB Cloud. The support is still in the beta stage, because we like to be conservative, but you now have the ability to deploy RavenDB clusters on GCP at the click of a button.

All the usual features are supported. You can provision a new cluster, RavenDB Cloud will monitor and manage it for you, and you can focus on delivering actual value instead of deployment concerns. As usual with RavenDB Cloud, we are deployed to all public regions in GCP.

For the beta period, we only support production clusters (no single development instances) on GCP.

As usual, we would love to have your feedback.
