Ayende @ Rahien

Refunds available at head office

Inside RavenDB 3.0–Chapter 6 is done

Chapter 6 isn’t something that I actually planned to write. I thought that I would be writing a high level guide on how to use RavenDB indexes.

What came out was a detailed discussion on the actual RavenDB indexing process, including a deep look into the kind of environment that we have to deal with, the type of design decisions that we had to make and the balancing act between competing demands. I don’t know if anyone would be interested in actually reading it, since it is quite low level, and it wasn’t a lot of fun to write.

It is really hard to summarize six years of operational experience in a few words. It is even harder to show you the final result and discuss it without the full history of “we tried this, and that, and that as well”. But I think it ended up okay.

You can get it at the following URL.


RavenDB 3.0–Release Plans

Well, I just committed the last feature for RavenDB 3.0, and all the tests are passing. What we are doing now is just working through the studio and running verification tests. Tomorrow or Sunday we are going to go live on our own systems with RavenDB 3.0. And shortly after that we’ll do a release candidate, followed by the actual release.


Hanselminutes Podcast: Inside RavenDB with Michael Yarichuk

Go and listen to this podcast:

Scott chats with Michael Yarichuk about RavenDB. Michael works with Ayende and the RavenDB team on their document database. Scott is trying to learn about document databases and Michael helps him along the path, exploring those computer science concepts that make document databases unique.


Inside RavenDB 3.0

I’ve been working for a while on seeing where we can improve RavenDB, and one of the things that I wanted to address is having an authoritative source to teach people about RavenDB. Not just documentation; that is very good for reference, but not so good for giving you a guided tour and actually imparting knowledge. That is what I wanted to do: take the last five years or so of working on and with RavenDB and distill them.

The result is about a hundred pages so far (and likely to end up at three or four hundred pages). In other words, I slipped up and started churning out a book :-).

You can download the alpha version using the following link (which will be valid for the next two weeks). I want to emphasize that this is absolutely unedited, and there are likely to be error for zpelling in grammar*. Those will be fixed down the line; currently I’m mostly focused on getting the content out. Here is also the temporary cover.

[image: the temporary book cover]

Comments are welcome. And yes, this will be an actual book in the end, one that you can hold in your hand and hopefully use to beat someone over the head if they need to smarten up.

* The errors in that particular sentence were intentional.


Complex indexing, simplified

RavenDB indexes are Turing complete, which means that you can do whatever you want with them. This is a very powerful feature, but it also comes with a heavy burden: you can get yourself into some serious trouble. Take a look at this index:

[image: the original index definition]

We ran into it during a troubleshooting session with a customer, and it was frankly quite hard to figure out what was going on.

Luckily, I could just throw this into RavenDB 3.0, and look at the indexing options:

[image: the indexing options in the studio]

This turned the above index into this:

[image: the same index, converted to LINQ syntax]

Which was much clearer, but we could improve it a bit by removing the into clauses, so I ended up with:

[image: the simplified index]

Now, just from its shape, can someone tell me what the likely issue with this kind of index is?
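Since the screenshots are not reproduced here, the sketch below shows the general shape I am talking about; the collection and property names are made up:

// A hypothetical sketch, not the customer's actual index:
// every nested 'from' multiplies the number of index entries per document.
from order in docs.Orders
from line in order.Lines
from discount in line.Discounts
select new
{
    order.Customer,
    Product = line.Product,
    Discount = discount.Code
}

One classic problem with this shape is fanout: a single document with 50 lines and 10 discounts per line produces 500 index entries, and that multiplication is usually where the trouble starts.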


On site Architecture & RavenDB consulting availabilities: Malmo & New York City

I’m going to have availability for on site consulting in Malmo, Sweden (17 Sep) and in New York City, NY (end of Sep – beginning of Oct).

If you want me to come by and discuss what you are doing (architecture, NHibernate or RavenDB), please drop me a line.

I’m especially interested in people who need to do “strange” things with data and data access. We are building a set of tailored database solutions for customers now, and we have seen customers show a 750x improvement in performance when we gave them a database that was designed to fit their exact needs, instead of having to contort their application and their database into a seven dimensional loop just to try to store and read what they needed.

New RavenDB site design poll

As part of the 3.0 release, we are also going to do a full redesign of our website, and we would like to have your opinion on the matter.

Please take a look at the options and vote for your favorite: http://99designs.com/web-design/vote-u95dcs

Note, we’ll be changing the studio look & feel to match the website as well.


RavenDB 3.0 Ops: Live Tracing & Logging for production

You might have noticed a theme in where we are pushing RavenDB 3.0 :-). This is actually an effect of how we structured our work plans: we did a lot of the new features (Voron, for example) early on, because they required a lot of time to mature. We are now mostly completing the work related to the user interface and exposing operational data.

This feature comes back to the old black box issue. What is the system doing? Usually, you have no way to tell. Now, you could enable debug logging, but that is a very invasive operation, requiring you to update the config file on the server, perhaps restart it, and it isn’t really something that you can just do. This is especially true when we are talking about a production system under load, where full logging can be very expensive.

You can now set a dynamic logging listener on a running instance, including a production instance:

[image: configuring the log watch on a running instance]

Which then gives you a live streaming view of the log:

[image: the live log stream]

Think about it like doing a tail on a log file, except that this allows you to dynamically configure what logs you are going to watch, and it will only log while you are watching. This is perfect for situations such as “what is this thing doing now?”.

Having access to the log file is great, but it usually has too much information. That is why we also added the ability to peek into what requests are actually executing right now. This is also a production level feature, which will cause RavenDB to output all the requests so you can see them:

[image: the live HTTP request trace]

This can be very helpful in narrowing down “what are the clients asking the server to do”.

Like the production log watch, this is also a feature that is meant for production: you subscribe to this information, and you’ll get it. But if there are no subscribers, there is no cost to this.
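To make that last point concrete, here is a minimal sketch of the “no subscribers, no cost” pattern; it is purely illustrative and not the actual RavenDB internals:

using System;
using System.Collections.Concurrent;

public class TraceHub
{
    private readonly ConcurrentDictionary<Guid, Action<string>> subscribers =
        new ConcurrentDictionary<Guid, Action<string>>();

    public IDisposable Subscribe(Action<string> onTrace)
    {
        var id = Guid.NewGuid();
        subscribers.TryAdd(id, onTrace);
        // Disposing the subscription stops the watch for this client.
        return new DisposableAction(() =>
        {
            Action<string> ignored;
            subscribers.TryRemove(id, out ignored);
        });
    }

    public void Emit(Func<string> buildMessage)
    {
        // No one is watching: skip even building the message, so an
        // idle production server pays essentially nothing for the feature.
        if (subscribers.IsEmpty)
            return;

        var message = buildMessage();
        foreach (var onTrace in subscribers.Values)
            onTrace(message);
    }

    private class DisposableAction : IDisposable
    {
        private readonly Action action;
        public DisposableAction(Action action) { this.action = action; }
        public void Dispose() { action(); }
    }
}

The important bit is the early exit: the expensive work (formatting the log line, serializing the request details) only happens while someone is actually watching.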

The HTTP trace feature can be used to watch a single database (which can be very useful on RavenHQ) or all databases (for which you’ll need server admin level access). To watch the production log, you’ll need to be a server admin.


RavenDB 3.0 Days in Sweden

We are going to do a European version of the RavenDB Conf in just over a month, coming to both Malmo and Stockholm for a full day event.

You can see the full details here, but the basic idea is that we are going to be talking about RavenDB 3.0, including showing off all the new stuff, then show a real world use case for managing high scalability systems with RavenDB. We’ll go in depth into the codebase, then hear about how to make the best use of transformers and indexes, and end the day with a look forward into what has been slowly cooking in our labs, and the grand finale: a full guide on how best to build RavenDB applications in RavenDB 3.0.

We are actually going to arrive a day early, so if you are located in Malmo and want us to come by to do some on site RavenDB consulting or training on Sep 17, contact us (support@ravendb.net) and we’ll set it up.

You can register for the event using the following link.


RavenDB support guarantees

As part of the 3.0 release of RavenDB, we are going to do a remap of our support contracts. We’ll make a formal announcement about it later, but the idea is to offer the following levels:

  • Standard – about $500 a year per server, business day availability, maximum response within 2 business days.
  • Professional – about $2,000 a year per server, business day availability, maximum response within the same business day.
  • Enterprise – about $6,000 a year per server, 24x7, maximum response within two hours.

In addition to that, we’ll continue to have the community support via the mailing list. That said, I want to make it clear what kind of support guarantees we are giving on the mailing list:

  • None
  • Whatsoever

Put simply, the community mailing list is just that: a way for the community to discuss RavenDB. We look at it, and we try to help, but there is no one assigned to monitor the mailing list. It is pretty much the team waiting for the code to compile or the current test run to complete and deciding to check the mailing list instead of Twitter or the latest cat video.

Any support on the mailing list is provided on an ad hoc basis, and should absolutely not be something that you rely on. In particular, people with EMERGENCY or PRODUCTION ISSUE aren’t going to get any special treatment. If you need support (and if you run critical systems, you probably do), you need to purchase it. We provide guarantees and follow through for the commercial support packages.

I’m writing this post after an exchange of words in the mailing list, when a user complained that I went offline at 1 AM on a Saturday night and did not continue to provide him free support.


Guids are evil nasty little creatures that make me cry

You might have noticed that I don’t like Guids all that much. Guids seem like a great solution when you need to generate an id for something. And then reality intervenes, and you have a system problem that no one can make sense of.

Leaving aside the size of the Guid, or the fact that it is not sequential (two pretty major issues for an identifier), the main problem is that it is pretty much opaque to the users.

This was recently thrown in my face again as part of a question in the RavenDB mailing list. Take a look at the following documents. Do you think that those two documents belong to the same category or not?

[image: the two documents in question]

One of the problems that we discovered was that the user was searching for category 4bf58dd8d48988d1c5941735, while the document’s category was 4bf58dd8d48988d14e941735. And it drove everyone crazy trying to understand why this wasn’t working.

Here are those Guids again:

  • 4bf58dd8d48988d1c5941735
  • 4bf58dd8d48988d14e941735

Do you see it? I’m going to put in some visual space before showing you the difference.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

Here they are:

  • 4bf58dd8d48988d1c5941735
  • 4bf58dd8d48988d14e941735
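If you still don’t see it, a couple of lines of code can find it for us (these are the actual ids from above):

using System;
using System.Linq;

var a = "4bf58dd8d48988d1c5941735";
var b = "4bf58dd8d48988d14e941735";

// Find the first position where the two ids diverge.
var index = Enumerable.Range(0, Math.Min(a.Length, b.Length))
    .First(i => a[i] != b[i]);

Console.WriteLine("First difference at index {0}: '{1}' vs '{2}'",
    index, a[index], b[index]);
// Prints: First difference at index 16: 'c' vs '4'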

And if that isn’t enough for you to despise Guids, feel free to read them to someone else over the phone, or try to find them in a log file. Especially when you have to deal with several of those dastardly things.

I have a cloud machine dedicated to generating and disposing Guids, I hope that in a few thousands years, I can kill them all.


RavenDB 3.0 Status Update

We are nearly there. You can see that we are starting to expose all the cool stuff we have done through the past year.

The feature map is currently frozen, and we expect to have a release candidate within 2 weeks and a stable release 4 weeks after that. There is one big ticket item that still remains, and the rest is just UI work. We’ll probably do the release candidate with an intentionally ugly UI, then roll out the new theme for the UI in the actual release, since we don’t want to hold up the schedule just for that.

Things are very exciting here, and we can’t wait for you to take the fruits of our labor for a spin. I trust you’ll be impressed :-).


Production analysis and trouble shooting with RavenDB

The annoying thing about software in production is that it is a black box. It just sits there, doing something, and you have very little insight into what. Oh, you can look at the CPU usage and memory consumption, and you can try to figure out what is going on from what the system tells you this process is doing. But for the most part, this is a black box. And not even one that is designed to let you figure out what just happened.

With RavenDB, we have made a very conscious effort to avoid being a black box. There are a lot of endpoints that you can query to figure out exactly what is going on, and you can use different endpoints to figure out different problems. But in the end, while those were very easy for us to use, they aren’t really meant for end users. They are meant for our support engineers, mostly.

We got tired of the whole “give me the output of the following endpoints” deal. We wanted a better story, something that would be easier and more convenient all around. So we sat down and thought about this, and came up with the idea of the Debug Info Package.

[image: creating a debug info package from the studio]

This deceptively simple tool will capture all of the relevant information from RavenDB into a single zip file that you can mail to support. It also gives you a lot of details about the internals of RavenDB at the moment the package was produced:

  • Recent HTTP requests
  • Recent logs
  • The database configuration
  • What is currently being indexed?
  • What are the current queries?
  • What tasks are being run?
  • All the database metrics
  • Current status of the pre-fetch queue
  • The database live stats
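Conceptually, the package is just a bundle of the existing diagnostics endpoints. Here is a rough sketch of the idea; the endpoint names and URL are illustrative, not the exact RavenDB routes:

using System;
using System.IO;
using System.IO.Compression;
using System.Net.Http;

class DebugInfoPackage
{
    static void Main()
    {
        // Illustrative endpoint names, not the exact RavenDB routes.
        var endpoints = new[] { "stats", "metrics", "indexes", "queries", "config" };

        using (var http = new HttpClient())
        using (var zip = ZipFile.Open("debug-info.zip", ZipArchiveMode.Create))
        {
            foreach (var endpoint in endpoints)
            {
                var url = "http://localhost:8080/databases/Northwind/debug/" + endpoint;
                var json = http.GetStringAsync(url).Result;

                var entry = zip.CreateEntry(endpoint + ".json");
                using (var writer = new StreamWriter(entry.Open()))
                    writer.Write(json);
            }
        }
    }
}

The real feature gathers quite a bit more than that, but the value is the same: one artifact to attach to a support call, instead of a dozen copy/pasted endpoint responses.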

And if that wasn’t enough, we have the following feature as well:

[image: the option to capture full stack traces]

We get the full stack traces of the currently running process!

You can see how this looks in full here:

[image: the captured stack traces, in full]

But the idea is that we have cracked open the black box, and it is now so much easier to figure out what is going on!


RavenDB On Azure

It took a while, but it is finally here. The most requested feature on the Azure Store:

[image: the RavenHQ announcement on the Azure Store]

This is currently only available in the East US region. That is going to change, but it will take a bit of time. You can vote on which regions you want RavenHQ on Azure to expand to.

RavenHQ on Azure can be used in one of two ways. You can purchase it via the Azure Marketplace, in which case you have to deal only with a single invoice, and you can manage everything through the Azure site. However, the Azure Marketplace doesn’t currently support prorated and tiered billing, which means that the plans that you purchase in the marketplace have hard limits on data. You could also purchase those same plans directly from RavenHQ and take advantage of usage based billing, which allows you to use more storage than what’s included in the plan at a prorated cost.

RavenHQ is now offering a lower price point for replicated plans, so you don’t have to think twice before jumping into the high availability option.


What is my query doing?

Recently we had to deal with several customer support requests about slow queries in RavenDB. Just to give you some idea of the scope: we consider a query slow if it takes more than 50ms to execute (excluding client side caching).

In this case, we had gotten reports about queries that took multiple seconds to run. That was strange, but we were able to reproduce it locally, at which point we were hit with a “Duh!” moment. In all cases, the underlying issue wasn’t that the query took a long time to execute; it was that the result of the query was very large. Typical documents were in the multi megabyte range, and the query returned scores of those. That means that the actual cost of the query was just transporting the data to the client.

Let us imagine that you have this query:

session.Query<User>()
    .Where(x => x.Age >= 21)
    .ToList();

And for some reason it is slower than you would like. The first thing to do would probably be to see what the raw execution times are on the server side:

RavenQueryStatistics queryStats;
session.Query<User>()
    .Customize(x => x.ShowTimings())
    .Statistics(out queryStats)
    .Where(x => x.Age >= 21)
    .ToList();

Now you have the following information:
  • queryStats.DurationMilliseconds – the server side total query execution time
  • queryStats.TimingsInMilliseconds – the server side query execution time, per each distinct operation
    • Lucene search – the time to query the Lucene index
    • Loading documents – the time to load the relevant documents from disk
    • Transforming results – the time to execute the result transformer, if any
  • queryStats.ResultSize – the uncompressed size of the response from the server

This should give you a good indication of the relative costs.

In most cases, the issue was resolved by the customer specifying a transformer and selecting just the properties they needed for their use case. That transformed (pun intended) a query that returned 50+ MB into one that returned 67KB.
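Here is a hedged sketch of what that kind of fix looks like; the transformer name and the projected properties are made up for the example:

// using System.Linq; using Raven.Client.Indexes;
public class Users_NameAndAge : AbstractTransformerCreationTask<User>
{
    public class Result
    {
        public string Name { get; set; }
        public int Age { get; set; }
    }

    public Users_NameAndAge()
    {
        // Project just the properties the client actually needs,
        // instead of shipping whole multi megabyte documents.
        TransformResults = users => from user in users
                                    select new { user.Name, user.Age };
    }
}

var results = session.Query<User>()
    .Where(x => x.Age >= 21)
    .TransformWith<Users_NameAndAge, Users_NameAndAge.Result>()
    .ToList();

The projection is tiny compared to the full documents, so the transport cost drops accordingly.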


Metrics hunt in RavenDB

You might have noticed that we are paying a lot of attention to operational concerns in RavenDB 3.0. This is especially true because we moved away from performance counters to metrics.net, which makes it much easier and more lightweight to add metrics to RavenDB.

As a result, we are adding a lot of stuff that will be very useful for ops teams, from monitoring the duration of queries to the bandwidth available for replication, to a host of other stuff.
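To give a sense of how lightweight this is in code, here is a rough sketch; the exact metrics.net API surface may differ between versions:

// using Metrics;
var requests = Metric.Meter("http.requests", Unit.Requests);
var queryTimer = Metric.Timer("query.duration", Unit.Requests);

// Somewhere in the request handling path:
requests.Mark();
using (queryTimer.NewContext())
{
    // Execute the query. Unlike a raw performance counter, the timer
    // tracks rates and duration percentiles, not just a single value.
}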

What I wanted to ask is what kind of things do you want us to track?


Small touches: Complex text in RavenDB

This is what we call a “mini feature”, something that you’ll probably not notice unless it is pointed out to you. Often, we want to store documents that contain multi line string properties. JSON has a very simple way to handle that: a line break inside a JSON string is just an escaped “\n”, so the whole value stays on a single line:

[image: a multi line string shown as one long JSON line]

And it works, and if the text is small, it is even readable. But it doesn’t really work for anything even remotely complex or long. So we have worked to fix that:

[image: the same document, with the string displayed across multiple lines]

Now you can actually read this much more easily. We run into this when we look at stack trace information, where without line breaks, it is nearly impossible to see what is going on.


RavenDB Replication Topology Visualizer

Even more goodies are coming in RavenDB 3.0. Below you can see how we visualize the replication topology in a RavenDB cluster. You can also see that the t5 database is down (marked in red).

[image: the replication topology graph, with the t5 database marked in red]

This is important, since this gives us the ability to check the status of the topology from the point of view of the actual nodes. So a node might be up for one server, but not for the other, and this will show up here.

Besides, it is a cool graphic that you can use in your system documentation, and it is much easier to explain :-).


Geo distribution and high availability in RavenDB

A customer asks in the mailing list:

Due to data protection requirements, we have to store a user’s data closest to where they signed up. For example, if I sign up and I’m in London, my data should be stored in the EU.

Given this, how do we ensure, when replicating (we will have level 4 redundancy eventually), that any data originally written to a node within, say, the EU does not get replicated to a node in the States?

The answer here is to use two features of RavenDB together: sharding and replication. It is a good thing that they are orthogonal and can work together seamlessly.

Here is how it looks:

[image: users sharded by region, with replication between servers inside each region]

The London based user will be sharded to the Ireland server. This server will be replicating to other Ireland based servers (or to other servers in the EU). The data never leaves the EU (satisfying the data protection rules), but we get the high availability that we desire.

At the same time, Canadian customers will be served from nearby US based servers, and they, too, will be replicating to nearby servers.

From a deployment standpoint, what we need to do is the following:

  • Set up a geo distributed sharded cluster using the user’s location.
  • Each shard would be a separate server, which is replicating to N other servers in the nearby geographical area.
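Here is a minimal sketch of the sharding side, assuming a Region property on the user document and made up server URLs; the replication between the servers inside each region is configured separately:

// using System.Collections.Generic;
// using Raven.Client; using Raven.Client.Document; using Raven.Client.Shard;
var shards = new Dictionary<string, IDocumentStore>
{
    { "eu", new DocumentStore { Url = "http://ireland-1:8080" } },
    { "na", new DocumentStore { Url = "http://virginia-1:8080" } },
};

var strategy = new ShardStrategy(shards)
    .ShardingOn<User>(u => u.Region); // "eu" or "na", decided at signup

var store = new ShardedDocumentStore(strategy).Initialize();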

And that is pretty much it…


Avoid where in a reduce clause

We got a customer question about a map/reduce index that produced the wrong results. The problem was a mismatch between the conceptual model and how Map/Reduce actually works in RavenDB.

Let us take the following silly example. We want to find all the animal owners that have more than a single animal. We define an index like so:

// map
from a in docs.Animals
select new { a.Owner, Names = new[]{a.Name} }

// reduce
from r in results
group r by r.Owner into g
where g.Sum(x=>x.Names.Length) > 1
select new { Owner = g.Key, Names = g.SelectMany(x=>x.Names) }

And here is our input:

{ "Owner": "users/1", "Name": "Arava" }    // animals/1
{ "Owner": "users/1", "Name": "Oscar" }    // animals/2
{ "Owner": "users/1", "Name": "Phoebe" }   // animals/3

What would be the output of this index?

At first glance, you might guess that it would be:

{ "Owner": "users/1", "Names": ["Arava", "Oscar", "Phoebe" ] }

But you would be wrong. The actual output of this index is… nothing. This index actually has no output.

But why?

To answer that, let us ask the following question. What would be the output for the following input?

{ "Owner": "users/1", "Name": "Arava" } // animals/1

That would be nothing, because it would be filtered out by the where in the reduce clause. That is the underlying reason why this index has no output.

If we feed it the input one document at a time, it has no output. It is only if we give it all the data upfront that it has any output. But that isn’t how Map/Reduce works in RavenDB. Map/Reduce is incremental and recursive, which means that we can (and do) run it on individual documents or blocks of documents independently. In order to ensure that, we actually always run the reduce function on the output of each individual document’s map result.

That, in turn, means that the index above has no output.

To write this index properly, I would have to do this:

// map
from a in docs.Animals
select new { a.Owner, Names = new[]{a.Name}, Count = 1 }

// reduce
from r in results
group r by r.Owner into g
select new { Owner = g.Key, Names = g.SelectMany(x=>x.Names), Count = g.Sum(x=>x.Count) }

And then do the Count > 1 filtering in the query itself.
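For completeness, here is a hedged sketch of the corrected index as a strongly typed index creation task, with the filter moved to query time; the class and property names are made up, and an Animal class with Owner and Name properties is assumed:

// using System.Linq; using Raven.Client.Indexes;
public class Animals_ByOwner : AbstractIndexCreationTask<Animal, Animals_ByOwner.Result>
{
    public class Result
    {
        public string Owner { get; set; }
        public string[] Names { get; set; }
        public int Count { get; set; }
    }

    public Animals_ByOwner()
    {
        Map = animals => from a in animals
                         select new { a.Owner, Names = new[] { a.Name }, Count = 1 };

        Reduce = results => from r in results
                            group r by r.Owner into g
                            select new
                            {
                                Owner = g.Key,
                                Names = g.SelectMany(x => x.Names),
                                Count = g.Sum(x => x.Count)
                            };
    }
}

// The filter that used to (incorrectly) live in the reduce clause
// now happens at query time:
var owners = session.Query<Animals_ByOwner.Result, Animals_ByOwner>()
    .Where(x => x.Count > 1)
    .ToList();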


Map/Reduce visualizer, take II

Yes, I talked about this already, but we made some additional improvements that make it even cooler.

Here is a document:

[image: a sample document]

And here is the index:

[image: the map/reduce index definition]

Now, let us look at what happens when we go to the map/reduce visualizer:

[image: the map/reduce visualizer, fully zoomed out]

This is a highly zoomed out picture; let us zoom in a little (click it for a higher zoom):

[image: the visualizer, zoomed in]

As you can see, we get all the documents that share any reduce keys with the document we started from. That is going to make explaining, and debugging, map/reduce so much easier.

For that matter, here is an example of the visualizer showing us a multi step reduce, which is an optimization that happens when we have a lot of entries for the same reduce key. Now we can actually show you how this works:

[image: a multi step reduce in the visualizer]

Pretty cool!


Introducing inefficiencies into RavenDB, on purpose

Yes, I chose the title on purpose. The topic of this post is this issue. In RavenDB, we use replication to ensure high availability and load balancing. We have been using it for the past five years now, and in general, it has been great, robust and absolutely amazing when you need it.

But like all software, it can run into interesting scenarios. In this case, we had three nodes; call them A, B and C. In the beginning, we had just A & B: node A was the master node, to which all the data was written, and node B was there as a hot spare. The customer wanted to upgrade to a new RavenDB version, and they wanted to do that with zero downtime. They set up a new node with the new RavenDB server, and because A was the master server, they decided to replicate from node B to the new node. Except… nothing appeared to be happening.

No documents were replicating to the new node, yet there was a lot of CPU and I/O activity, and nothing to show for it. The customer opened a support call, and it didn’t take long to figure out what was going on. They had set up the replication between the nodes with the default “replicate only documents that were changed on this node”. However, since this was the hot spare node, no documents were ever changed on it. All the documents on that server were replicated from the primary node.

The code for that actually looks like this:

public IEnumerable<Doc> GetDocsToReplicate(Etag lastReplicatedEtag)
{
    foreach (var doc in Docs.After(lastReplicatedEtag))
    {
        // Skip documents that arrived here via replication;
        // only locally modified documents are sent onwards.
        if (ModifiedOnThisDatabase(doc) == false)
            continue;
        yield return doc;
    }
}

var docsToReplicate = GetDocsToReplicate(etag).Take(1024).ToList();
Replicate(docsToReplicate);

However, since there were no documents that were modified on this node, this meant that we had to scan through all the documents in the database. Since this was a large database, that process took time.

The administrators on the server noted the high I/O, saw that a single thread was constantly busy, and decided that this was likely a hung thread. This being the hot spare, they restarted the server. Of course, that aborted the operation midway, and when the database started, it just started everything from scratch.

The actual solution was to tell the database, “just replicate all docs, even those that were replicated to you”. That is the quick fix, of course.

The long term fix was to make sure that we abort the operation after a while, report to the remote server that we scanned up to a certain point and had nothing to show for it, and go back to the replication loop. The database would then query the remote server for the last etag that was replicated; the remote server would respond with the etag that we asked it to remember, and we would continue from that point.
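In sketch form, the change looks something like this; the ReportProgress call and the timeout handling are illustrative, not the actual RavenDB code:

public IEnumerable<Doc> GetDocsToReplicate(Etag lastReplicatedEtag, TimeSpan maxScanTime)
{
    var watch = Stopwatch.StartNew();
    foreach (var doc in Docs.After(lastReplicatedEtag))
    {
        if (watch.Elapsed > maxScanTime)
        {
            // Nothing to send yet, but tell the destination how far we
            // scanned, so the next replication round resumes from here
            // instead of starting the scan over from scratch.
            ReportProgress(doc.Etag);
            yield break;
        }

        if (ModifiedOnThisDatabase(doc) == false)
            continue;

        yield return doc;
    }
}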

The entire process is probably slower (we make a lot more remote calls, and instead of just going through everything in one go, we have to stop, make a bunch of remote calls, then resume). But the end result is that the process is now resumable. And an admin will be able to see some measure of progress for the replication, even in that scenario.


RavenDB Events

In the following months, there are going to be quite a few RavenDB Events, and I have been remiss in talking about them.

  • The Triangle RavenDB User Group is meeting in Raleigh. Jul 30, you can register here.
  • Mauro is going to be giving a 3 day course on RavenDB in London. Aug 11 – Aug 13, you can register here.
  • The Arizona RavenDB User Group is meeting in Scottsdale. Aug 12, you can register here.
  • We are going to be giving 2 full day events in Sweden. Sep 18 – Sep 19, you can register here.
  • I’m also going to be speaking at NSB Conf in New York City. Sep 29 – Sep 30, you can register here.
  • RavenDB in Action should come out in October. You can purchase the early access copy here.
  • We are going to show up for Oredev with a lot of exciting stuff. Nov 4 – Nov 7, you can register here.

We also have a couple of surprises, but I’ll keep them for later.


Help us select a theme for RavenDB 3.0

We are getting closer & closer to the 3.0 release. And as I mentioned, we are doing a lot of UI work. I’m quite excited about that, even though I don’t think you’ll be aware of all of the changes that we are adding.

But here is one that will be very visible: the new theme for the studio. So far, we have gone with the default theme and pushed the actual design for later. Now we have some options to consider:

Dark:

[image: Dark theme]

Darkly:

[image: Darkly theme]

Flatly:

[image: Flatly theme]

Lumen:

[image: Lumen theme]

RClassic:

[image: RClassic theme]

Space lab:

[image: Space lab theme]

What do you think is best?
