We just finished rolling our internal servers back from 3.0 to 2.5. That was quite unpleasant, and users actually noticed it.
It is always better for us to get egg all over our face than for a customer to, though. The actual issue that we ran into was pretty interesting.
The problem is that the database we use for running this blog (as well as most of our internal systems) has been through… a lot. It has gone through pretty much every released version, and many that weren’t actually released.
That means that from a storage perspective (only of interest to RavenDB developers), it is a bit of a mess. That in turn meant that we had to do extra work to convert the storage from the 2.5 format to the 3.0 format. The extra work used enough memory that we hit our memory usage limits, and the conversion to 3.0 failed.
That meant that it was stuck. That is actually one of the reasons that we test those things on our own systems, so that was great.
The not so great part was that we also uncovered another interesting bug (actually, several of them in conjunction). The new studio had a tendency to read the stats from all the available databases, if the number of databases was small enough. That was done so we can show the number of documents in each database on the databases page.
That meant that we would effectively start all of them in parallel (and consume resources that weren’t actually needed).
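To make that concrete, here is a rough sketch in Python. This is not RavenDB's actual code; `DatabaseRegistry`, `stats`, `stats_if_loaded` and `eager_stats` are all hypothetical names made up for illustration. It assumes a server that loads databases lazily, so that merely asking for stats forces a database to start.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


class DatabaseRegistry:
    """Hypothetical stand-in for a server that loads databases lazily."""

    def __init__(self):
        self._loaded = {}              # name -> document count, set on first load
        self._lock = threading.Lock()

    def _load(self, name):
        # Loading is the expensive step (opening storage, allocating memory).
        with self._lock:
            if name not in self._loaded:
                self._loaded[name] = 0  # placeholder document count
            return self._loaded[name]

    def stats(self, name):
        # Asking for stats forces the database to start.
        return {"name": name, "documents": self._load(name)}

    def stats_if_loaded(self, name):
        # Alternative: only report on databases that are already running.
        with self._lock:
            if name in self._loaded:
                return {"name": name, "documents": self._loaded[name]}
        return None

    def loaded_count(self):
        with self._lock:
            return len(self._loaded)


def eager_stats(registry, names):
    # One stats request per database, issued in parallel: starts every database.
    with ThreadPoolExecutor(max_workers=max(1, len(names))) as pool:
        return list(pool.map(registry.stats, names))
```

Calling `eager_stats` for the databases page loads every database at once, even the ones nobody is going to open. Something like `stats_if_loaded` is one possible way to avoid that (an assumption for the sake of the example, not a description of the fix that actually went in).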
And that, in turn, exposed a race condition (in Esent!) that can result in a hard process crash. That was the hardest thing to get past, because obviously we don’t have source access to Esent, and it was kind of hard to pinpoint where this was actually happening and why.
All fixed and good now, and ready to try again.