Ayende @ Rahien

It's a girl

What is up with RavenDB 2.0?

The last stable version of RavenDB 1.0 was released about six months ago. Ever since then, we have been hard at work adding new features, improving things and in general doing Good Work.

Looking back at the last half a year of work, it is actually quite hard to pick out the major stuff. There was so much we did. That said, I think that I can pick out some things to salivate over for 2.0.

First & foremost, we drastically improved the RavenDB Management Studio (RDBMS). We spent a lot of time there, and you can now do pretty much everything you want in RavenDB through the studio. This seems like a stupid major feature, right? After all, it is just the UI that was updated, and RavenDB is actually the server stuff. But it provides you with at least an order of magnitude better tooling and makes it much easier to work with RavenDB.


And that is really just the tip of the iceberg in terms of what is new in the studio.

But even though the changes to the studio are probably the most obvious ones, we have done a tremendous amount of work on the server itself. Here are some of the highlights.

Operational Support – We spent a lot of time on making sure that ops people will have a lot of reasons to be happy with this new release. You can monitor RavenDB using any standard monitoring tool (SCOM, MOM, HP OpenView, etc.). We expose a lot more data through performance counters and logs. And we even added dedicated endpoints that you can hit to gather monitoring information (which database is currently doing what, for example), giving ops the full view of what is actually going on there.

Core Bundles – We have always had bundles, and we implement a lot of features through them. But in 2.0, we took a lot of the bundles and moved them into the core, so now you can configure & use them easily.

Setting them up when creating a new database:


Setting up replication through the UI:


We have management UI support for all of the core bundles now, which makes using them a lot easier.

Changes() API – this allows you to get PUSH notifications from the RavenDB server. You can subscribe to events from a particular document, a set of documents, or an index. That allows you to notify the user if something has changed without the need to do any expensive polling. The studio was actually doing a lot of polling, but we changed pretty much all of that to be PUSH based now.

Here is a usage sample:

store.Changes()
    .ForDocument("users/1")
    .Subscribe(notification =>
        {
            using (var session = store.OpenSession())
            {
                var user = session.Load<User>(notification.Name);
                Console.WriteLine("Wow! " + notification.Name + " changed. New name: " + user.Name);
            }
        });

Yes, it is as easy as this.

Eval Patching – you can now run JS scripts against your objects, to modify them on the server side. This is perfect if you want to do migrations (if you actually need to, usually you don’t), want to run some complex modification on the server side or just need to do something as an administrator.
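Here is a rough sketch of what running such a patch looks like from the client, assuming the 2.0-era ScriptedPatchRequest API (treat the exact names as illustrative):

```csharp
// Sketch: running a JS script against a single document, server side.
// Inside the script, 'this' is the document being patched.
store.DatabaseCommands.Patch("users/1",
    new ScriptedPatchRequest
    {
        Script = "this.FullName = this.FirstName + ' ' + this.LastName;"
    });
```

The script runs on the server, so you never have to pull the document over the wire just to modify it.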


More authentication options & control – We now have a far easier time defining and controlling who can access the server, and what databases they can touch.

Here is an example of allowing the network service to have access to the tryout database:


And here we have an example of defining API Keys:


This allows you to define an API key for a particular application very easily (vs. defining users, which is usually how you handle admin / ops people coming in).

Indexing Performance – We have spent a lot of time on optimizing the way we handle indexing. In particular, we now do a lot of work to make sure that we don't wait for IO and we use as many cores as we can to get things done even faster. Even when you throw a lot of data at RavenDB, indexing catches up very quickly and the indexing latency is far lower.

Better map/reduce – Our map/reduce implementation has been drastically improved, allowing us to re-process and update existing results with far less computational & IO cost at scale.

Better facets – We have completely revamped the facet support, reducing the per-facet-value cost that used to be there. Now we are able to generate facets quickly regardless of how many facet values you have in a facet, and we even support paging & sorting of facets.

Better IN Query – This sounds silly, but supporting an efficient IN query is important for a lot of scenarios, especially when the number of items in the IN is large. We have dedicated support for doing that efficiently and easily now.
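For example, a sketch using the In() extension method from the client LINQ support (the names here are assumptions about the client API):

```csharp
// Sketch: an IN query over a set of ids, sent to the server as a single
// efficient query instead of a pile of OR clauses.
var ids = new[] { "users/1", "users/2", "users/3" };
var matching = session.Query<User>()
    .Where(user => user.Id.In(ids))
    .ToList();
```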

Async – We matched all the standard client capabilities in our Async API; that means we support async sharding, async replication failover, and the whole shebang. It means that using RavenDB with C# 5.0 is just as easy as you can imagine; it has all been done for you.
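In practice, the async session mirrors the sync one. A sketch, assuming the 2.0 async session API:

```csharp
// Sketch: the async client API matches the sync client, just awaitable.
using (var session = store.OpenAsyncSession())
{
    var user = await session.LoadAsync<User>("users/1");
    user.Name = "New name";
    await session.SaveChangesAsync();
}
```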

Sharding improvements – In addition to the async sharding support, we worked on improving sharding itself, giving you more integration & extension points and making the whole thing just a tad bit smarter by default.

Cloud Backup – Backing up is hard to do, and we have decided to make it easier. In addition to supporting all enterprise backup tools, and having the ability to manually trigger backups or export (full or incremental), we now have the ability to schedule automatic backups to the cloud.


You can set up periodic backups to Amazon Glacier or Amazon S3, and it will incrementally back up your database there as we go along.

CSV Import / Export – Silly, but data still goes around mostly in flat files, and RavenDB now supports importing & exporting data in CSV format, so you can easily pull some data into Excel or push in some data that you got from some other source.

Debuggability – We now expose a lot more hooks for you to use when you debug things. You can look directly into how the index is built (index entries, stored fields, etc.), and you can inspect the intermediate steps of RavenDB's map/reduce processes, how the IO process for loading documents works, indexing times, and a lot more.

There are more, but I think that this is enough for now.


Published at

Originally posted at

Comments (17)

Dogfooding is CRITICAL, the story of a bug

We use RavenDB to handle our entire internal infrastructure. There are several reasons for that. To start with, RavenDB is a joy to work with, so it cuts down on the dev time for new internal features. More importantly, however, it lets us try RavenDB in real-world conditions. That is important, because as much as you try, you can never really simulate real production environments.

But even more important is the concept of production data. The important thing about production data is that it is dirty. You don’t get data that is all nice and simple and all the same as you have when you generate the data, or when you are creating unit and system tests.

In this case, we had an interesting problem. We had a map/reduce index similar to this one:

public class UsersByCountry : AbstractIndexCreationTask<User, UsersByCountry.Result>
{
    public class Result
    {
        public string Country { get; set; }
        public int Count { get; set; }
    }

    public UsersByCountry()
    {
        Map = users =>
              from user in users
              select new { user.Country, Count = 1 };
        Reduce = results =>
                 from result in results
                 group result by result.Country
                 into g
                 select new
                 {
                     Country = g.Key,
                     Count = g.Sum(x => x.Count)
                 };
    }
}

This is a pretty standard thing to have, but I noticed that we had a difference from the 1.0 results in our production data. Investigating further, it appeared that this was the root issue:

using (var session = store.OpenSession())
{
    session.Store(new User { Country = "Israel" });
    session.Store(new User { Country = "ISRAEL" });
    session.Store(new User { Country = "israel" });
    session.SaveChanges();
}

With RavenDB 1.0, due to fairly esoteric implementation details, this would generate the following values:

  • {"Country": "Israel", "Count": 1}
  • {"Country": "ISRAEL", "Count": 1}
  • {"Country": "israel", "Count": 1}

This matches what you’ll get using LINQ to Objects, and that was fine by me. I could see an argument for doing the reduce in a case-insensitive manner, but then you have the argument about which of the representations is the one you should use, etc.

In the version that I tested, I actually got:

  • {"Country": "Israel", "Count": 3}
  • {"Country": "ISRAEL", "Count": 3}
  • {"Country": "israel", "Count": 3}

Now, that was wrong. Very wrong.

As it turned out, once I managed to recover from the palpitations that this issue gave me, the actual reason was pretty easy to figure out. Some of our code was case sensitive, some of our code was not. That meant that under this condition, we would feed the map/reduce engine with duplicate entries, per the number of various casing combinations that we had.

Spooky bug, but once we narrowed down what the actual problem was, very easy to resolve.



Clearing up versioning, 1.2 vs 2.0 in RavenDB

When we started full-fledged active development on the next version of RavenDB, we used the tag name “1.2” for that release version. We wanted to take the time and punch through some big-ticket items that were going to be very hard to do with small incremental releases. I think we did quite a good job of that. But one of the major things that we wanted to do was to be able to go back and fix some of the early design decisions that we made.

Some of them were minor (null handling improvements during indexing), some of them were pretty major (changing indexing format for dates to human readable ones). Some are obvious (moving things around in the client API to make it easier to avoid mistakes), some are not so much (far superior handling of DateTime and DateTimeOffset). Interestingly enough, we didn’t get any big breaking changes, only a lot of minor course corrections. For the most part, I expect people to move from 1.0 to 2.0 without even noticing that things got much better.

We were still calling this release RavenDB 1.2, but we had some breaking changes, and we had done a tremendous amount of work on the system. We started getting pushback from users and customers about the version number. This isn’t just some minor release; this is a major step up.

Thus, what we were calling 1.2 became 2.0. And I’ll be posting about what exactly is new in RavenDB 2.0 in just a bit…



Upgrading to RavenDB 2.0 in our production systems

Well, we tried to upgrade to RavenDB 2.0 build 2152, and then we quickly had to roll things back. The reason for that, by the way, was authentication issues.

In RavenDB 2.0, we did a lot of stuff to tighten security. In particular, we actively try to push you away from using the system database, and into using named databases. And we also made it easier to specify which databases each user has access to (and what kind of access).

Unfortunately, while it worked perfectly during testing, going to production revealed that we had a few… issues in it.

Surprisingly enough, those issues weren’t in the logic or how it worked. Those issues boiled down to… case sensitivity. It turned out that both the user names and the database names were case sensitive in the permission configuration, leading to… well, you probably saw that the site threw a 401 error.

Fixing that config issue was pretty high on my list of things to fix. If only because case sensitivity in this place was just a trap waiting to happen. Once we figured out what was going on, it was relatively easy to fix. This post is now written on a system running the very latest version of RavenDB 2.0.



What is up with RavenDB 2.0? Performance…

Well, one thing that we put a lot of focus on was performance. In order to test that, I had a dataset of 4.66 million documents (IMDB data set, if you care) as well as two indexes defined.

The results for RavenDB 2.0 (drum roll):

Loading 4.66 million records in 44 minutes, an average rate of roughly half a millisecond per document.

But wait, what about the indexes? Well, RavenDB indexes documents as they come in, so as we were inserting the documents, they were indexed along the way. That meant that 11 seconds after we were done putting 4.66 million documents into RavenDB, we were done indexing (across all indexes).

Pretty nice perf, even if I say so myself.



Design patterns in the test of time: Command, Redux

In my previous post about the command pattern, I gushed about how much I loved it. That doesn’t mean that the command pattern as originally envisioned is still completely in fashion.

In particular, the notion of “Undo” was one of the major feathers in the command pattern’s cap. Today, that is almost never the case. Sure, if you are building an application such as an editor, something like Photoshop or a text editor, the notion of commands with the ability to have an undo stack is very compelling. Other than that, however, it is a very rare need.

In most business scenarios, there really is no good way to undo things. How would you implement SendEmailCommand.Undo(), for example? But even something like PlaceOrder.Undo() is a lot more complex and harder to do than you would think. The mere notion of undoing the operation assumes that it isn’t going to have any side effects. But cancelling an order may result in cancellation fees, require you to ship back things you received, etc. It is not “Undoing PlaceOrder”; rather, it is a whole different and distinct business process, usually represented by another command: CancelOrder.

Another common issue that people have is the degeneration of the entire architecture to something like:

CommandExecuter.Execute(Command cmd);

To that I answer, more power to you! I love code that is composed of a lot of small classes all doing things roughly the same way. There is no easier way to understand a system, and it allows you to quite easily add additional functionality. That said, mind how you handle routing in that scenario. I have seen people go down the “when a request comes to this URL, let us invoke the following commands” road in XML. One of the reasons that people dislike this approach is how you actually call it. If just getting to the command executer is hard and involved, you lose a lot of the advantages.
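To make that concrete, here is a minimal sketch of that shape (hypothetical names, not code from any particular project):

```csharp
using System;

// A hypothetical sketch of the "many small command classes" architecture.
public interface ICommand
{
    void Execute();
}

public class ChargeCardCommand : ICommand
{
    public void Execute()
    {
        // talk to the payment gateway here
    }
}

public class CommandExecuter
{
    // Cross-cutting concerns (logging, transactions, error handling)
    // live in exactly one place.
    public void Execute(ICommand cmd)
    {
        Console.WriteLine("Executing " + cmd.GetType().Name);
        cmd.Execute();
    }
}
```

The point is that getting a request to `new ChargeCardCommand()` should be about as cheap as calling a method directly; if the routing is hard and involved, the pattern starts to fight you.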

Something else popped up in the mailing list, and I really dislike it: the notion of a Composite Command, a command that can execute multiple commands. Now, from a programming point of view, I can easily see why you would want to do that. The PlaceOrderCommand operation is composed of a lot of smaller commands. But, and this is important, the notion of Composite Commands basically means that you get the same thing as the PlaceOrderCommand, but you just lost the name. And naming is important. Almost as important, error handling is quite different between different business scenarios, and you sometimes end up with something like:

var placeOrderCommand = new CompositeCommand(
    new RegisterOrderCommand(),
    new ReserveStockCommand(),
    new ChargeCardCommand(),
    new ShipOrderCommand()
)
{
    Sequential = true,
    StopOnError = true
};

And this is the simple case; how do you handle an error in ShipOrderCommand after you have already charged the card, for example?

RavenDB 2.0 Release Candidates

Well, it is about that time. We have run out of work on RavenDB 2.0, and now we pretty much have no choice but to get into the release cycle.

We aren’t done yet, but we are close. Close enough that we started running the current build in our own production systems. There is still stuff to be done, in particular, performance and bug fixes.

You can get the latest stuff here. We would love to get some people to hammer at things and see if they can break it.



Don’t play peekaboo with support, damn it!

One of the things that we really pride ourselves on at Hibernating Rhinos is the level of support. Just to give you some idea, today & yesterday we had core team members on the phone with people (not customers yet) who have been having problems with RavenDB for quite some time.

Now, I understand that you may not always have the information to give, but what you have, give me! So I can help you.

From a recent exchange in the mailing list:

var clickCount = session.Query<TrackerRequest>()
    .Where(t => t.TrackerCreated == doc.Created)
    .Where(t => t.Type == Type.Click)
    .Count();

This gives:

"Non-static method requires a target"

To which I replied:

What is the full error that you get? Compile time? Runtime?

The answer I got:

This is a document I'm trying to 'Count':


{
  "Created": "2012-11-15T16:12:42.1775747",
  "IP": "",
  "TrackerCreated": "2012-11-15T14:12:16.3951000Z",
  "Referrer": "http://example.com",
  "Type": "Click"
}
Raven terminal gives:

Request # 172: GET     -     3 ms - <system>   - 200 - /indexes/Raven/DocumentsByEntityName?query=Tag%253ATrackerRequests&start=0&pageSize=30&aggregation=None&noCache=-1129797484
        Query: Tag:TrackerRequests
        Time: 2 ms
        Index: Raven/DocumentsByEntityName
        Results: 3 returned out of 3 total.

By the way, you might note that this ISN’T related in any way to his issue. This query (and document) came from the Studio. I can tell by the URL.

Then there was this:


I mean, seriously, I am happy to provide support, even if you aren’t a customer yet, but don’t give me some random bit of information that has absolutely nothing to do with the problem at hand and expect me to guess what the issue is.

Relevant information like the stack trace, what build you are on, what classes are involved, etc. is expected.



Design patterns in the test of time: Command

The command pattern is a behavioral design pattern in which an object is used to represent and encapsulate all the information needed to call a method at a later time.

More about this pattern.

I adore this pattern. If this pattern had a paypal account, I would donate it money on a regular basis.

In general, the notion of encapsulating the method call into an object (like the functor in C++) is an incredibly powerful idea, because it separates selecting what to invoke from when to invoke it. Commands are used pretty much everywhere; WPF is probably the most obvious place, because it actually has the notion of Command as a base class that you are supposed to be using.

Other variations, like encapsulating a bunch of code to be executed later (job / task), or just being able to isolate a complex behavior into its own object, are also very useful. I base quite a lot of my architectural advice on the notion that you can decompose a system into a series of commands that you can compose and shuffle at will.

Recommendation: Use it. Often. In fact, if you go so far as to say that the only reason we have classes is to have a nice vehicle for creating commands, you wouldn’t be going far enough.

Okay, I am kidding, but I really like this pattern, and it is a useful one quite often. The thing that you want to watch out for is commands that are too granular. An IncrementAgeCommand that is basically wrapping Age++ is probably too much, for example. Commands are supposed to be doing something meaningful from the scope of the entire application.

Production issue: ASP.Net Cache kills the application

In one of our production deployments, we occasionally get a complete server process crash. Investigating the event log, we have this:

Exception: System.InvalidOperationException
Message: Collection was modified; enumeration operation may not execute.
StackTrace:
   at System.Collections.Generic.Dictionary`2.KeyCollection.Enumerator.MoveNext()
   at System.Web.Hosting.ObjectCacheHost.TrimCache(Int32 percent)
   at System.Web.Hosting.HostingEnvironment.TrimCache(Int32 percent)
   at System.Web.Hosting.HostingEnvironment.TrimCache(Int32 percent)
   at System.Web.Hosting.ApplicationManager.TrimCaches(Int32 percent)
   at System.Web.Hosting.CacheManager.CollectInfrequently(Int64 privateBytes)
   at System.Web.Hosting.CacheManager.PBytesMonitorThread(Object state)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
   at System.Threading._TimerCallback.PerformTimerCallback(Object state)

As you can see, this is a case of what appears to be a run of the mill race condition, translated to a process killing exception because it was thrown from a separate thread.

This thread, by the way, is the ASP.Net Cache cleanup thread, and we have no control whatsoever over that. To make things worse, this application doesn’t even use the ASP.NET Cache in any way shape or form.

Any ideas how to resolve this would be very welcome.



Design patterns in the test of time: Chain of responsibility

The chain-of-responsibility pattern is a design pattern consisting of a source of command objects and a series of processing objects. Each processing object contains logic that defines the types of command objects that it can handle; the rest are passed to the next processing object in the chain. A mechanism also exists for adding new processing objects to the end of this chain.

More about this pattern.

It is actually quite common to see this pattern nowadays implemented using events. Something like CancelEventArgs and a CancelEventHandler handling the FormClosing event of a Form.

We use Chain of Responsibility in RavenDB in several places, like this one:

foreach (var requestResponderLazy in currentDatabase.Value.RequestResponders)
{
    var requestResponder = requestResponderLazy.Value;
    if (requestResponder.WillRespond(ctx))
    {
        // dispatch to the first responder that will handle this request
        var sp = Stopwatch.StartNew();
        requestResponder.Respond(ctx);
        sp.Stop();
        ctx.Response.AddHeader("Temp-Request-Time", sp.ElapsedMilliseconds.ToString("#,# ms", CultureInfo.InvariantCulture));
        return requestResponder.IsUserInterfaceRequest;
    }
}

Note that we have moved on to the behavioral patterns, and those tend to have withstood the test of time much better, in general.

Other places where Chain of Responsibility is used are request routing and error handling. A common approach is to also implement this with delegation, where I handle what I can and pass on to the next object when I don’t know how to handle a request.

Recommendation: This is still a very useful pattern. One thing to note is that it is effectively an O(N) operation with respect to the number of items in the chain that you have. As usual, do not overuse, but it is a really nice pattern.

Design patterns in the test of time: Proxy

A proxy, in its most general form, is a class functioning as an interface to something else. The proxy could interface to anything: a network connection, a large object in memory, a file, or some other resource that is expensive or impossible to duplicate.

More about this pattern.

Proxies are just about everywhere. Whenever you use NHibernate, WCF or Remoting – you are using proxies. In fact, proxies are such a success that they are literally baked into both the language and the platform. In .NET we have TransparentProxy, and Java has java.lang.reflect.Proxy.

At any rate, proxies are really useful, especially when you think about dynamic proxies. I am far less fond of static proxies, although we use them as well. Dynamic proxies are quite useful to add behavior, especially cross cutting behavior, at very little cost.

That said, one of the major issues that arises from using proxies is an inherent assumption that the proxy is the same as its target. Commonly you see this happening with remote proxies, where it isn’t obvious that actually making the call is expensive as hell.

Proxies also tend to be used mostly for the infrastructure of your application, rather than for actual application code. In particular, business logic, rather than cross cutting concerns, is really hard to figure out / debug when you have it spread around in proxies.

Recommendation: Use it for infrastructure or cross cutting concerns. Try doing so using dynamic proxies, rather than by generating proxies by hand. Avoid putting business logic there and be aware that by using proxies you are hiding what is going on (that is pretty much the point), so doing non obvious things should be avoided.
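As a concrete example, here is a sketch of a cross-cutting interceptor using Castle DynamicProxy (my choice of library for the sketch; the post doesn’t prescribe one, and `IOrderService` is a stand-in for your own service interface):

```csharp
using System;
using Castle.DynamicProxy;

// A logging interceptor: cross-cutting behavior added via a dynamic proxy,
// without touching the target class at all.
public class LoggingInterceptor : IInterceptor
{
    public void Intercept(IInvocation invocation)
    {
        Console.WriteLine("Calling " + invocation.Method.Name);
        invocation.Proceed(); // forward the call to the real target
    }
}

// Usage (sketch):
// var proxy = new ProxyGenerator()
//     .CreateInterfaceProxyWithTarget<IOrderService>(realService, new LoggingInterceptor());
```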

Design patterns in the test of time: Flyweight

A flyweight is an object that minimizes memory use by sharing as much data as possible with other similar objects; it is a way to use objects in large numbers when a simple repeated representation would use an unacceptable amount of memory.

More about this pattern.

On the face of it, this pattern looks very much like something from the old ages. And indeed, most implementations of Flyweight are actually focused deeply on low memory conditions. I would actually argue that you need to consider very carefully whether you want to do that.

That said, it is actually used fairly often in high performance places. In the .NET framework, the notion of string interning is one way to get flyweights (although the problem is that you need to start with a string to get the interned string, which sort of messes things up). In both the profilers and in RavenDB, we have used variations on the Flyweight pattern.
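For example, interning collapses equal strings down to a single shared instance:

```csharp
using System;

// Two distinct string instances with equal contents...
string a = new string("raven".ToCharArray());
string b = new string("raven".ToCharArray());
Console.WriteLine(ReferenceEquals(a, b)); // False - two separate instances

// ...but interning both yields the same shared (flyweight) instance.
Console.WriteLine(ReferenceEquals(string.Intern(a), string.Intern(b))); // True
```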

In the profiler, we are mostly dealing with parsing data from the profiled system, and that means doing a lot of reading from a stream and creating objects. That created an unacceptable memory pressure on the system. We implemented a fairly complex system where we can read from the stream into a buffer, then get or create the string from it. We contributed the implementation back to the Protocol Buffers project. You can see the code here.

In RavenDB, we deal a lot with documents, and many times we find it useful to be caching a document. The problem with doing that is that you need to return something from the cache, which means that you have to return something mutable. Instead of copying all of the data all the time, the internal RavenDB data structures support copy-on-write semantics, which means that we can easily create clones at basically no cost.

Recommendation: If you are in perf optimization mode, and you worry about memory pressure, consider using this. Otherwise, like all optimizations, it should be left alone until you have profiler results that say you should consider it.

Midbook pattern crisis: Do NOT reach for that pattern

From a question in the mailing list:

How do you “rollback” an entity state, supposing the following scenario?

public JsonResult Reschedule(string id, DateTime anotherDate)
{
    try
    {
        var dinner = session.Load<Dinner>(id);
        // ... (rescheduling logic elided in the original message)
        return Json(new { Message = "Bon apetit!" });
    }
    catch (DinnerConcurrencyException ex)
    {
        return Json(new { Message = ex.Message });
    }
}

// Base controller
protected void OnActionExecuted(ActionExecutedContext filterContext)
{
    if (filterContext.Exception != null)
        return;
    session.SaveChanges(); // (restored from context; elided in the original)
}

The problem that I have is when the schedulerService throws a DinnerConcurrencyException, it is caught at the controller.

After all, the OnActionExecuted will call SaveChanges and persist the dinner with an invalid state.

The next post was:

I have already tried to use Memento Pattern like this:

var dinner = session.Load<Dinner>(id);
dinner.SaveState();
try
{
    // ... (rescheduling logic elided in the original message)
    return Json(new { Message = "Bon apetit!" });
}
catch (DinnerConcurrencyException ex)
{
    dinner = dinner.RestoreState<Dinner>();
    return Json(new { Message = ex.Message });
}

But it didn't work, because I think Raven has proxied my dinner instance.

And here is the Memento implementation:

public abstract class Entity
{
    MemoryStream stream = new MemoryStream();

    public void SaveState()
    {
        new BinaryFormatter().Serialize(stream, this);
    }

    public T RestoreState<T>()
    {
        stream.Seek(0, SeekOrigin.Begin);
        object o = new BinaryFormatter().Deserialize(stream);

        return (T)o;
    }
}
You might notice that this creates a new instance when you call RestoreState, and that has no impact whatsoever on the actual instance that is managed by RavenDB.

The suggested solution?

catch (DinnerConcurrencyException ex)
{
    SkipCallingSaveChanges = true;
    return Json(new { Message = ex.Message });
}

No need for patterns here.

Midbook patterns crisis: Façade is and ain’t

I decided to respond to some of the comments on the blog in a full-blown post, because they are quite important.

  • View Model is not a façade. It doesn’t allow any operation, and it doesn’t simplify things. It merely gives you easy access to aggregate information.
  • No, every class is not a Façade. I don’t care if “it provides a simplified interface to a larger body of code”. If everything is a façade, then façade is meaningless.
  • No, Web Request is not a façade. Sure, it hides a lot of details about TCP, but that isn’t its point. Web Request implements the HTTP spec, and that isn’t a façade at all.
  • Web Client / Tcp Client / Tcp Listener are façades – they drastically simplify how to work with sockets. Good façades, too, because when you need to do more, they don’t block you.
  • NHibernate ain’t a façade – it hides ADO.NET, sure, but it also does a lot more than just that.

The whole point of this series of posts was not to talk about patterns. If you want to do that, go ahead and read the books. My attempt is to go and talk about how the patterns are actually used. In practice, in real code.

Sure, in the real world, patterns aren’t used as per their textbook definition. That doesn’t mean that we should keep staring at the textbook definition. It means that we need to look at how they are actually used! I am not writing a series of posts talking about the problems in the patterns, and that is important. I am writing about the problems in how people use them.

Design patterns in the test of time: Façade

A façade is an object that provides a simplified interface to a larger body of code, such as a class library.

More about this pattern.

The intent of the façade was good: to wrap APIs that are poorly designed in something shiny and nice.

In the real world, however, it is an evil pattern that is used to needlessly add abstractions for no particular reason. For example, look at this article.

[Diagrams from that article: “Business Facade - Detailed View” and “Data Access - Detailed View”]

That sound you just heard is your architecture, it is hiding in the closet, weeping silent tears about emotional and physical abuse that just doesn’t end.

I have yet to see a real case where façade was actually used properly. In most cases, people built a façade because That Is How We Do Things. And because of that, they ended up with things like the one above. It adds exactly nothing, and it horrifyingly complicates the code.

Recommendation: Avoid this, you really don’t need to do this most of the time, and most implementations are bad.

RavenDB Customers Stories

I just got this from a customer, and I think that it is a great story:

Hi Oren,

I thought you might be interested in a little success story!

At the beginning of the year we began pre-financing a pepper factory in the [redacted]. In order to control the money that we were sending to the factory I created a little Excel spread sheet to keep a record of the payments that we were making. However, some payments were made directly from our head office to the factory, and some were made to us in [redacted] which we in turn transferred to the factory. So I added a few more columns to the spread sheet to record this.

However, the repayment calculations are different depending on whether we make the payment in [redacted] or whether it comes from head office. So I added a few more columns to the spread sheet to calculate this. When the money arrives in [redacted], we need to sell the USD at the Central Bank and convert them into [redacted]. So I added a few more columns to the spread sheet to record this.

However, some of the payments that we are transferring to the factory are from money that we borrow locally in [redacted]. So I added a few more columns to record this.  The spread sheet is getting quite big now – we’re at about 30 columns and there’s a lot of horizontal scrolling going on. However, we’ve only done about 10 operations, so there are only 10 rows. No big deal if we have to scroll around a bit.

Now the factory has started to receive the product which we paid for in advance. Good idea if I add a column to record this. Shit – every five minutes, someone comes to me and asks me to add another column for something or other.

OK – these are all distinct operations. When the factory starts to ship out the goods, the first page of the spread sheet is at 45 columns, so I now create a new page in the spread sheet to control the shipments as we receive them. Hmm – another 30 columns appear – of which about a third of them are just references to the first page, and the rest new data.

Now I realize that it would be nice to control our margin on the entire operation. The spread sheet is shared on SkyDrive so that our offices in [redacted] and [redacted], our accountants, our lawyers, the factory and the freight forwarder can see what is going on. However, I can’t allow all of these people to see information about our margins, so I create a new hidden page in the spread sheet.

Boy are we successful – we’ve now done 50 operations. Can’t see the column or row headers any more. No worries, just freeze the columns/headers. Wait a minute – who the hell copied and pasted the formulas to the new rows? The numbers aren’t adding up any more. Duh – why did you put the number for the currency contract in against contract 1234 – it should have gone against 1235??

Now, I don’t want to touch the spread sheet any more for fear of breaking something.

Business is great. We’ve finally nailed how to run pre financing operations for pepper factories in [redacted]. We decide to expand our business, and start working with another factory. Since the data in the spread sheet has to be visible to the factory, I create a new spread sheet. However, some of the requirements are slightly different, so it has different columns to the first one.

Our spread sheet is now 85 columns and 60 rows, and the other one is 78 columns and 7 rows.  Since the second spread sheet “looks” almost identical to the first spread sheet – yup – you guessed it - someone puts information into the wrong spread sheet!

Crisis point – we’ve got about [redacted, but in the millions US$] out in the field, and we can’t work out where the hell the money is – are the goods with the farmer, are they at the factory, are they at the port, are they already on board the vessel,  has the vessel already left? Have we paid for them, if not, when do we need to pay for them?

All the data is in the spread sheet, but only one person in the entire company really understands it  (I just wrote it – I don’t put the data in). So if I want to know something it’s a scream over the desk – “Jonathan where the **** are the goods we were supposed to ship last week?” Jonathan goes on holiday for three weeks.

[redacted] pepper production stops, while Jonathan is going up the Eiffel Tower in Paris. Even at this point – if you can believe this – I was waiting for him to come back so that we could continue operations.

I’ve had enough of this – RavenDB to the rescue! Two – let me repeat that – TWO - hours later, I’ve written the scripts to read the spread sheet(s) and throw them into one document – I’ll repeat that again – ONE document. 

The document is nicely partitioned into logical groups – i.e. classes for Payments, Currency contracts, Shipments etc. There is also a separate class that only does the calculations between the different logical groups. But all of this is stored in one document only. Since the calculations are all obviously read only, the results get persisted to the document.

Two indexes which give me all the information I need to control where the money is, where the stocks are. Three days to write the front end – it only took that long because I’m still learning MVC and javascript (I really wanted to use Backbone, but just couldn’t get my head around it).

In the first analysis of the new application I found [redacted, hundreds of thousands US$] that we had been paid 30 days ago, that we had not sent to the factory, and was lying at the bank!

Now this is where it gets pretty amazing (for me at least). Our data is nicely partitioned, four pages that are simply different views onto the same data. The guys are getting all excited – hey I want to control whether we paid the lawyer etc. Someone notices that some of the calculations are not correct – so much for my edge case tests!

So I add a couple of new properties to the document, create a three line foreach loop to read all the data in the database and save it straight back again, and voila – I’ve “patched” the database. Migrations, shmigrations  - who needs them! All the calculations are updated and stored back in the database, the new properties all appear. And this is all on live data. The minute I’d got the raw data from the spread sheet into the database we stopped using Excel.

And that made my day.



Design patterns in the test of time: Decorator

The decorator pattern is a design pattern that allows behaviour to be added to an existing object dynamically.

More about this pattern.

I don’t have a lot to say about this pattern.

(Hipster Ariel meme: “Decorator? Used it before it was cool”)

The decorator pattern is just as useful today as when it was first introduced, if not more so. It forms a core part of many critical functions of your day-to-day software.

The most obvious example is the notion of Stream, where you often decorate a stream (buffered stream, compressing stream, etc.). This example is valid not just for the .NET framework; I can’t think of a single major framework that doesn’t use the Stream abstraction for its IO.
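The stream case is the same in Java, for instance, where the standard library decorators compose by wrapping. A small sketch (the helper class here is mine, but the stream classes are the standard `java.io` / `java.util.zip` ones):

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class StreamDecorators {
    // Buffering and compression are layered on by wrapping one stream in
    // another; each decorator is itself an OutputStream / InputStream.
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new BufferedOutputStream(new GZIPOutputStream(sink))) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    static byte[] decompress(byte[] data) {
        try (InputStream in = new GZIPInputStream(
                new BufferedInputStream(new ByteArrayInputStream(data)))) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that the caller never cares how many layers of decoration are in play; it just writes to a stream.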

Then again, the cautionary side, people try to use it in… strange ways:

    // Minimal base class, assumed (but not shown) by the original sample
    public abstract class Beverage
    {
        public abstract string Description { get; }
        public abstract double Cost();
    }

    public abstract class CondimentDecorator : Beverage { }

    public class Mocha : CondimentDecorator
    {
        private readonly Beverage m_beverage;

        public Mocha(Beverage beverage)
        {
            m_beverage = beverage;
        }

        // Delegates to the wrapped beverage, appending this condiment
        public override string Description
        {
            get { return m_beverage.Description + ", Mocha"; }
        }

        public override double Cost()
        {
            return 0.20 + m_beverage.Cost();
        }
    }

Something like this, for example, is a pretty bad way of doing things. It assumes a very fixed model of looking at the universe, which is just not true. Decorator works best when you have just a single I/O channel. When you have multiple inputs & outputs, decorating something becomes much harder.

In particular, implementing business logic like the one above in decorators makes it very hard to figure out why things are happening. CachingDecorator is something you want to avoid especially (better to use infrastructure or an auto-caching proxy instead).

Recommendation: It is a very useful pattern, and should be used when you actually have a single input / output channel, because that is a great way to dynamically compose the processing applied to it.

Good indie Military Sci Fi

I recently began to read the “indies” in Amazon’s listings, and I wanted to point out some things that I really enjoyed:

The David Birkenhead series:


If I have one complaint about this series, it is that the books are fairly short, typically less than 200 pages. That said, they are very well written, and the main protagonist is likable almost from the get-go. I just noticed that the latest (Commander) came out, and I read it in one sitting.

It is good, really good, military Sci Fi. It is believable, interesting and in general, a lot of fun.

The Admiral Who series:


In contrast to the Birkenhead series, no one can complain that those books are short. Each comes at around 500 pages or so, and they are filled with a lot of really good content.

This is a reluctant hero story, and it reminded me strongly of Mat from the Wheel of Time, just in space.


The problem with most indie content on Amazon is that it is frequently poorly edited. In both cases, however, the editing is pretty good (not perfect, but good enough that it doesn’t distract from the story). And the stories more than compensate for that.

Just to give you some idea about how good they are, I am currently re-reading those books, and that is an honor that many professionally produced books just don’t get.

The Birkenhead series is supposed to have another book in late October (my thought: it is already late October, any later and it is November!), and while there isn’t a release date for the Admiral Who series, I am looking forward to both eagerly.



Design patterns in the test of time: Composite

The composite pattern describes that a group of objects are to be treated in the same way as a single instance of an object. The intent of a composite is to "compose" objects into tree structures to represent part-whole hierarchies. Implementing the composite pattern lets clients treat individual objects and compositions uniformly.

More on this pattern.

This post is supposed to go out on the 1st of November or thereabouts, which means that I’m going to take a small detour into politics. Here is an example modeled on the upcoming presidential elections in the States.

    public interface IVotingUnit
    {
        int Weight { get; set; }
        int CandidateId { get; set; }
    }

    public class Voter : IVotingUnit
    {
        [Obsolete("Racist")]
        public string Id { get; set; }

        public int Weight { get; set; }
        public int CandidateId { get; set; }
    }

    public class WinnerTakesAllState : IVotingUnit
    {
        public string StateCode { get; set; }

        public int Weight { get; set; }
        public int CandidateId { get; set; }

        public void AddVote(Voter vote)
        {
            // tally the vote; the state then reports a single winner
        }
    }

Yes, this is probably a bad example of the pattern, but I find it hilarious. The basic idea is that you can have an object that represents many other objects, but exposes the same external interface.

Composites are usually found in the more CS style of programming. They are very common in tree structures, such as a compiler AST or the DOM. Beyond that, I can’t easily recall any real world usage of them in my applications. There are usually better alternatives, and you have to remember that in enterprise applications, loading / storing the data is just as important. Trying to implement the composite pattern under those circumstances will lead to a world of hurt.
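Where a tree structure really is the domain, the pattern is straightforward. A minimal Java sketch, keeping the voting theme (all names here are mine, not from the post above):

```java
import java.util.ArrayList;
import java.util.List;

// The shared interface: a single ballot and a whole district answer
// the same question.
interface VoteCount {
    int total();
}

// Leaf: one ballot.
class Ballot implements VoteCount {
    final int votes;
    Ballot(int votes) { this.votes = votes; }
    public int total() { return votes; }
}

// Composite: a district holds ballots and/or other districts, but
// exposes the same total() as a single ballot.
class District implements VoteCount {
    private final List<VoteCount> children = new ArrayList<>();
    District add(VoteCount child) { children.add(child); return this; }
    public int total() {
        int sum = 0;
        for (VoteCount c : children) sum += c.total(); // recurses down the tree
        return sum;
    }
}
```

The caller sums a nested hierarchy without ever caring whether it is holding a leaf or a subtree, which is the whole point of the pattern.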

Recommendation: If your entire dataset can easily fit in memory, and it makes sense, go ahead. Most of the time, you probably should stay away.

Design patterns in the test of time: Bridge

The bridge pattern is a design pattern used in software engineering which is meant to "decouple an abstraction from its implementation so that the two can vary independently". The bridge uses encapsulation, aggregation, and can use inheritance to separate responsibilities into different classes.

More about this pattern.

Bridge is quite complex, mostly because it is composed of Abstraction, Refined Abstraction, Implementor and Concrete Implementor. Here is a concrete example (pun intended):

  • Abstraction: CImage
  • Refined Abstraction: CBmpImage, CJpegImage
  • Implementor: CImageImp
  • Concrete Implementor: CWinImp, COS2Imp
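Mapping that bullet list to a minimal Java sketch may make the structure clearer; the names below simply echo the list above (with a string standing in for actual drawing):

```java
// Implementor: the platform-specific drawing API.
interface ImageImp {
    String draw(String pixels);
}

// Concrete implementors: one per platform.
class WinImp implements ImageImp {
    public String draw(String pixels) { return "win:" + pixels; }
}

class Os2Imp implements ImageImp {
    public String draw(String pixels) { return "os2:" + pixels; }
}

// Abstraction: holds a reference to an implementor, so image formats
// and platforms can vary independently.
abstract class Image {
    protected final ImageImp imp;
    Image(ImageImp imp) { this.imp = imp; }
    abstract String render();
}

// Refined abstraction: a specific image format.
class BmpImage extends Image {
    BmpImage(ImageImp imp) { super(imp); }
    String render() { return imp.draw("bmp-pixels"); }
}
```

Four moving parts just to draw an image, which is exactly the complexity complained about below.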

In general, I don’t like complex things, and off the top of my head, I can’t think of a time when I used this approach. Until such time as I see a really good reason why I would want to do something like this, I see very little reason to bother.

Recommendation: Avoid; there are simpler options for solving this sort of problem.

Design patterns in the test of time: Adapter

In computer programming, the adapter pattern (often referred to as the wrapper pattern or simply a wrapper) is a design pattern that translates one interface for a class into a compatible interface.

More about this pattern.

This pattern is the first of the Structural Patterns in the GoF book, and for the most part it is meant to be used solely when you are integrating two or more external systems / libraries. When I see it used inside a single system, it is almost always a Bad Thing, mostly because inside a single system you want to use one of the other structural patterns.

HttpContextWrapper is a good example of using this pattern, and it links back nicely to the previous discussion on Singletons. HttpContext.Current was an issue, because it didn’t allow easy overriding / mocking / testing. The new HttpContextBase class was introduced, but due to backward compatibility concerns, HttpContext’s base class could not be changed. The solution: introduce the HttpContextWrapper implementation, which adapts HttpContext to HttpContextBase.
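The shape of that solution, sketched generically in Java (all the names here are hypothetical, chosen only to mirror the HttpContext story):

```java
// The class we cannot modify, for backward-compatibility reasons.
class LegacyContext {
    String userName() { return "jane"; }
}

// The new, mockable abstraction that new code should depend on.
abstract class ContextBase {
    abstract String user();
}

// The adapter: exposes the legacy class through the new interface.
class ContextWrapper extends ContextBase {
    private final LegacyContext inner;
    ContextWrapper(LegacyContext inner) { this.inner = inner; }
    String user() { return inner.userName(); } // translate the interface
}
```

New code takes a ContextBase, tests supply a fake, and production supplies the wrapper around the real legacy object.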

Useful pattern, but all too often people try to use it to abstract things, and that is most certainly not what it is meant for.

Recommendation: Use when you need to integrate with code that you can’t change (3rd party, legacy, compatibility concerns) using a common interface, but avoid otherwise.