You’ll pry transactions from my dead, cold, broken hands

time to read 6 min | 1153 words

“We tried using NoSQL, but we are moving to Relational Databases because they are easier…”

That was the gist of a conversation that I had with a client. I wasn’t quite sure what was going on there, so I invited myself to their offices and took a peek at the code. Their actual scenario is classified, so we will use the standard blog model to show a similar example. In this case, we have there entities, the BlogPost, the User and the Comment. What they wanted is to ensure is that when a user is commenting on a blog post, it will update the comments’ count on the blog post, update the posted comments count on the user and insert the new comment.

The catch was that they wanted the entire thing to be atomic, to either happen completely or not at all. The other catch was that they were using MongoDB. The code looked something like this:

public ActionResult AddComment(string postId, string userId, Comment comment)
{
    int state = 0;
    var blogPost = database.GetCollection<BlogPost>("BlogPosts").FindOneById(postId);
    var user = database.GetCollection<User>("Users").FindOneById(userId);
    try
    {
        database.GetCollection<Comment>("Comments").Save(comment);
        state = 1;

        blogPost.CommentsCount++;
        database.GetCollection<BlogPost>("BlogPosts").Save(blogPost);
        state = 2;

        user.PostecCommentsCount++;
        database.GetCollection<User>("Users").Save(user);
        state = 3;


        return Json(new {CommentAdded = true});
    }
    catch (Exception)
    {
         // if (state == 0) //nothing happened yet, don't need to do anything

        if (state >= 1)
        {
            database.GetCollection<Comment>("Comments").Remove(Query.EQ("_id", comment.Id), RemoveFlags.Single);
        }
        if (state >= 2)
        {
            blogPost.CommentsCount--;
            database.GetCollection<BlogPost>("BlogPosts").Save(blogPost);
        }
        if (state >= 3)
        {
            user.PostecCommentsCount--;
            database.GetCollection<User>("Users").Save(user);
        }

        throw;
    }
}

Take a moment or two to go over the code and figure out what was going on in there. It took me a while to really figure that one out.

Important, before I continue with this post, I feel that I need to explain what the problem is and why it is there. Put simply, MongoDB doesn’t support multi document transactions. The reason that it doesn’t support multi document transactions is that the way MongoDB auto sharding works, different documents may be on different shards, therefor requiring synchronization between different machines, which no one has managed to make scalable an efficient. MongoDB choose, for reasons of scalability and performance, to not implement this feature. This is document and well know part of the product.

It makes absolute sense, except that it leads to code like the one above, when the user really do want to have atomic multi document writes. Just to be certain that the point has been hammered home. The code above still does not ensures atomic multi document writes. For example, if the server shuts down between immediately after setting state to 2, there is nothing that the code can do to revert the previous writes (after all, they can’t contact the server to tell it that it to revert them).

And there are other problems with this approach, the code is ugly, and it is extremely brittle. It is very easy to update one part and not the other… but at this point I think that I am busy explaining why horse excrement isn’t suitable for gourmet food.

The major problem with this code is that it is trying to do something that the underlying database doesn’t support. I sat down with the customer and talked about the advantages and disadvantages of staying with a document database vs. moving to a relational database. A relational database would handle atomic multi row writes easily, but would require many reads and many joins to show a single page.

That was the point where I put the disclaimer “I am speaking about my own product, and I am likely biased, be aware of that”.

The same code in RavenDB would be:

public ActionResult AddComment(string postId, string userId, Comment comment)
{
    using(var session = documentStore.OpenSession())
    {
        session.Save(comment);
        session.Load<BlogPost>(postId).CommentsCount++;
        session.Load<User>(userId).PostedCommentCount++;

        session.SaveChanges(); // Atomic, either all are saved or none are
    }
    
    return Json(new { CommentAdded = true });

}

There are a couple of things to note here:

  • RavenDB supports atomic multi document writes without anything required.
  • This isn’t the best RavenDB code, ideally I wouldn’t have to create the session here, but in the infrastructure, but you get the point.

We also support change tracking for loaded entities, so we didn’t even need to tell it to save the loaded instances. All in all, I also think that the code is prettier, easier to follow and would produce correct results in the case of an error.