Ayende @ Rahien

It's a girl

What is RavenDB’s competitive advantage?

Originally posted at 11/27/2010

Put simply, it is not this:

The main idea is that RavenDB is a NoSQL database that aims unabashedly to be something that you can just pick up and use, and the database will take care of things like that for you, because for the most part, you can trust the defaults.

That ranges from having a good story for the Client API, having a Linq provider, and choosing sensible defaults, to working very hard on making sure that everything fits together (things like ad hoc queries, for example).

RavenDB Multi Tenancy

Originally posted at 11/27/2010

One of the features that people asked for RavenDB is the notion of multi tenancy. The idea is that I can (easily) create a new database (there is the RavenDB Server, and the server contains Databases, which contain indexes and documents) for each tenant. As you can imagine from the feature name, while it is actually an implementation of several databases on the same server, the usage goal is to have this for different tenants. As such, the RavenDB implementation is aimed to handle that exact scenario.

As such, it is expected that some users will have hundreds / thousands of databases on a single instance (think shared hosting or hosted raven). In order to support this, we need to have a good handle on our resources.

RavenDB handles this scenario by loading databases on demand, and unloading them again when they aren’t used for a long enough period. Using this approach, the only databases that consume resources are the ones actually being used.

In order to use the multi tenancy features, all you need to do is call:

documentStore.DatabaseCommands.EnsureDatabaseExists("Northwind");

var northwindSession = documentStore.OpenSession("Northwind");

You can now work in a separate database from all other databases, with no potential for data to leak from one tenant to another. Creating a database on the fly is a very cheap operation as well, so you can create as many of them as you want.
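
To make the isolation concrete, here is a minimal sketch; the Customer class and the second database name are made up for illustration, only EnsureDatabaseExists and OpenSession come from the snippet above:

// Illustrative only: Customer and "AdventureWorks" are not from the original post.
documentStore.DatabaseCommands.EnsureDatabaseExists("AdventureWorks");

using (var northwind = documentStore.OpenSession("Northwind"))
using (var adventureWorks = documentStore.OpenSession("AdventureWorks"))
{
    northwind.Store(new Customer { Name = "ACME" });
    northwind.SaveChanges();

    // The new document lives only in the Northwind database;
    // queries issued through the AdventureWorks session will never see it.
}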

The least evil choice

Originally posted at 11/27/2010

A while ago I had to make a decision regarding how to approach building the multi tenancy feature for RavenDB. Leaving aside the actual multi tenancy approach, we had an issue in how to allow access to it.

We have the following options for accessing the northwind database:

  • /northwind/docs - breaking change, obvious, less work
  • /docs?database=northwind - non breaking change, not so obvious, more work

What choice would you take?

If you know what choice I made for RavenDB, please don’t answer this post.

What is Uber Prof’s competitive advantage?

Originally posted at 11/25/2010

In a recent post, I discussed the notion of competitive advantage and how you should play around them. In this post, I am going to focus on Uber Prof. Just to clarify, when I am talking about Uber Prof, I am talking about NHibernate Profiler, Entity Framework Profiler, Linq to SQL Profiler, Hibernate Profiler and LLBLGen Profiler. Uber Prof is just a handle for me to use to talk about each of those.

So, what is the major competitive advantage that I see in the Uber Prof line of products?

Put very simply, they focus very heavily on the developer’s point of view.

Other profilers will give you the SQL that is being executed, but Uber Prof will show you the SQL and:

  • Format that SQL in a way that makes it easy to read.
  • Group the SQL statements into sessions, which lets the developer look at what is going on within a natural boundary.
  • Associate each query with the exact line of code that executed it.
  • Provide the developer with guidance about improving their code.

There are other features, of course, but those are the core ones that make Uber Prof what it is.

The smallest bugs, the biggest problems – Part II

Originally posted at 11/22/2010

In a previous post, I talked about how I found the following (really nasty) bug in RavenDB’s managed storage (which is still considered unstable, btw):

When deleting documents in a database that contains more than 2 documents, and  the document(s) deleted are deleted in a certain order, RavenDB would go into 100% CPU. The server would still function, but it would always think that it had work to do, even if it didn’t have any.

Now, I want to talk about the actual bug.

[image: the TryRemove method, shown in full in the “Find the bug: A broken tree” post below]

What I did wrong here is reusing the removed and value out parameters in the second call to TryRemove. That call is internal, and is only needed to properly balance the tree, but what it ended up doing is always returning the removed/value results from the right side of the tree.
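
For context, here is a minimal sketch of what the corrected successor-removal branch would look like, assuming the TryRemove signature shown in the “Find the bug” post below; the point is simply that the internal call must not overwrite the caller’s out parameters:

// Sketch of the fix: use throwaway variables for the internal removal,
// so the removed/value reported to the caller stay those of the matched node.
bool ignoredRemoved;
TValue ignoredValue;
result = new AVLTree<TKey, TValue>(comparer, deepCopyKey, deepCopyValue,
    successor.Key, successor.Value, Left,
    Right.TryRemove(successor.Key, out ignoredRemoved, out ignoredValue));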

Compounding the problem is that I only actually used the TryRemove value in a single location, and even then, it is a mistake. Take a look:

[image: the single call site that consumed the TryRemove out values]

That meant that I actually looked for the problem in the secondary indexes for a while, before realizing that the actual problem was elsewhere.


Find the bug: A broken tree

Originally posted at 11/22/2010

This method has a bug, a very subtle one. Can you figure it out?

public IBinarySearchTree<TKey, TValue> TryRemove(TKey key, out bool removed, out TValue value)
{
    IBinarySearchTree<TKey, TValue> result;
    int compare = comparer.Compare(key, theKey);
    if (compare == 0)
    {
        removed = true;
        value = theValue;
        // We have a match. If this is a leaf, just remove it 
        // by returning Empty.  If we have only one child,
        // replace the node with the child.
        if (Right.IsEmpty && Left.IsEmpty)
            result = new EmptyAVLTree<TKey, TValue>(comparer, deepCopyKey, deepCopyValue);
        else if (Right.IsEmpty && !Left.IsEmpty)
            result = Left;
        else if (!Right.IsEmpty && Left.IsEmpty)
            result = Right;
        else
        {
            // We have two children. Remove the next-highest node and replace
            // this node with it.
            IBinarySearchTree<TKey, TValue> successor = Right;
            while (!successor.Left.IsEmpty)
                successor = successor.Left;
            result = new AVLTree<TKey, TValue>(comparer, deepCopyKey, deepCopyValue, successor.Key, 
                successor.Value, Left, Right.TryRemove(successor.Key, out removed, out value));
        }
    }
    else if (compare < 0)
        result = new AVLTree<TKey, TValue>(comparer, deepCopyKey, deepCopyValue, 
            theKey, theValue, Left.TryRemove(key, out removed, out value), Right);
    else
        result = new AVLTree<TKey, TValue>(comparer, deepCopyKey, deepCopyValue,
            theKey, theValue, Left, Right.TryRemove(key, out removed, out value));
    return MakeBalanced(result);
}


Dallas Days of .NET – March 4-5

I am going to be at Dallas Days of .NET in March next year. You can use the following link to get a discount if you order now: http://jointechies.eventbrite.com/?discount=OrenEini

This is going to be an interesting event, because there is one track in which I am going to be doing every other talk for 2 days. This is going to give me a wide enough scope to cover just about every topic that I am interested in, including some time to go in depth into several topics that I usually only have the chance to skim.


Your design should be focused on your competitive advantage

Yesterday I had an interesting talk with a friend about being a Micro ISV. I am not sure how good a source I am for advice in the matter, but I did have some. Including one that I think is good enough to talk about here.

Currently my company has two products, Uber Prof and RavenDB.

Both of them came into a market that already had strong competitors.

In the case of Uber Prof, I am competing with SQL Profiler, which is “free” (you get that with SQL Server) and the Huagati suite of profilers which are significantly cheaper than Uber Prof. In the case of RavenDB, MongoDB and CouchDB already had the mindshare, and they are both free as in beer and as in speech.

One of the decisions that you have to be aware of when creating your product is what are the products that people are going to compare you to. It doesn’t really matter whether that is an accurate comparison or whether they are comparing apples to camels, but you will be compared to them.

And early on, you have to decide what your answer is going to be when someone asks you “why should I use your stuff instead of XYZ?”.

Here is a general rule of thumb: you never want to answer “because it is cheaper than XYZ”. Pricing has a lot of implications, some of which directly affect the perceived quality of your product. It is perfectly fine to point out that your product has a much cheaper TCO, though, because then you are increasing the value of your product, not reducing it.

But that is general advice that you can get anywhere. My point here is somewhat different. Once you decide what you are doing with your product that gives you a good answer for that question, you have defined your competitive advantage. That thing that will make people choose your stuff over everyone else.

Remember, competing on pricing is a losing proposition – and the pun is fully intended here!

But once you have the notion of what your competitive advantage is going to be, you have to design your product around that. In essence, that competitive advantage is going to be the thing that you are going to work on. Every decision that you have is going to have to be judged in light of the goal of increasing your competitive advantage.

Can you try to guess what I define as competitive advantages for Uber Prof and Raven DB?

Enough is enough: iTunes got to go

Here is the story, the only reason that I am using iTunes is because I want to sync books that I buy from audible.com to my iPhone.

I am still fighting this problem. I have installed / uninstalled, danced the mambo and even tried some chicken sacrifice on the last full moon. Nothing helps. Oh, it will work once, immediately after I install it, but on the next reboot it will show the same error.

Right now I have uninstalled iTunes from my system, and I am currently building a VM specifically so I will be able to sync new audiobooks to my iPhone. I think that this is insane.

Anyone got a better option than that?

The smallest bugs, the biggest problems – Part I

We had the following (really nasty) bug in RavenDB’s managed storage (which is still considered unstable, btw):

When deleting documents in a database that contains more than 2 documents, and  the document(s) deleted are deleted in a certain order, RavenDB would go into 100% CPU. The server would still function, but it would always think that it had work to do, even if it didn’t have any.

To call this annoying is an understatement. To understand the bug I have to explain a bit about how RavenDB uses Munin, the managed storage engine. Munin gives you the notion of a primary key (which can be any JSON tuple) and secondary indexes. As expected, the PK is unique, but secondary indexes can contain duplicate values.

The problem that we had was that for some reason, removing values from the table wouldn’t remove them from the secondary indexes. That drove me crazy. At first, I tried to debug the problem by running the following unit test:

public class CanHandleDocumentRemoval : LocalClientTest
{
    [Fact]
    public void CanHandleDocumentDeletion()
    {
        using(var store = NewDocumentStore())
        {
            using(var session = store.OpenSession())
            {
                for (int i = 0; i < 3; i++)
                {
                    session.Store(new User
                    {
                        Name = "ayende"
                    });
                }
                session.SaveChanges();
            }
         
            using (var session = store.OpenSession())
            {
                var users = session.Query<User>("Raven/DocumentsByEntityName")
                    .Customize(x => x.WaitForNonStaleResults())
                    .ToArray();
                Assert.NotEmpty(users);
                foreach (var user in users)
                {
                    session.Delete(user);
                }
                session.SaveChanges();
            }
           
            using (var session = store.OpenSession())
            {
                var users = session.Query<User>("Raven/DocumentsByEntityName")
                    .Customize(x => x.WaitForNonStaleResults(TimeSpan.FromSeconds(5)))
                    .ToArray();
                Assert.Empty(users);
            }
        }
    }
}

But, while this reproduced the problem, it was very hard to debug properly. Mostly, because this executes the entire RavenDB server, which means that I had to deal with such things as concurrency, multiple operations, etc.

After a while, it became clear that I wouldn’t be able to understand the root cause of the problem from that test, so I decided to take a different route. I started to add logging in the places where I thought the problem was, and then I turned that log into a test all of its own.

[Fact]
public void CanProperlyHandleDeletingThreeItemsBothFromPK_And_SecondaryIndexes()
{
    var cmds = new[]
    {
        @"{""Cmd"":""Put"",""Key"":{""index"":""Raven/DocumentsByEntityName"",""id"":""AAAAAAAAAAEAAAAAAAAABQ=="",""time"":""\/Date(1290420997504)\/"",""type"":""Raven.Database.Tasks.RemoveFromIndexTask"",""mergable"":true},""TableId"":9,""TxId"":""NiAAMOT72EC/We7rnZS/Fw==""}",
        @"{""Cmd"":""Put"",""Key"":{""index"":""Raven/DocumentsByEntityName"",""id"":""AAAAAAAAAAEAAAAAAAAABg=="",""time"":""\/Date(1290420997509)\/"",""type"":""Raven.Database.Tasks.RemoveFromIndexTask"",""mergable"":true},""TableId"":9,""TxId"":""NiAAMOT72EC/We7rnZS/Fw==""}",
        @"{""Cmd"":""Put"",""Key"":{""index"":""Raven/DocumentsByEntityName"",""id"":""AAAAAAAAAAEAAAAAAAAABw=="",""time"":""\/Date(1290420997509)\/"",""type"":""Raven.Database.Tasks.RemoveFromIndexTask"",""mergable"":true},""TableId"":9,""TxId"":""NiAAMOT72EC/We7rnZS/Fw==""}",
        @"{""Cmd"":""Commit"",""TableId"":9,""TxId"":""NiAAMOT72EC/We7rnZS/Fw==""}",
        @"{""Cmd"":""Del"",""Key"":{""index"":""Raven/DocumentsByEntityName"",""id"":""AAAAAAAAAAEAAAAAAAAABg=="",""time"":""\/Date(1290420997509)\/"",""type"":""Raven.Database.Tasks.RemoveFromIndexTask"",""mergable"":true},""TableId"":9,""TxId"":""wM3q3VA0XkWecl5WBr9Cfw==""}",
        @"{""Cmd"":""Del"",""Key"":{""index"":""Raven/DocumentsByEntityName"",""id"":""AAAAAAAAAAEAAAAAAAAABw=="",""time"":""\/Date(1290420997509)\/"",""type"":""Raven.Database.Tasks.RemoveFromIndexTask"",""mergable"":true},""TableId"":9,""TxId"":""wM3q3VA0XkWecl5WBr9Cfw==""}",
        @"{""Cmd"":""Del"",""Key"":{""index"":""Raven/DocumentsByEntityName"",""id"":""AAAAAAAAAAEAAAAAAAAABQ=="",""time"":""\/Date(1290420997504)\/"",""type"":""Raven.Database.Tasks.RemoveFromIndexTask"",""mergable"":true},""TableId"":9,""TxId"":""wM3q3VA0XkWecl5WBr9Cfw==""}",
        @"{""Cmd"":""Commit"",""TableId"":9,""TxId"":""wM3q3VA0XkWecl5WBr9Cfw==""}",
    };

    var tableStorage = new TableStorage(new MemoryPersistentSource());
    foreach (var cmdText in cmds)
    {
        var command = JObject.Parse(cmdText);
        var tblId = command.Value<int>("TableId");
        var table = tableStorage.Tables[tblId];
        var txId = new Guid(Convert.FromBase64String(command.Value<string>("TxId")));
        var key = command["Key"] as JObject;
        if (key != null)
        {
            foreach (var property in key.Properties())
            {
                if (property.Value.Type != JTokenType.String)
                    continue;
                var value = property.Value.Value<string>();
                if (value.EndsWith("==") == false)
                    continue;
                key[property.Name] = Convert.FromBase64String(value);
            }
        }
        switch (command.Value<string>("Cmd"))
        {
            case "Put":
                table.Put(command["Key"], new byte[] {1, 2, 3}, txId);
                break;
            case "Del":
                table.Remove(command["Key"], txId);
                break;
            case "Commit":
                table.CompleteCommit(txId);
                break;
        }
    }
    Assert.Empty(tableStorage.Tasks);
    Assert.Null(tableStorage.Tasks["ByIndexAndTime"].LastOrDefault());
}

The cmds variable that you see here was generated from the logs. What I did was generate the whole log, verify that it reproduced the bug, and then start trimming the commands until I had the minimal set that still reproduced it.

Using this approach, I was able to narrow the actual issue down to a small set of API calls, which I was then able to go through in detail and finally figure out what the bug was.

This post isn’t about the bug (I’ll cover that in the next post), but about the idea of going from “there is a bug and I don’t know how to reproduce it in a small enough scope to understand” to “here are the exact steps that fail”. A more sophisticated approach would be to dump stack traces and parameters and replay those, but for my scenario, it was easy to just construct things from the log.

Better to ask forgiveness…

Originally posted at 11/19/2010

I am writing this post from the point of view of the one you might have to ask forgiveness from, not the one who has to ask forgiveness.

There are two common styles of building software, one of them is Bold Exploration, and the second is Detailed Survey.

With Bold Exploration, the team has a rough idea about what is required, and immediately sets out to build something that matches the current understanding. Along the way, both the team and the client will continually triangulate toward where they want to be. This is in effect the old “I’ll know it when I see it”. The problem from the point of view of the client (as in, me) is that I might be required to pay for those mistakes. For the most part, I don’t mind too much. One of the best aspects of this approach is that the time to get a feature in my hands is very short. I had a few memorable cases where it took the team three or four days to get something out that I could immediately comment on. There is another very important aspect: for the most part, this approach means that from my perspective, this is fire & forget. The way that I handle such things is usually by just having a phone call or two with the team, and then just getting their output and going over it, correcting the course if needed.

There were several times that mistakes were made, either because I didn’t explain things right, because of a difference in vision, or simply because the team didn’t do things the way I wanted. I asked for those to be fixed, and since we were operating on short time frames anyway, it didn’t cost too much (and I paid for both the mistake and the fixing of it).

Detailed Survey involves the same initial phone calls, but then the team goes away to plan, estimate and build a plan for the project. They are very careful about the way that they are going about doing things, and get back to me multiple times asking for clarification on this or that. When they start building the project, they start by building the foundations, getting things to work from the bottom up. In many aspects, this is a very reasonable way to build software, except for one very big problem. There is a long lead time until I get something that I can actually do something with.

More importantly, this is me you are talking about; I am perfectly capable of looking at the code produced and estimating what they are doing. But, as impressive as a design may be, it isn’t really interesting to me at all. I want to see the end features. And this approach takes a long time until I have something that I can actually work with.

I experienced both options recently, and I have to say that my preference is strongly toward the first approach. If pressed, I could give you a whole lot of reasons why I want that, valid business reasons, enough to convince any CTO or CFO. But mostly, it is an issue of matching expectations. When someone goes off for a month or two to do stuff, I get very nervous. And if you start building from the bottom up, it gets progressively harder to truly evaluate what is going on. When you get a working feature in the first week, even if additional work would cause rewrites/changes to this feature, you can be much more confident that this is going somewhere you want.

The other side is that no matter how careful the team is, there are still going to be differences between the final product and what I want, so changes will be made. It is far more likely that with Detailed Survey the team isn’t used to those changes, and they are going to be harder to make.

In short, it all comes back to the fact that the shorter the feedback cycle is, the happier I am.

Raven Suggest

Originally posted at 11/18/2010

Since RavenDB’s indexing is based on Lucene, we get a lot of advantages. One of them is support for suggestions.

What is the Raven Suggest feature? Let us say that we give the user the option to perform some query, but the query they sent us isn’t returning enough results (or none at all).

We can ask RavenDB to figure out what the user meant. Here is the code:

 var q = from user in session.Query<User>("Users/ByName")
         where user.Name == name
         select user;

Now, we can call q.FirstOrDefault() on this query, and see if we got any results. If we didn’t get a result, we can now ask RavenDB for help:

 var suggestionQueryResult = q.Suggest();
 Console.WriteLine("Did you mean?");
 foreach (var suggestion in suggestionQueryResult.Suggestions)
 {
     Console.WriteLine("\t{0}", suggestion);
 }

This will produce the following output:

Enter user name: oren
Found user: users/1
Enter user name: ore
Did you mean?
        oren
Enter user name: onre
Did you mean?
        oren

This is a nice little feature, which can really make a difference in terms of your application’s user experience.

NHibernate: Complex relationships

Originally posted at 11/18/2010

I got an interesting question today (I am teaching my NHibernate course now).

The tabular structure is similar to this:

[image: a People table, an Addresses table, and a PeopleAddresses join table that also carries IsDefault, ValidFrom and ValidTo columns]

But the desired object structure is:

[image: a Person holding a collection of Address objects, where each Address exposes City, IsDefault, ValidFrom and ValidTo]

That is quite different than the tabular model, but it is actually very easy to handle this with NHibernate.

Here is the mapping for the Address entity. We use the <join/> tag to have an entity that spans more than a single table:

<class name="Address"
       table="Addresses">
  <id name="Id">
    <generator class="identity"/>
  </id>
  <property name="City" />

  <join table="PeopleAddresses" >
    <key column="AddressId"/>
    <property name="IsDefault"/>
    <property name="ValidFrom"/>
    <property name="ValidTo"/>
  </join>

</class>

We then map the Person, using standard many-to-many mapping for the addresses:

 <class name="Person"
             table="People">

   <id name="Id">
     <generator class="identity"/>
   </id>
   <property name="Name" />

   <bag name="Addresses" table="PeopleAddresses" inverse="true">
     <key column="PersonId"/>
     <many-to-many class="Address" column="AddressId"/>
   </bag>
   
 </class>
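
For reference, these mappings assume entity classes roughly along these lines; this is a sketch inferred from the mappings above (the property types for ValidFrom/ValidTo are assumptions), not code from the original post:

// Illustrative sketch only, derived from the mappings above.
public class Address
{
    public virtual int Id { get; set; }
    public virtual string City { get; set; }

    // These come from the PeopleAddresses join table via <join/>.
    public virtual bool IsDefault { get; set; }
    public virtual DateTime ValidFrom { get; set; }
    public virtual DateTime ValidTo { get; set; }
}

public class Person
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }

    // Mapped as the inverse <bag> over PeopleAddresses.
    public virtual IList<Address> Addresses { get; set; }
}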

There is just one thing to be aware of: you can’t add new addresses via the Person.Addresses collection, because the PeopleAddresses table has more data in it than just the keys. Presumably, you are handling this in some other fashion already.

All in all, this is a pretty elegant solution.

Raven MQ – The Real API

Originally posted at 11/17/2010

This is now a passing test…

public class UntypedMessages : IDisposable
{
    private readonly RavenMqServer ravenMqServer;
    private readonly RavenConfiguration configuration;

    public UntypedMessages()
    {
        configuration = new RavenConfiguration
        {
            RunInMemory = true,
            AnonymousUserAccessMode = AnonymousUserAccessMode.All
        };
        ravenMqServer = new RavenMqServer(configuration);
    }

    [Fact]
    public void Can_get_message_from_client_connection()
    {
        using(var connection = new RavenMQConnection(
            new Uri(configuration.ServerUrl), 
            new IPEndPoint(IPAddress.Loopback, 8181)))
        {
            var manualResetEventSlim = new ManualResetEventSlim(false);
            OutgoingMessage msg = null;
            connection.Subscribe("/queues/abc", (context, message) =>
            {
                msg = message;
                manualResetEventSlim.Set();
            });

            WaitForSubscription();

            ravenMqServer.Queues.Enqueue(new IncomingMessage
            {
                Data = new byte[] {1, 2, 3},
                Queue = "/queues/abc"
            });

            manualResetEventSlim.Wait();

            Assert.Equal(new byte[]{1,2,3}, msg.Data);
        }
    }

    private void WaitForSubscription()
    {
        // not important
    }

    public void Dispose()
    {
        ravenMqServer.Dispose();
    }
}

Unlike the previous posts, which were more about up front design, this post shows working code. I am still not completely happy with this, mostly because of the RavenMQConnection ctor parameters, but I can live with it for now.

Where is the bug?

Originally posted at 11/16/2010

var readLine = Console.ReadLine() ?? "";
switch (readLine.ToLowerInvariant())
{
    case "CLS":
        Console.Clear();
        break;
    case "reset":
        Console.Clear();
        return true;
    default:
        return false;
}

TPL: Composing tasks

What happens when you want to compose two distinct async operations into a single Task?

For example, let us imagine that we want to have a method that looks like this:

public void ConnectToServer()
{
    var connection = ServerConnection.CreateServerConnection(); // tcp connect
    connection.HandShakeWithServer(); // application level handshake
}

Now, we want to make this method async, using TPL. We can do this by changing the methods to return Task, so the API we have now is:

Task<ServerConnection> CreateServerConnectionAsync();
Task HandShakeWithServerAsync(); // instance method on ServerConnection

And we can now write the code like this:

public Task ConnectToServerAsync()
{
   return ServerConnection.CreateServerConnectionAsync()
                 .ContinueWith(task => task.Result.HandShakeWithServerAsync());
}

There is just one problem with this approach: the task that we are returning is the first task, because the second task cannot simply be chained as another ContinueWith.

We are actually returning a Task<Task>, so the caller would have to wait on task.Result to wait for the final operation, but that seems like a very awkward API.

The challenge is figuring out a way to compose those two operations so that they expose only a single task.
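
For what it’s worth, one way to do this with the stock TPL API is TaskExtensions.Unwrap, which flattens a Task<Task> into a single task. A minimal sketch, not necessarily the approach the post had in mind:

public Task ConnectToServerAsync()
{
    // ContinueWith produces a Task<Task>; Unwrap exposes a single Task that
    // completes only when the inner handshake task completes (or faults).
    return ServerConnection.CreateServerConnectionAsync()
        .ContinueWith(task => task.Result.HandShakeWithServerAsync())
        .Unwrap();
}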

Async Read Challenge

Originally posted at 11/11/2010

Here is how you are supposed to read from a stream:

var buffer = new byte[bufferSize];
int read = -1;
int start = 0;
while(read != 0)
{
   read = stream.Read(buffer, start, buffer.Length - start);
   start += read;
}

The reason that we do it this way is that the stream might not have all the data available for us, and might break the read requests midway.

The question is how to do this in an asynchronous manner. Asynchronous loops are… tough, to say the least. Mostly because you have to handle the state explicitly.

Here is how you can do this using the new Async CTP in C# 5.0:

private async static Task<Tuple<byte[], int>> ReadBuffer(Stream s, int bufferSize)
{
    var buffer = new byte[bufferSize];
    int read = -1;
    int start = 0;
    while (read != 0)
    {
        read = await Task.Factory.FromAsync<byte[],int,int, int>(
            s.BeginRead, 
            s.EndRead, 
            buffer, 
            start, 
            buffer.Length - start, 
            null);
        start += read;
    }
    return Tuple.Create(buffer, start);
}

Now, what I want to see is this: using just the TPL API, and without C# 5.0 features, can you write the same thing?

What is wrong with this API? Answer

Originally posted at 11/11/2010

The reason that I don’t like this API is that I think it provides piss poor usability:

[image: the original veto API, which gives no way to explain a rejection]

In short, it doesn’t provide me with any reasonable way to tell the user why I rejected a message.

Here is the proper way of doing this:

[image: the revised API, which returns a MessageVeto]

MessageVeto also contains a reason string, which provides a much better experience all around.
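
Since the screenshots did not survive, here is a purely hypothetical sketch of the difference being described; the interface and member names below are made up, and only the MessageVeto-with-a-reason idea comes from the post:

// Hypothetical "before": a boolean veto gives no way to say *why* a message was rejected.
public interface IMessageFilter
{
    bool AllowMessage(IncomingMessage message);
}

// Hypothetical "after": returning a veto object (null meaning "accepted") carries the reason along.
public class MessageVeto
{
    public string Reason { get; set; }
}

public interface IMessageFilterWithReason
{
    MessageVeto TryVetoMessage(IncomingMessage message);
}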

What is wrong with this API?

Originally posted at 11/11/2010

This is part of a base class, and we need to provide a way to veto messages for extensions.

[image: the veto API in question]

Can you figure out what is wrong with this API?

Raven MQ – Client API Design

There are only two topics that remain in the Raven MQ server (replication & point to point messaging), but I decided to stop for a while and focus on the client API. My experience has shown that it is more important than anything else in gaining acceptance for the project.

One thing that I want to make clear is that this is the high level API, which has very little to do with how this is actually implemented.

The first thing to be aware of is that Raven MQ is transactional. That is, all operations either complete successfully or fail as a single unit. That makes it very easy to work with it for a set of scenarios. It is not an accident that the API is very similar to the one that you get from Rhino Service Bus or NServiceBus, although Raven MQ client API is drastically more modest in what it is trying to do.

Getting started:

var raveMQEndpoint = new RavenMQEndpoint
{
    Url = "http://localhost:8181"
};
raveMQEndpoint.Start();

Subscribing (methods):

raveMQEndpoint.Subscribe("/streams/system/notifications", (ctx, untypedMsg) =>
{
    // do something with the msg
});

raveMQEndpoint.Subscribe<LoginAboutToExpire>("/streams/user/1234", (ctx, msg) =>
{
    // do something with the msg
});

raveMQEndpoint.Subscribe<LoginExpired>("/streams/user/1234", (ctx, msg) =>
{
    // do something with the msg
});

This allows you to handle untyped messages, or to select specific types of messages that will be handled from the stream (ignoring messages not of that type). I’ll discuss the ctx parameter at a later stage; for now, you can ignore it. What you can’t see here is that the Subscribe methods return an IDisposable instance, which allows you to remove the subscription. That is useful for temporary subscriptions, which are pretty common in the scenarios that we see Raven MQ used for.

Subscribing (classes):

raveMQEndpoint.Subscribe("/streams/user/1234", () => new LoginExpiredConsumer());

raveMQEndpoint.Subscribe("/streams/user/1234", mefContainer);

Instead of registering a single method, you can register a factory method, or a MEF container, both of which will create a consumer class for handling the messages.

Serialization:

Raven MQ doesn’t care about the serialization format; you can send messages using whatever format you like, but the client API uses JSON/BSON to store the data.

Sending messages:

Remember that I talked about the ctx parameter? The RavenMQEndpoint doesn’t offer a Send() method; that is handled by the ctx parameter, which stands for Context, obviously. The idea is quite simple: we want to make message sending transactional, so we always use a context to send messages, and only if the context completes successfully do we truly consume the incoming message and send all the outgoing messages to the server. You can think of the Context as the Raven MQ transaction.

For sending messages outside of processing an existing message, you can use:

ravenMQEndpoint.Transaction(ctx=> ctx.Send("/queues/customers/1234", updateCustomerAddress));

This gives us a very easy way of scoping multiple messages in a single transaction without awkward APIs.

Thoughts?

Raven MQ – Principles

Originally posted at 11/9/2010

Raven MQ is a new project that I am working on. As you can guess from the name, this is a queuing system, but it is a queuing system with a few twists. I already wrote a queuing system in the past (Rhino Queues), so why write another one?

Raven MQ builds upon the experience in building Rhino Queues, but it also targets a different set of usage scenarios. Like Rhino Queues, Raven MQ can be xcopy deployed, but it is not usually used in a traditional point to point messaging system. Instead, Raven MQ is a queuing system for the web. What do I mean by that? Raven MQ has a different set of design decisions, focused on making some things that are traditionally expensive in queuing systems cheap:

  • Unlike in most queuing systems, queues are cheap. That allows you to create an unlimited amount of queues. Typical deployment of Raven MQ will have at least one queue per client.
  • Which leads to the next point, Raven MQ is designed to support literally thousands of clients.

The model isn’t the traditional queuing one you might be familiar with from MSMQ:

[image: the traditional MSMQ model, with a queuing node running on each endpoint]

Instead, the model uses a central server to hold all the information:

[image: the Raven MQ model, with many clients talking to a single central server]

The reasoning behind this is actually pretty simple. Unlike in traditional queuing systems, where we have a node of the queuing system running on each end point, Raven MQ makes the assumption that most of the clients connecting to it are actually web clients, using JavaScript on the page or maybe Silverlight applications.

The decision to directly support those clients is what makes Raven MQ unique.

Transport models

Raven MQ offers two distinct models for transporting messages. The first is the traditional queue model, where each message can only be consumed by a single consumer. This is not a very interesting model.

A much more interesting model is the message stream. A message stream in Raven MQ is a set of messages sent to a particular queue. But unlike a queue, reading a message from the stream does not consume it. That means that multiple consumers can read the messages on the stream. Moreover, clients that arrive after the message was sent can still read the message (as long as its time to live is in effect).

Usage model

The previous section is probably hard to understand. As usual, an example will make all the difference in the world.

Let us imagine that we are building a CRM system, and we are currently viewing a customer screen. At that point, we are subscribed to the following streams:

  • /streams/system/notifications – Global system notifications
  • /streams/customers/1234 – Updates about customer 1234
  • /streams/users/4321 – Updates about our logged on user

And the following queue:

  • /queues/mailboxes/1234 – Replies to our particular client

The idea is pretty simple, actually. When we read the customer data, we are loading it from the view model store, but we also need to be able to efficiently get updates about changes that happen to the customer while we are looking at it. We are doing that by subscribing to the appropriate stream. Another user who is looking at the same customer is also subscribed to the same stream. Even more importantly, a user that opened the customer after some changes have been made (but before they were written to the view model store) will also get those updates, and will be able to reconstruct the current state in a seamless manner.

This approach drastically simplifies the update problem in complex systems.
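
To connect this to the client API from the “Raven MQ – Client API Design” post above, the subscription set for this screen might look roughly like the following sketch; it is illustrative only, and the handler bodies are placeholders:

var endpoint = new RavenMQEndpoint { Url = "http://localhost:8181" };
endpoint.Start();

// Streams: readable by any number of clients, including ones that connect later.
endpoint.Subscribe("/streams/system/notifications", (ctx, msg) => { /* show a global notification */ });
endpoint.Subscribe("/streams/customers/1234", (ctx, msg) => { /* refresh the customer view */ });
endpoint.Subscribe("/streams/users/4321", (ctx, msg) => { /* update the logged on user */ });

// Queue: replies addressed to this particular client, consumed once.
endpoint.Subscribe("/queues/mailboxes/1234", (ctx, msg) => { /* handle the reply */ });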

Why call them streams and not topics?

Topics are a routing mechanism, but with Raven MQ, streams aren’t used for routing. They are used to hold a set of messages, that is all. The problem with routing is that you can’t join up later and receive previously sent messages, and (much worse) you can’t really use routing on the web, because when you have potentially thousands of clients, all coming & going at will, you can’t set up a queue for each of them; it is too expensive.

The stream/notification model solves that problem rather neatly, even if I say so myself.

What I did not discuss

Please note that I am discussing the system at a very high level right now. I didn’t talk about the API or the actual distribution model. That is intentional, I’ll cover that in a future post.

Raven.Munin

Raven.Munin is the actual implementation of a low level managed storage for RavenDB. I split it out of the RavenDB project because I intend to make use of it in additional projects.

At its core, Munin provides a high performance, transactional, non relational data store written completely in managed code. The main point in writing it was to support the managed storage in RavenDB, but it is going to be used for Raven MQ as well, and probably a bunch of other stuff too. I’ll post about Raven MQ in the future, so don’t bother asking about it.

Let us look at a very simple API example. First, we need to define a database:

public class QueuesStorage : Database
{
    public QueuesStorage(IPersistentSource persistentSource) : base(persistentSource)
    {
        Messages = Add(new Table(key => key["MsgId"], "Messages")
        {
            {"ByQueueName", key => key.Value<string>("QueueName")},
            {"ByMsgId", key => new ComparableByteArray(key.Value<byte[]>("MsgId"))}
        });

        Details = Add(new Table("Details"));
    }

    public Table Details { get; set; }

    public Table Messages { get; set; }
}

This is a database with two tables, Messages and Details. The Messages table has a primary key of MsgId, and two secondary indexes by queue name and by message id. The Details table is sorted by the key itself.

It is important to understand one very important concept about Munin. Data stored in it is composed of two parts. The key and the data. It is easier to explain when you look at the API:

bool Remove(JToken key);
Put(JToken key, byte[] value);
ReadResult Read(JToken key);

public class ReadResult
{
    public int Size { get; set; }
    public long Position { get; set; }
    public JToken Key { get; set; }
    public Func<byte[]> Data { get; set; }
}

Munin doesn’t really care about the data; it just saves it. But the key is important. In the Details table case, the table would be sorted by the full key. In the Messages table case, things are different. We use the lambda to extract the primary key from the key of each item, and we use additional lambdas to extract secondary indexes. Munin can build secondary indexes only from the key, not from the value. Only the secondary indexes allow range queries; the PK allows only direct access, which is why we define both the primary key and a secondary index on MsgId.

Let us see how we can save a new message:

public void Enqueue(string queue, byte[] data)
{
    messages.Put(new JObject
    {
        {"MsgId", uuidGenerator.CreateSequentialUuid().ToByteArray()},
        {"QueueName", queue},
    }, data);
}

And now we want to read it:

public Message Dequeue(Guid after)
{
    var key = new JObject { { "MsgId", after.ToByteArray()} };
    var result = messages["ByMsgId"].SkipAfter(key).FirstOrDefault();
    if (result == null)
        return null;

    var readResult = messages.Read(result);

    return new Message
    {
        Id = new Guid(readResult.Key.Value<byte[]>("MsgId")),
        Queue = readResult.Key.Value<string>("QueueName"),
        Data = readResult.Data(),
    };
}

We do a bunch of stuff here: we scan the secondary index for the appropriate value, then read the full entry from the table, load it into a DTO and return it.

Transactions

Munin is fully transactional, and it follows an append only, MVCC, multi reader, single writer model. The previous methods run in the context of:

using (queuesStorage.BeginTransaction())
{
    // use the storage
    queuesStorage.Commit();
}

Storage on disk

Munin can work with either a file or in memory (which makes unit testing it a breeze). It uses an append only model, so on disk it looks like this:

[image: the data file laid out as a sequence of appended transactions]

Each of the red rectangles represents a separate transaction.

In memory data

You might have noted that we don’t keep any complex data structures on the disk. This is because all the actual indexing for the data is done in memory. The data on the disk is used solely for building the index in memory. Do note that the actual values are not held in memory, only the keys. That means that searching Munin indexes is lightning fast, since we never touch the disk for the search itself.

The data is actually held in an immutable binary tree, which gives us the ability to do MVCC reads, without any locking.

Compaction

Because Munin uses the append only model, it requires periodic compaction. It does so automatically in RavenDB, waiting for periods of inactivity.

Summary

Munin is a low level API, not something that you are likely to use directly. It was explicitly modeled to give me an interface similar in capability to what Esent gives me, but in purely managed code.

Please note that it is released under the same license as RavenDB, the AGPL.

Those are the rules, even when you don’t like them

Originally posted at 11/4/2010

Recently I had an interesting support call for NHibernate. The problem was a bit complex when explained to me, but we got it simplified to something like:

When we have a component that contains a set, that component is not null, even when all the members are null.

The problem is actually an intersection of two separate rules in NHibernate:

  • When all the members of a component are null, the component itself will be null.
  • A set is never null, an empty set is still a valid instance of a set.

When you add a set to a component, you add something that is never null. Hence, the behavior of NHibernate in this case is also valid: since we have a non null member value, there will be an instance of that component.
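
A minimal sketch of the shape in question; the class and property names are illustrative, not the user’s actual model:

// Illustrative names only. ContactInfo is mapped as an NHibernate <component>.
public class ContactInfo
{
    public virtual string Phone { get; set; }       // may be null
    public virtual string Email { get; set; }       // may be null
    public virtual ISet<string> Tags { get; set; }  // mapped as a <set>: never null, empty at worst
}

public class Customer
{
    public virtual int Id { get; set; }

    // Without the set, ContactInfo would be hydrated as null when Phone and Email are both null.
    // With the set mapped inside the component, the (possibly empty) set counts as a non null member,
    // so NHibernate creates a ContactInfo instance.
    public virtual ContactInfo ContactInfo { get; set; }
}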

The problem is that the users conceptually thought of the empty set as null as well. I had a hard time explaining that sets are never null, and that no, this isn’t a bug or an unexpected combination of the two features. Both behave exactly as expected, and the intersection of the two works as expected. In fact, trying to make it not work in this fashion would introduce a lot of work, complexity and additional queries.