Raven MQ – Principles

Originally posted at 11/9/2010

Raven MQ is a new project that I am working on. As you can guess from the name, this is a queuing system, but it is a queuing system with a few twists. I already wrote a queuing system in the past (Rhino Queues), so why write another one?

Raven MQ builds upon the experience of building Rhino Queues, but it also targets a different set of usage scenarios. Like Rhino Queues, Raven MQ can be xcopy deployed, but it is not usually used in a traditional point-to-point messaging system. Instead, Raven MQ is a queuing system for the web. What do I mean by that? Raven MQ has a different set of design decisions, focused on making some things that are traditionally expensive in queuing systems cheap:

  • Unlike in most queuing systems, queues are cheap. That allows you to create an unlimited number of queues. A typical deployment of Raven MQ will have at least one queue per client.
  • Which leads to the next point: Raven MQ is designed to support literally thousands of clients.

The model isn’t the traditional queuing one you might be familiar with from MSMQ:

image

Instead, the model uses a central server to hold all the information:

image

The reasoning behind this is actually pretty simple. Unlike in traditional queuing systems, where we have a node of the queuing system running on each end point, Raven MQ makes the assumption that most of the clients connecting to it are actually web clients, using JavaScript on the page or maybe Silverlight applications.

The decision to directly support those clients is what makes Raven MQ unique.

Transport models

Raven MQ offers two distinct models for transporting messages. The first is the traditional queue model, where each message can only be consumed by a single consumer. This is not a very interesting model.

A much more interesting model is the message stream. A message stream in Raven MQ is a set of messages sent to a particular queue. But unlike a queue, reading a message from the stream does not consume it. That means that multiple consumers can read the messages on the stream. Moreover, clients that arrive after the message was sent can still read the message (as long as its time to live is in effect).
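To make the difference between the two models concrete, here is a minimal in-memory sketch in Python. It is not Raven MQ's API (this post does not cover the API); the class names `SimpleQueue` and `MessageStream`, the `ttl` parameter, and the injectable `now` clock are all assumptions made up for illustration.

```python
import time
from collections import deque


class SimpleQueue:
    """Traditional queue: each message is consumed by exactly one consumer."""

    def __init__(self):
        self._messages = deque()

    def send(self, body):
        self._messages.append(body)

    def receive(self):
        # Reading removes the message, so no other consumer will ever see it.
        return self._messages.popleft() if self._messages else None


class MessageStream:
    """Hypothetical stream: reading does not consume; messages expire by TTL."""

    def __init__(self, now=time.monotonic):
        self._now = now        # injectable clock (an assumption, handy for tests)
        self._messages = []    # list of (expires_at, body) pairs

    def send(self, body, ttl):
        self._messages.append((self._now() + ttl, body))

    def read(self):
        # Return every message whose time to live is still in effect.
        current = self._now()
        return [body for expires_at, body in self._messages if expires_at > current]


stream = MessageStream()
stream.send("customer 1234 updated", ttl=60)
first_reader = stream.read()    # an early client reads the message
second_reader = stream.read()   # a later client still sees the same message
```

Both readers get the message because `read` never removes anything; only the time to live makes a message disappear.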

Usage model

The previous section is probably hard to understand. As usual, an example will make all the difference in the world.

Let us imagine that we are building a CRM system, and we are currently viewing a customer screen. At that point, we are subscribed to the following streams:

  • /streams/system/notifications – Global system notifications
  • /streams/customers/1234 – Updates about customer 1234
  • /streams/users/4321 – Updates about our logged on user

And the following queue:

  • /queues/mailboxes/1234 – Replies to our particular client

The idea is pretty simple, actually. When we read the customer data, we are loading it from the view model store, but we also need to be able to efficiently get updates about changes that happen to the customer while we are looking at it. We are doing that by subscribing to the appropriate stream. Another user who is looking at the same customer is subscribed to the same stream. Even more importantly, a user who opened the customer after some changes have been made (but before they were written to the view model store) will also get those updates, and will be able to reconstruct the current state in a seamless manner.
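The late-joiner case can be sketched as follows. This is only an illustration under stated assumptions: the `version` field, the shape of the messages, and the `reconstruct_state` helper are all hypothetical, since the post does not describe the actual message format.

```python
def reconstruct_state(snapshot, stream_messages):
    """Apply stream updates that are newer than the stored snapshot.

    `snapshot` is a dict with a hypothetical "version" field; each stream
    message is a dict with "version" and "changes". Both shapes are assumptions.
    """
    state = dict(snapshot)
    for message in sorted(stream_messages, key=lambda m: m["version"]):
        if message["version"] <= state["version"]:
            continue  # already reflected in the view model store
        state.update(message["changes"])
        state["version"] = message["version"]
    return state


# Snapshot loaded from the view model store (two updates behind).
snapshot = {"version": 3, "name": "Acme", "phone": "555-0100"}

# Messages still alive on /streams/customers/1234 when we open the screen.
stream_messages = [
    {"version": 3, "changes": {"name": "Acme"}},
    {"version": 4, "changes": {"phone": "555-0199"}},
    {"version": 5, "changes": {"name": "Acme Corp"}},
]

current = reconstruct_state(snapshot, stream_messages)
```

The version check is one possible way to avoid applying an update twice when it is present both in the store and on the stream.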

This approach drastically simplifies the update problem in complex systems.

Why call them streams and not topics?

Topics are a routing mechanism, but with Raven MQ, streams aren’t used for routing. They are used to hold a set of messages, that is all. The problem with routing is that you can’t join up later and receive previously sent messages, and (much worse) you can’t really use routing on the web, because when you have potentially thousands of clients, all coming and going at will, you can’t set up a queue for each of them; it is too expensive.
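Here is a deliberately simplified toy model of topic-style routing, to show both problems at once. `TopicRouter` is a made-up class for illustration and does not describe how any particular broker works.

```python
class TopicRouter:
    """Toy topic: routes each published message to current subscribers only."""

    def __init__(self):
        self._subscriber_queues = {}   # one queue per subscriber

    def subscribe(self, client_id):
        # Every new client costs us a dedicated queue.
        self._subscriber_queues[client_id] = []
        return self._subscriber_queues[client_id]

    def publish(self, body):
        # Copy the message to whoever is subscribed right now; nothing is retained.
        for queue in self._subscriber_queues.values():
            queue.append(body)


topic = TopicRouter()
early = topic.subscribe("early-client")
topic.publish("customer 1234 renamed")
late = topic.subscribe("late-client")   # joins after the message was routed
```

In this model `late` stays empty, because the message was routed before that client subscribed, and the router ends up holding a queue for every client that ever connected.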

The stream/notification model solves that problem rather neatly, even if I say so myself.

What did I not discuss?

Please note that I am discussing the system at a very high level right now. I didn’t talk about the API or the actual distribution model. That is intentional; I’ll cover those in a future post.