Ayende @ Rahien

Designing a document database: Aggregation

I said that I would speak a bit about aggregations. On the face of it, aggregation looks simple, really simple. Continuing the same thread of design from before, we can have:

image

The problem is that while this is really nice, it doesn’t really work.

With this approach, we have to recalculate the view over the entire document set every time, a potentially very expensive operation. Technically, I could try to solve that by rewriting the Linq statement, but it wouldn’t really work: this code assumes that it knows all the state, and there is no way to regenerate that state incrementally.
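To make the cost concrete, here is a minimal Python sketch of a single-pass aggregation. It is not the Linq statement from the image above; the document fields (`type`, `blog_id`, `comments`) and the function name `naive_comment_counts` are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical documents; the field names are assumptions, not the post's schema.
docs = [
    {"type": "post", "blog_id": "blogs/1", "comments": ["a", "b"]},
    {"type": "post", "blog_id": "blogs/1", "comments": ["c"]},
    {"type": "post", "blog_id": "blogs/2", "comments": []},
]

def naive_comment_counts(all_docs):
    """Aggregate in one pass; the function needs the entire document set."""
    totals = defaultdict(int)
    for doc in all_docs:
        if doc["type"] == "post":
            totals[doc["blog_id"]] += len(doc["comments"])
    return dict(totals)

view = naive_comment_counts(docs)

# A change to a single document still forces a scan over every document,
# because nothing from the previous run can be reused.
docs[2]["comments"].append("d")
view_after_edit = naive_comment_counts(docs)
```

The second call repeats all of the work of the first, which is the expense described above.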

Let us try a better approach:

image

Thanks to Alex Yakunin for helping me simplify this.

What do we have now? We split the problem into two sections, the Map and the Reduce. Note that, to simplify things, map and reduce must return objects of the same shape. That means the output of a reduce can be fed straight back into reduce, so we don’t need an explicit re-reduce phase.
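The same-shape rule can be sketched in Python roughly as follows. This is an illustration under assumptions, not a translation of the Linq in the image: the names `map_posts` and `reduce_counts` and the fields `blog_id` and `comments_count` are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def map_posts(docs):
    """Map: emit one {blog_id, comments_count} entry per post document."""
    for doc in docs:
        if doc["type"] == "post":
            yield {"blog_id": doc["blog_id"],
                   "comments_count": len(doc["comments"])}

def reduce_counts(results):
    """Reduce: group entries by blog_id and sum their counts.

    Input and output entries have the same shape, so the output of one
    reduce is valid input for another reduce.
    """
    ordered = sorted(results, key=itemgetter("blog_id"))
    return [
        {"blog_id": blog_id,
         "comments_count": sum(r["comments_count"] for r in group)}
        for blog_id, group in groupby(ordered, key=itemgetter("blog_id"))
    ]

# Hypothetical sample documents.
docs = [
    {"type": "post", "blog_id": "blogs/1", "comments": ["a", "b"]},
    {"type": "post", "blog_id": "blogs/1", "comments": ["c"]},
    {"type": "post", "blog_id": "blogs/2", "comments": ["d", "e", "f", "g"]},
    {"type": "user", "name": "someone"},
]

mapped = list(map_posts(docs))
reduced = reduce_counts(mapped)
re_reduced = reduce_counts(reduced)  # reducing reduced output changes nothing
```

Because `reduce_counts` accepts its own output, reducing an already reduced result is just another call to the same function.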

That is much easier to reason about, and it allows us to perform aggregation in a way that is simple to partition. I am probably going to have another post regarding the actual details of handling aggregations.
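As a rough illustration of why this shape partitions well, the Python sketch below reduces each partition on its own, caches the partial results, and combines them. When one partition changes, only that partition is reduced again. The helpers `post`, `reduce_partition`, and `combine`, and the way documents are split into partitions, are all assumptions for the example, not the design of the actual database.

```python
def post(blog_id, comment_count):
    """Build a hypothetical post document with the given number of comments."""
    return {"type": "post", "blog_id": blog_id,
            "comments": ["comment"] * comment_count}

def reduce_partition(docs):
    """Map and reduce a single partition into {blog_id: comments_count}."""
    totals = {}
    for doc in docs:
        if doc["type"] == "post":
            key = doc["blog_id"]
            totals[key] = totals.get(key, 0) + len(doc["comments"])
    return totals

def combine(partials):
    """Reduce the per-partition results into one result of the same shape."""
    totals = {}
    for partial in partials:
        for blog_id, count in partial.items():
            totals[blog_id] = totals.get(blog_id, 0) + count
    return totals

partitions = [
    [post("blogs/1", 2), post("blogs/2", 1)],
    [post("blogs/1", 1)],
]

# Reduce every partition once and keep the partial results around.
cached = [reduce_partition(p) for p in partitions]
total = combine(cached)

# A new document lands in partition 1: only that partition is reduced again.
partitions[1].append(post("blogs/2", 4))
cached[1] = reduce_partition(partitions[1])
total_after = combine(cached)

# Sanity check: the incremental result matches a full recomputation.
full = reduce_partition([doc for p in partitions for doc in p])
```

The cached result for partition 0 is reused untouched, which is what makes the update incremental.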

Comments

Nathaniel Neitzke, 03/13/2009 07:13 PM

I would also take a look at a merge step. I know this is something I am currently looking into (map-reduce-merge).
portal.acm.org/citation.cfm?doid=1247480.1247602

I think using LINQ is an interesting choice as you mentioned before because of the possibility of tapping into PLINQ. Will have to think about this one.

configurator, 03/13/2009 07:44 PM

So the point is that you generate a map that looks as if it is grouped, and then you gradually reduce it into a smaller number of large groups?

Ayende Rahien, 03/14/2009 12:58 AM

Nathaniel,

That link requires registration. What is the "merge" stage?

Ayende Rahien, 03/14/2009 12:58 AM

configurator,

Yes, that is the idea; see the next installment for the exact process.
