# Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.

## Avoid where in a reduce clause


We got a customer question about a map/reduce index that produced the wrong results. The problem was a mismatch between the customer's conceptual model and the way Map/Reduce actually works.

Let us take the following silly example. We want to find all the animal owners that have more than a single animal. We define an index like so:

```
// map
from a in docs.Animals
select new { a.Owner, Names = new[]{a.Name} }

// reduce
from r in results
group r by r.Owner into g
where g.Sum(x=>x.Names.Length) > 1
select new { Owner = g.Key, Names = g.SelectMany(x=>x.Names) }
```

And here is our input:

```
{ "Owner": "users/1", "Name": "Arava" }    // animals/1
{ "Owner": "users/1", "Name": "Oscar" }    // animals/2
{ "Owner": "users/1", "Name": "Phoebe" }   // animals/3
```

What would be the output of this index?

At first glance, you might guess that it would be:

`{ "Owner": "users/1", "Names": ["Arava", "Oscar", "Phoebe" ] }`

But you would be wrong. The actual output of this index… is nothing. This index actually has no output.

But why?

To answer that, let us ask the following question. What would be the output for the following input?

`{ "Owner": "users/1", "Name": "Arava" } // animals/1`

That would be nothing, because the single entry would be filtered out by the where in the reduce clause. That is the underlying reason why this index has no output.

If we feed it the input one document at a time, it has no output. It is only if we give it all the data upfront that it has any output. But that isn’t how Map/Reduce works with RavenDB. Map/Reduce is incremental and recursive, which means that we can (and do) run it on individual documents or blocks of documents independently. To make sure that this works, we always run the reduce function on the output of each individual document’s map result, and the where clause throws away every one of those single-animal results before they can ever be combined.

That, in turn, means that the index above has no output.
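To make the reasoning concrete, here is a small simulation of the index above. It is written in Python rather than as an index definition, and it is only a simplified model of the incremental behavior described here, not how RavenDB is implemented. The names `map_animal` and `reduce_with_where` are made up for the illustration.

```python
from collections import defaultdict

def map_animal(doc):
    # Mirrors the map: one entry per animal, with a single-name list.
    return {"Owner": doc["Owner"], "Names": [doc["Name"]]}

def reduce_with_where(results):
    # Mirrors the reduce: group by Owner, then apply the `where` filter.
    groups = defaultdict(list)
    for r in results:
        groups[r["Owner"]].append(r)
    output = []
    for owner, entries in groups.items():
        names = [name for e in entries for name in e["Names"]]
        if len(names) > 1:  # the `where` in the reduce clause
            output.append({"Owner": owner, "Names": names})
    return output

animals = [
    {"Owner": "users/1", "Name": "Arava"},
    {"Owner": "users/1", "Name": "Oscar"},
    {"Owner": "users/1", "Name": "Phoebe"},
]

# The conceptual model: reduce sees every map result at once.
all_at_once = reduce_with_where([map_animal(d) for d in animals])

# The simplified incremental model: reduce runs on each document's
# map result first, and then again on the combined outputs.
per_document = []
for d in animals:
    per_document.extend(reduce_with_where([map_animal(d)]))
incremental = reduce_with_where(per_document)

print(all_at_once)   # one entry with all three names
print(incremental)   # empty: every single-animal result was filtered out
```

The first pass over each document produces nothing, so the second pass has nothing left to group.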

To write this index properly, I would have to do this:

```
// map
from a in docs.Animals
select new { a.Owner, Names = new[]{a.Name}, Count = 1 }

// reduce
from r in results
group r by r.Owner into g
select new { Owner = g.Key, Names = g.SelectMany(x=>x.Names), Count = g.Sum(x=>x.Count) }
```

And then apply the Count > 1 filter in the query itself, not in the index.
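As a rough sketch of why this version works, here is the same idea simulated in Python. This is a simplified model under the assumption that reduce runs per document and then again on the combined outputs; it is not RavenDB's implementation, and `map_animal`, `reduce_animals`, and the list comprehension standing in for the query are all hypothetical names for the illustration.

```python
from collections import defaultdict

def map_animal(doc):
    # Mirrors the map: each animal contributes its name and a Count of 1.
    return {"Owner": doc["Owner"], "Names": [doc["Name"]], "Count": 1}

def reduce_animals(results):
    # Mirrors the reduce: group by Owner and aggregate, with no filtering.
    groups = defaultdict(list)
    for r in results:
        groups[r["Owner"]].append(r)
    return [
        {
            "Owner": owner,
            "Names": [name for e in entries for name in e["Names"]],
            "Count": sum(e["Count"] for e in entries),
        }
        for owner, entries in groups.items()
    ]

animals = [
    {"Owner": "users/1", "Name": "Arava"},
    {"Owner": "users/1", "Name": "Oscar"},
    {"Owner": "users/1", "Name": "Phoebe"},
]

# Reduce each document's map result on its own, then reduce the outputs again.
per_document = []
for d in animals:
    per_document.extend(reduce_animals([map_animal(d)]))
index_entries = reduce_animals(per_document)

# The filter is applied when querying the finished index entries.
owners_with_many = [e for e in index_entries if e["Count"] > 1]
print(owners_with_many)
```

Because the reduce never drops anything, re-reducing partial results gives the same answer as reducing everything at once, and the filter only ever sees the final totals.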

Why not disable the whole where clause in reduce?

I don't see any reason to filter things in the reduce itself (what couldn't be filtered in the map instead?).

Derek, because there are real scenarios where you want to do that. We don't want to block them at this time.
