Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 3 min | 409 words

There are two methods in StoreManagerController that we haven’t touched yet. Dealing with them is going to be just a little bit different:

image

Can you figure out why?

The answer is that when we started, we decided that there really isn’t any reason to store artists as individual documents. They are just reference data, after all. Now, however, we need to reference them.

We could, of course, create a set of artists documents, at which point it would be very easy to port the code:

image

But I still think that artists don’t really exist in this model as an independent entity. So instead of going with this route, we are going to project them.

We define the “Artists” index using the following map/reduce LINQ queries:

// map 
from album in docs.Albums
select new { album.Artist.Id, album.Artist.Name }

// reduce 
from artist in results
group artist by new { artist.Id, artist.Name } into g
select new { g.Key.Id, g.Key.Name }

If you look carefully, you’ll notice that this is essentially doing a distinct over all the artists across all albums.
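To see why grouping by both fields amounts to a distinct, here is a minimal in-memory sketch of the same map and reduce steps using LINQ to Objects. The Artist, Album and ArtistsProjection classes are assumptions made for illustration; they are not Raven types, and this runs over a plain list rather than an index.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the documents discussed in the post.
public class Artist
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public class Album
{
    public string Title { get; set; }
    public Artist Artist { get; set; }
}

public static class ArtistsProjection
{
    // In-memory equivalent of the map/reduce pair: project each album to its
    // artist, then group by (Id, Name) so each artist appears exactly once.
    public static List<Artist> DistinctArtists(IEnumerable<Album> albums)
    {
        // map
        var mapped = from album in albums
                     select new { album.Artist.Id, album.Artist.Name };

        // reduce: anonymous types compare by value, so equal (Id, Name)
        // pairs land in the same group.
        var reduced = from artist in mapped
                      group artist by new { artist.Id, artist.Name } into g
                      select new Artist { Id = g.Key.Id, Name = g.Key.Name };

        return reduced.ToList();
    }
}
```

Two albums by the same artist produce a single entry in the result, which is exactly the behavior we want from the index.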

And that means that we can now write the code for those two methods like this:

image

There is one very important thing to remember here: in Raven, queries are cheap. Raven allows you to query indexes only, and those indexes are built in the background, so queries tend to be very fast.

That changes the way that you think about designing your system and data model. You want to move a lot of your processing to indexes and queries upon those indexes, because it tends to be cheaper all around.

time to read 8 min | 1454 words

image

I keep trying to work on the replication bundle for Raven, but I keep getting distracted with more interesting stuff.

In this case, I kept coming back to several discussions that I had with people who want to use Raven for storing events, and were thinking about how to go from a stream of events to a complete aggregate. I kept thinking that Raven should already be able to handle that. And indeed it can, quite easily, it turns out.

Raven is already capable of running operations over a stream of documents to produce a value, so going from there to an event stream producing an aggregate is easy. The only problem was that we needed to support external views. That was easy enough to do, so let me show what we have now.

Let us assume that we have the events shown on the right stored in Raven. As you can see, this is a stream of events for a shopping cart. What we want is to go from there to an actual shopping cart.

We define the following view:

image

I am showing the class diagram here to show you all the types that are involved. Note that ShoppingCart has AddToCart and RemoveFromCart methods, which have the typical implementations.

Now, let us look at the actual view code:

    [DisplayName("Aggregates/ShoppingCart")]
    public class ShoppingCartEventsToShopingCart : AbstractViewGenerator
    {
        public ShoppingCartEventsToShopingCart()
        {
            MapDefinition = docs => docs.Where(document => document.For == "ShoppingCart");
            GroupByExtraction = source => source.ShoppingCartId;
            ReduceDefinition = Reduce;

            Indexes.Add("Id", FieldIndexing.NotAnalyzed);
            Indexes.Add("Aggregate", FieldIndexing.No);
        }

        private static IEnumerable<object> Reduce(IEnumerable<dynamic> source)
        {
            foreach (var events in source
                .GroupBy(@event => @event.ShoppingCartId))
            {
                var cart = new ShoppingCart { Id = events.Key };
                foreach (var @event in events.OrderBy(x => x.Timestamp))
                {
                    switch ((string)@event.Type)
                    {
                        case "Create":
                            cart.Customer = new ShoppingCartCustomer
                            {
                                Id = @event.CustomerId,
                                Name = @event.CustomerName
                            };
                            break;
                        case "Add":
                            cart.AddToCart(@event.ProductId, @event.ProductName, (decimal)@event.Price);
                            break;
                        case "Remove":
                            cart.RemoveFromCart(@event.ProductId);
                            break;
                    }
                }
                yield return new
                {
                    cart.Id,
                    Aggregate = JObject.FromObject(cart)
                };
            }
        }
    }

There are several interesting things happening in this class:

  • The display name is the name of the index.
  • In the constructor, we define the map part as filtering for events for the shopping cart.
  • We will create a shopping cart per shopping cart id, so we specify the group by extraction method. Raven will use that to optimize updates.
  • Note the indexes definition: we want the id to be indexed as is (not analyzed), like a primary key, and the aggregate data to be stored but not indexed for searching.

Now, let us talk about the interesting bits, the Reduce function.

That function should be pretty easy to follow, I think. We are getting a stream of events, grouping them by their shopping cart id. Then, for each shopping cart, we sort the events by date, and proceed to build the aggregate.

Finally, we return the data so Raven will store it in the index.
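If you want to play with the fold itself outside of Raven, here is a minimal in-memory sketch of the same group, sort and apply sequence. Everything in it is an assumption for illustration: CartEvent, CartLine, Cart and CartAggregator are hypothetical names, and the AddToCart / RemoveFromCart bodies are just one guess at the “typical implementation”.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical typed event; the view above works on dynamic documents.
public class CartEvent
{
    public string ShoppingCartId { get; set; }
    public string Type { get; set; }
    public DateTime Timestamp { get; set; }
    public string ProductId { get; set; }
    public string ProductName { get; set; }
    public decimal Price { get; set; }
}

public class CartLine
{
    public string ProductId { get; set; }
    public string ProductName { get; set; }
    public decimal Price { get; set; }
    public int Quantity { get; set; }
}

// Hypothetical aggregate with an assumed "typical" implementation.
public class Cart
{
    public string Id { get; set; }
    public List<CartLine> Lines { get; } = new List<CartLine>();

    public void AddToCart(string productId, string productName, decimal price)
    {
        var line = Lines.FirstOrDefault(l => l.ProductId == productId);
        if (line == null)
        {
            line = new CartLine { ProductId = productId, ProductName = productName, Price = price };
            Lines.Add(line);
        }
        line.Quantity++;
    }

    public void RemoveFromCart(string productId)
    {
        Lines.RemoveAll(l => l.ProductId == productId);
    }
}

public static class CartAggregator
{
    // Group events per cart, replay them in timestamp order, return one cart per group.
    public static List<Cart> Build(IEnumerable<CartEvent> events)
    {
        var carts = new List<Cart>();
        foreach (var group in events.GroupBy(e => e.ShoppingCartId))
        {
            var cart = new Cart { Id = group.Key };
            foreach (var e in group.OrderBy(x => x.Timestamp))
            {
                switch (e.Type)
                {
                    case "Add":
                        cart.AddToCart(e.ProductId, e.ProductName, e.Price);
                        break;
                    case "Remove":
                        cart.RemoveFromCart(e.ProductId);
                        break;
                }
            }
            carts.Add(cart);
        }
        return carts;
    }
}
```

Because the events are sorted before they are applied, the order in which they arrive doesn’t matter, only their timestamps do.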

The result, by the way, is this:

image

I think this is cool.

Using this approach, Raven will automatically keep the aggregate up to date with the event streams coming in. Furthermore, that aggregate will only be computed when a change happens, so accessing it is very cheap.

Finally, if we have a storm of events on a particular shopping cart, we can choose whether to wait and see it in its most recent version, or get a potentially stale view of it really fast.

time to read 2 min | 348 words

The final part of the port of the MVC Music Store to Raven is the administration section, implemented in StoreManagerController. I am going to show comparisons of all the methods where the port doesn’t offer anything new, and then focus on an interesting conceptual difference between the implementations.

image image

Please note that the main reason that the Raven code is so much shorter is that I threw away the nonsensical error handling (or lack thereof).

  image   image

Again, throwing away the error handling that isn’t really error handling accounts for a lot of the difference in the code.

image image

Now we get to an interesting difference. The old code will delete orders if they include the deleted album. Raven’s code does no such thing.

It is important to understand that there is no such thing as referential integrity in Raven (or document databases in general). This can be a plus or a minus, but in this case, we are turning that into a plus, because we can delete an album without losing orders.  I don’t know about you, but I like the idea of keeping the orders around. :-)

A bit more formally, documents in Raven are independent: they aren’t affected by changes to other documents.

There are two more methods to discuss with regards to the StoreManagerController, but I’ll discuss them in my next post.

time to read 4 min | 715 words

I just ran into an extremely strange bug with the System.Transactions API. It appears that under certain circumstances, you can exit the transaction scope before it has finished committing. Here is the code to reproduce this:

using System.Threading;
using System.Transactions;

public class EnlistmentTracking : IEnlistmentNotification
{
    public static int EnlistmentCounts;

    public EnlistmentTracking()
    {
        Interlocked.Increment(ref EnlistmentCounts);
    }

    public void Prepare(PreparingEnlistment preparingEnlistment)
    {
        preparingEnlistment.Prepared();
    }

    public void Commit(Enlistment enlistment)
    {
        Interlocked.Decrement(ref EnlistmentCounts);
        enlistment.Done();
    }

    public void Rollback(Enlistment enlistment)
    {
        Interlocked.Decrement(ref EnlistmentCounts);
        enlistment.Done();
    }

    public void InDoubt(Enlistment enlistment)
    {
        Interlocked.Decrement(ref EnlistmentCounts);
        enlistment.Done();
    }
}    

This class simply tracks the number of instances that it has. It does no blocking and operates entirely in memory.

Here is the code to show the problem:

var newGuid = Guid.NewGuid();
for (int i = 0; i < 100; i++)
{
    using(var tx = new TransactionScope())
    {
        Transaction.Current.EnlistDurable(newGuid, new EnlistmentTracking(), EnlistmentOptions.None);
        Transaction.Current.EnlistDurable(newGuid, new EnlistmentTracking(), EnlistmentOptions.None);

        tx.Complete();
    }

    Console.WriteLine(Thread.VolatileRead(ref EnlistmentTracking.EnlistmentCounts));
}

This just runs in a loop, creating two instances of the enlistment (forcing it to be a distributed transaction), and commits the transaction. After the transaction is completed, we read how many enlistments are still alive. Surprisingly, I keep getting non-zero values here.

The really freaky part is that if I put a small wait there, I get a zero value back, which is what I would expect. This is on .NET 4.0, by the way.

Let us look at the documentation for Dispose:

This method is synchronous and blocks until the transaction has been committed or aborted.

Hmm… that is not what I am seeing here.

Any idea what is going on?

From what I see here, I would say that it is only waiting until Prepare is called, not until Commit / Rollback is called. The way I implemented things, Prepare does all the actual work, but it is the Commit that switches things around so those changes are visible. The result of this behavior is that until Commit has been called, the transaction has not really been committed.

It appears that what I am supposed to do is:

  • On prepare, commit the transaction, but keep around the data required to roll it back.
  • On commit, clean up the data that was kept around for the rollback.
  • On rollback, use the kept data to roll back the transaction.
  • On doubt, dance a merry jig and then throw yourself off the bridge.

But that is based on the behavior of the code, not on what I am seeing in the docs, and it seems wrong.
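For what it is worth, here is a minimal sketch of what an enlistment following that list might look like. InMemoryStore and PrepareCommitsEnlistment are hypothetical names over an in-memory dictionary, not anything from Raven; only IEnlistmentNotification, Prepared() and Done() come from System.Transactions. The in-doubt handling is a placeholder, since I don’t have a good answer for that case.

```csharp
using System.Collections.Generic;
using System.Transactions;

// Hypothetical in-memory store, used only to illustrate the sequence.
public class InMemoryStore
{
    public readonly Dictionary<string, string> Data = new Dictionary<string, string>();
}

public class PrepareCommitsEnlistment : IEnlistmentNotification
{
    private readonly InMemoryStore store;
    private readonly string key;
    private readonly string newValue;

    // Undo information, kept from Prepare until Commit or Rollback.
    private bool applied;
    private bool hadOldValue;
    private string oldValue;

    public PrepareCommitsEnlistment(InMemoryStore store, string key, string newValue)
    {
        this.store = store;
        this.key = key;
        this.newValue = newValue;
    }

    public void Prepare(PreparingEnlistment preparingEnlistment)
    {
        lock (store.Data)
        {
            // Make the change visible now, remembering how to undo it.
            hadOldValue = store.Data.TryGetValue(key, out oldValue);
            store.Data[key] = newValue;
            applied = true;
        }
        preparingEnlistment.Prepared();
    }

    public void Commit(Enlistment enlistment)
    {
        // The change is already visible; just drop the undo information.
        applied = false;
        oldValue = null;
        enlistment.Done();
    }

    public void Rollback(Enlistment enlistment)
    {
        lock (store.Data)
        {
            // Rollback can arrive without a prior Prepare, hence the flag.
            if (applied)
            {
                if (hadOldValue) store.Data[key] = oldValue;
                else store.Data.Remove(key);
                applied = false;
            }
        }
        enlistment.Done();
    }

    public void InDoubt(Enlistment enlistment)
    {
        // Outcome unknown; this sketch simply leaves the data as it is.
        enlistment.Done();
    }
}
```

With this shape, anything that reads the store after Prepare sees the new value, regardless of when Commit gets called.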

time to read 2 min | 272 words

The checkout process in the MVC Music Store is composed of two parts, adding address & payment options and completing the order.

The old code for address & payment is on the left, the new on the right.

image image

As you can see, they are quite similar. Raven’s code isn’t complete yet, though.

If you recall, we stated that we are going to store the CountSold property inside the Album document, to allow us to easily sort by that count. We now need to write that logic. I put it directly after the call to CreateOrder:

image

It is important to note that we are loading all the album documents in a single query. And when we save, Raven is going to make a single (batched) call to the server.

And now, merely for completion’s sake (pun intended), let us look at the Complete method:

image image

I think by now you can tell what is going on in each system. The next post will cover the administration section.

time to read 3 min | 600 words

The ShoppingCartController is heavily affected by the changes we made to the ShoppingCart. Let us look at those changes, starting from Index():

image image

On the left, we have the original version. You can see that it executes two different queries to process this order. The Raven version, however, executes only a single query, in the FindShoppingCart method:

image

This just implements the logic of loading the cart from Raven or creating a new cart (with the specified shopping cart id). Note that we don’t save the new cart to the database here, we merely associate the new cart with the session. There is no need to save, since it contains no meaningful data at this point. When we call SaveChanges(), the new cart will be sent to Raven for storage.

Let us look at the AddToCart action now:

image  image

On the left, you can see the old version, and on the right, you see the Raven version. They are pretty similar, except that in the Raven case, the shopping cart’s AddToCart is concerned solely with adding a new item to the cart or incrementing the quantity of an existing item. There is absolutely no data access in Raven’s version of ShoppingCart.AddToCart.

One major difference is that the Raven approach is calling session.SaveChanges() in the action code. The reason for that is simple: it is the proper place for this. The calling code is responsible for the environment, including saving when needed.

image image

Raven’s code is pretty easy to follow here, I think. There is just one thing that you should note: the last line has a pretty strange id.Split call. Why do we do that?

The reason is that Raven uses ids that look like this: “albums/616”, and the DeleteId is used by the calling JavaScript code to find an element by its id. An element id can’t contain a ‘/’, so we strip it away and only send the number part of the id. This is safe to do since we only deal with albums here.
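As a small illustration of the idea, and not necessarily the exact expression used in the action, here is a sketch of stripping the collection prefix from a document id. ElementIds and ToElementId are hypothetical names.

```csharp
public static class ElementIds
{
    // "albums/616" -> "616": keep only the part after the last '/'.
    public static string ToElementId(string documentId)
    {
        var parts = documentId.Split('/');
        return parts[parts.Length - 1];
    }
}
```

Note that this throws away the collection name, which is why it only works as long as a single kind of document is shown on the page.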

image image

Again, this is about as simple as you can make it, so I’ll note only that Raven’s approach can benefit from the unit of work cache, and the old code approach can’t.

In my next post, we will deal with the order process.

time to read 3 min | 419 words

I like reading. Specifically, science fiction and fantasy have always had a huge pull on my imagination. That led me to a great series of books by Ilona Andrews, the latest of which has just come out.

I live in Israel, and shipping time & cost from the US is quite prohibitive. At times, I have paid four or five times the cost of the book just to be able to get the bloody thing. So the rise of the Kindle filled me with a great sense of relief. I got a Kindle and started reading on that (I love it). I estimate that I read over 200 books on the thing already.

One thing that I really like is the ability to pre order a book, get it on my Kindle when it is out, and start reading immediately. But recently I have gotten several notifications about books that I really wanted to read. Here is one:

We're writing to let you know that we've canceled your order for Magic Bleeds because it will not be released by the publisher in Kindle format on Tuesday, May 25, 2010 as previously expected. We don't yet have a date for when this item will be released for Kindle. We will send you an email notifying you when the Kindle edition becomes available.

Okay, annoying. Let us hit audible.com and see if they have the audio book version available. It appears that this is not the case… but I can buy the CD version, which requires physical shipping, from Amazon.

Where does it leave me?

There seems to be no way for me to get the book in a reasonable timeframe/cost.

Wait, let me rephrase that. There appears to be no legal way. While I have no direct knowledge of that, I am guessing that if I hit a torrent site and try to search for the book, I would not only find the book, but will be able to get the freaking thing faster than going with the legal download route.

It is actually quite simple. I would really like to give you some money. If you make it harder for me to give you money, you won’t get my money.

This decision is stupid, moronic, idiotic, senseless, irritating, annoying and in general lacks all sense.

~Ayende the annoyed

time to read 6 min | 1121 words

image

The ShoppingCart class in the MVC Music Store is my current punching bag. I really don’t like it.

You can see how it looks on the image on the right. The problem is that this code mixes two different responsibilities:

  • Operations about a shopping cart:
    • GetCart
    • GetCartId
    • GetCartItems
    • GetCount
  • Operations of a shopping cart:
    • AddToCart
    • CreateOrder
    • EmptyCart
    • MigrateCart
    • RemoveFromCart

You might have noticed that all the operations about a shopping cart are get operations. All the operations of the cart are things that belong to the cart; they are the cart’s business logic and reason for being. The Get operations don’t belong to the cart, they belong in some other object that manages instances of carts.

In most applications, we would call this object a Repository. I am not sure that we really need one here. Looking at the Get methods, most of them are here because of the decision to only store cart line items, which requires us to issue explicit queries to get the data.

With Raven, we would follow a different model, which means that the only thing we are likely to need is GetCart() and maybe GetCartId().

Here is what a typical cart looks like as a document:

image

And as an entity in our application:

 image

The GetTotal method was replaced with a Total property. Unlike the GetTotal method, which issued a query to the database, this property operates solely in memory. Another major difference between Raven and an OR/M solution is that Raven doesn’t do lazy loading. This is by design, since document database data models rarely need to traverse data outside their own document. Traversing a document loaded from Raven cannot force lazy loading or result in the dreaded SELECT N+1 issues.
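To make the in-memory point concrete, here is a minimal sketch of an entity with such a Total property. The shape of ShoppingCartLine and the property names are assumptions for illustration; the actual entity is the one shown in the image above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical line item; the whole cart, lines included, is one document.
public class ShoppingCartLine
{
    public string AlbumId { get; set; }
    public decimal Price { get; set; }
    public int Quantity { get; set; }
}

public class ShoppingCart
{
    public string Id { get; set; }
    public List<ShoppingCartLine> Lines { get; set; } = new List<ShoppingCartLine>();

    // Computed from the lines already loaded with the document, no query involved.
    public decimal Total
    {
        get { return Lines.Sum(line => line.Price * line.Quantity); }
    }
}
```

Since the lines live inside the cart document, reading Total never needs to go back to the database.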

And now let us deal with the operations about a cart. The most important ones are GetCartId and GetCart. I think that those methods have no place there. I created a new class, ShoppingCartFinder, which looks like this:

image

Note that we don’t expose GetCartId anymore; this is an internal detail that shouldn’t be seen by clients of this class. We do need to support setting the cart id, because we also support cart migrations (when an anonymous user logs in). We don’t need any of the other methods, so I removed them.

Let us go over the operations of the cart.

image image

The method on the left is the original code, and on the right you can see Raven’s code. The Raven code operates completely in memory, in total persistence ignorance. The old code deals explicitly with persistence. This isn’t that much of a problem, except that this is the wrong level at which to deal with persistence issues.

image image

 

Going over RemoveFromCart, you can see that it shrunk significantly in size, and again, it is an in-memory operation only. EmptyCart isn’t implemented in the Raven version, since it is just a Lines.Clear().

It is interesting to note that EmptyCart in the old implementation would result in N queries, where N is the number of items in the cart, while with Raven, this will result in 1 query.

image image

I don’t think that there is much to say here, except that the old code would execute N*2 queries, while Raven’s code will still execute 1 query :-)

MigrateCart is interesting, because the implementation is drastically different:

image image

With the old code, we update all the items in the cart, one at a time. With Raven, we do something drastically different. The Shopping Cart Id is the document key, so given the shopping cart id (which is the user name or stored in the session), we can load the shopping cart using a Load (by primary key, to equate to the relational mindset). Migrating a cart is a simple enough operation; all you have to do is change the key. Since Raven doesn’t allow renames, we do it with a Delete/Store, which are executed inside a single transaction.

The calling code for MigrateCart looks like this:

image 

Note that SaveChanges is atomic and transactional, so this has the same effect as issuing a rename.

And that is it for the shopping cart. In my next post, I’ll discuss the ShoppingCartController, which uses this class.

time to read 5 min | 847 words

This post is copied (with permission) from Roy Osherove. I don’t often do things like that but Roy’s post has pushed a lot of red buttons.

Let me just hand you over to Roy:

A month or so ago, Microsoft Israel started sending out emails to its partners and registered event users to “Save the date!” – Microsoft TechEd Israel is coming, and it’s going to be this November!

“Great news” I thought to myself. I’d been to a couple of the MS teched events, as a speaker and as an attendee, and it was lovely and professionally done. Israel is an amazing place for technology and development and TechEd hosted some big names in the world of MS software.

A couple of weeks ago, I was shocked to hear from a couple of people that Microsoft Israel plans to only accept non-MS teched speakers, only from sponsors of the event. That means that according to the amount that you have paid, you get to insert one or more of your own selected speakers as part of teched.

I’ve spent the past couple of weeks trying to gather more evidence of this, and have gotten some input from within MS about this information. It looks like that is indeed the case, though no MS rep. was prepared to answer any email I had publicly. If they approach me now I’d be happy to print their response.

What does this mean?

If this is true, it means that Microsoft Israel is making a grave mistake –

  • They are diluting the quality of the speakers for pure money factors. That means that as a TechEd attendee who paid good money, you might be sitting down to watch nothing more than a bunch of infomercials, or sub-standard speakers, since speakers are no longer selected on quality or interest in their topic.
  • They are turning the conference from a learning event to a commercial driven event
  • They are closing off the stage to the community of speakers who may not be associated with any organization  willing to be a sponsor
  • They are losing speakers (such as myself) who will not want to be part of such an event. (yes – even if my company ends up sponsoring the event, I will not take part in it, Sorry Eli!)
  • They are saying “F&$K you” to the community of MVPs who should be the people to be approached first about technical talks (my guess is many MVPs wouldn’t want to talk at an event driven that way anyway )

I do hope this ends up not being true, but it looks like it is. MS Israel had already done such a thing with the Developer Days event previously held in Israel – only sponsors were allowed to insert speakers into the event.

If this turns out to be true I would urge the MS community in Israel to NOT TAKE PART AT THIS EVENT in any form (attendee, speaker, sponsor or otherwise). By taking part, you will be telling MS Israel it’s OK to piss all over the community that they are quietly suffocating anyway.

The MVP case

MS Israel has managed to screw the MVP program as well. MS MVPs (I’m one) have had a tough time here in Israel the past couple of years. Ever since Yosi Taguri left the blue badge ranks, there has been no real community leader left. Whoever runs things right now has their eyes and minds set elsewhere, with the software MVP community far from mind and heart. No special MVP events (except a couple of small ones this year). No real MVP leadership happens here, and having the MVP MEA lead (Ruari) on a remote line is not really what’s needed.

“MVP? What’s that?” I’m sure many MS Israel employees would say. Exactly my point.

Last word

I’ve been disappointed by the MS machine for a while now, but their slowness to realize what real community means in the past couple of years really turns me off. Maybe it’s time to move on. Maybe I shouldn’t be chasing people at MS Israel begging for a room to host the Agile Israel user group. Maybe it’s time to say a big bye bye and start looking at a life a bit more disconnected from that giant. I hear the people at Google are pretty Agile!

And now back to me. I had more discussions in the last two years with Microsoft UK than with Microsoft Israel. I think it says it all, and I am an Israeli MVP who spends most of his time in Israel.
