Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969

, @ Q c

Posts: 10 | Comments: 37

filter by tags archive

Performance optimizations: Rinse, Repeat, Repeat, Repeat

time to read 7 min | 1306 words

Originally posted at 1/18/2011

We got some reports that there was an O(N) issue with loading large number of documents from Raven.

I wrote the following test code:

var db = new DocumentDatabase(new RavenConfiguration
{
    DataDirectory = "Data"
});
Console.WriteLine("Ready");
Console.ReadLine();
while (true)
{
    var sp = Stopwatch.StartNew();

    db.Query("Raven/DocumentsByEntityName", new IndexQuery
    {
        Query = "Tag:Item",
        PageSize = 1024
    });

    Console.WriteLine(sp.ElapsedMilliseconds);
}

With 1,024 documents in the database, I could clearly see that most requests took in the order of 300 ms. Not content with that speed, I decided to dig deeper, and pulled out my trusty profiler (dotTrace, and no, I am not getting paid for this) and got this:

image

As should be clear, it seems like de-serializing the data from byte[] to a JObject instance is taking a lot of time (relatively speaking).

To be more precise, it takes 0.4 ms, to do two deserialization operations (for the following document):

{
    "cartId": 666,
    "otherStuff": "moohahaha",
    "itemList": [{
        "productId": 42,
        "name": "brimstone",
        "price": 6.66,
        "quantity": 2000
    },
    {
        "productId": 42,
        "name": "brimstone",
        "price": 6.66,
        "quantity": 2000
    },
    {
        "productId": 42,
        "name": "brimstone",
        "price": 6.66,
        "quantity": 2000
    },
    {
        "productId": 42,
        "name": "brimstone",
        "price": 6.66,
        "quantity": 2000
    },
    {
        "productId": 42,
        "name": "brimstone",
        "price": 6.66,
        "quantity": 2000
    },
    {
        "productId": 42,
        "name": "brimstone",
        "price": 6.66,
        "quantity": 2000
    },
    {
        "productId": 42,
        "name": "brimstone",
        "price": 6.66,
        "quantity": 2000
    },
    {
        "productId": 42,
        "name": "brimstone",
        "price": 6.66,
        "quantity": 2000
    }]
}

I found it quite surprising, to tell you the truth. I wrote the following test case to prove this:

static void Main()
{
    
    var st =
        "IQAAAAJSYXZlbi1FbnRpdHktTmFtZQAFAAAASXRlbQAAvgIAABJjYXJ0SWQAmgIAAAAAAAACb3RoZXJTdHVm"+
        "ZgAKAAAAbW9vaGFoYWhhAARpdGVtTGlzdACFAgAAAzAATQAAABJwcm9kdWN0SWQAKgAAAAAAAAACbmFtZQAKA"+
        "AAAYnJpbXN0b25lAAFwcmljZQCkcD0K16MaQBJxdWFudGl0eQDQBwAAAAAAAAADMQBNAAAAEnByb2R1Y3RJZAA"+
        "qAAAAAAAAAAJuYW1lAAoAAABicmltc3RvbmUAAXByaWNlAKRwPQrXoxpAEnF1YW50aXR5ANAHAAAAAAAAAAMy"+
        "AE0AAAAScHJvZHVjdElkACoAAAAAAAAAAm5hbWUACgAAAGJyaW1zdG9uZQABcHJpY2UApHA9CtejGkAScXVhbn"+
        "RpdHkA0AcAAAAAAAAAAzMATQAAABJwcm9kdWN0SWQAKgAAAAAAAAACbmFtZQAKAAAAYnJpbXN0b25lAAFwcmljZ"+
        "QCkcD0K16MaQBJxdWFudGl0eQDQBwAAAAAAAAADNABNAAAAEnByb2R1Y3RJZAAqAAAAAAAAAAJuYW1lAAoAAABi"+
        "cmltc3RvbmUAAXByaWNlAKRwPQrXoxpAEnF1YW50aXR5ANAHAAAAAAAAAAM1AE0AAAAScHJvZHVjdElkACoAAAAAA"+
        "AAAAm5hbWUACgAAAGJyaW1zdG9uZQABcHJpY2UApHA9CtejGkAScXVhbnRpdHkA0AcAAAAAAAAAAzYATQAAABJwcm9"+
        "kdWN0SWQAKgAAAAAAAAACbmFtZQAKAAAAYnJpbXN0b25lAAFwcmljZQCkcD0K16MaQBJxdWFudGl0eQDQBwAAAAAAAA"+
        "ADNwBNAAAAEnByb2R1Y3RJZAAqAAAAAAAAAAJuYW1lAAoAAABicmltc3RvbmUAAXByaWNlAKRwPQrXoxpAEnF1YW50a"+
        "XR5ANAHAAAAAAAAAAAA";
    var buffer = Convert.FromBase64String(st);

    while (true)
    {
        var sp = Stopwatch.StartNew();
        for (int i = 0; i < 1024; i++)
        {
            DoWork(buffer);
        }
        Console.WriteLine(sp.ElapsedMilliseconds);
    }
}

private static void DoWork(byte[] buffer)
{
    var ms = new MemoryStream(buffer);
    JObject.Load(new BsonReader(ms));
    JObject.Load(new BsonReader(ms));
}

On my machine, this run at around 70ms for each 1,204 run. In other words, it takes significantly less I would have thought. Roughly 0.06 ms.

Note: The first number (0.4 ms) is under the profiler while the second number (0.06ms) is outside the profiler. You can routinely see order of magnitude differences between running inside and outside the profiler!

So far, so good, but we can literally see that this is adding almost a 100 ms to the request processing. That is good, because it is fairly simple to fix.

What I did was introduce a cache inside the serialization pipeline that made the entire cost go away. Indeed, running the same code above showed much better performance, an average of 200 ms.

The next step is to figure out where is the next cost factor… For that, we use the profiler.

image

And… we can see the the cost of de-serializing went drastically down. Now the actual cost is just doing a search for the document by the key.

You might have noticed that those results are for Munin. I have run the same test results with Esent, with remarkable similarity in the overall performance.


Comments

Renaud Martinon

Hi Oren,

Thanks for sharing another interesting story from the trenches. Always warms one's heart to see someone applying actual software craftsmanship instead of slapping together a piece of quick & dirty cruft.

I'm afraid the number of developpers who have the motivation and/or skills and/or time allocated to do that kind of deep analysis without a terrible performance showstopper forcing them to do so (like 45 secondes instead of your 400ms) is depressingly low. Some of that is probably due to most projects not allocating enough resources to that kind of endeavours though.

Anyway, always nice to see how you do it, and hope you can inspire a few more to do it as well.

Comment preview

Comments have been closed on this topic.

FUTURE POSTS

  1. Production postmortem: The case of the memory eater and high load - 6 hours from now
  2. Production postmortem: The case of the lying configuration file - about one day from now
  3. Production postmortem: The industry at large - 2 days from now
  4. The insidious cost of allocations - 3 days from now
  5. Find the bug: The concurrent memory buster - 4 days from now

And 4 more posts are pending...

There are posts all the way to Sep 10, 2015

RECENT SERIES

  1. Find the bug (5):
    20 Apr 2011 - Why do I get a Null Reference Exception?
  2. Production postmortem (10):
    14 Aug 2015 - The case of the man in the middle
  3. What is new in RavenDB 3.5 (7):
    12 Aug 2015 - Monitoring support
  4. Career planning (6):
    24 Jul 2015 - The immortal choices aren't
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats