Ayende @ Rahien

Refunds available at head office

Answer: Stopping the leaks

Originally posted at 4/19/2011

Yesterday I posted the following challenge:

Given the following API, can you think of a way that would prevent memory leaks?

public interface IBufferPool
{
    byte[] TakeBuffer(int size);
    void ReturnBuffer(byte[] buffer);
}

The problem with an API like this is that forgetting to return the buffer is going to cause a memory leak. Instead, I would like to have the application stop if a buffer is leaked. Leaked means that no one is referencing this buffer, but it wasn’t returned to the pool.

What I would really like is that when running in debug mode, leaking a buffer would stop the entire application and tell me:

  • That a buffer was leaked.
  • What was the stack trace that allocated that buffer.

Let us take a look at how we are going about implementing this, shall we? I am going to defer the actual implementation of the buffer pool to System.ServiceModel.Channels.BufferManager and focus on providing the anti-leak features. The result is that this code:

IBufferPool pool = new BufferPool(1024*512, 1024);

var buffer = pool.TakeBuffer(512);
GC.Collect(); // without forcing a collection, there is nothing for the finalizer thread to do
GC.WaitForPendingFinalizers(); // nothing here, the buffer is still referenced

pool.ReturnBuffer(buffer);
buffer = null;
GC.Collect();
GC.WaitForPendingFinalizers(); // nothing here, we released the memory properly

pool.TakeBuffer(512); // take and discard a buffer without returning it to the pool
GC.Collect();
GC.WaitForPendingFinalizers(); // failure!

Will result in the following error:

Unhandled Exception: System.InvalidOperationException: A buffer was leaked. Initial allocation:
   at ConsoleApplication1.BufferPool.BufferTracker.TrackAllocation() in IBufferPool.cs:line 22
   at ConsoleApplication1.BufferPool.TakeBuffer(Int32 size) in IBufferPool.cs:line 60
   at ConsoleApplication1.Program.Main(String[] args) in Program.cs:line 21

And now for the implementation:

public class BufferPool : IBufferPool, IDisposable
{
    public class BufferTracker
    {
        private StackTrace stackTrace;

        public void TrackAllocation()
        {
            stackTrace = new StackTrace(true);
            // trackers are reused when a buffer goes back to the pool and is
            // handed out again, so we re-arm the finalizer that Discard() suppressed
            GC.ReRegisterForFinalize(this);
        }

        public void Discard()
        {
            stackTrace = null;
            GC.SuppressFinalize(this);
        }

        ~BufferTracker()
        {
            if (stackTrace == null)
                return;

            // throwing from a finalizer takes down the entire process, which is
            // exactly the point: a leaked buffer should be impossible to miss
            throw new InvalidOperationException(
                "A buffer was leaked. Initial allocation:" + Environment.NewLine + stackTrace
                );
        }
    }

    private readonly BufferManager bufferManager;
    private readonly ConditionalWeakTable<byte[], BufferTracker> trackLeakedBuffers = new ConditionalWeakTable<byte[], BufferTracker>();

    public BufferPool(long maxBufferPoolSize, int maxBufferSize)
    {
        bufferManager = BufferManager.CreateBufferManager(maxBufferPoolSize, maxBufferSize);
    }

    public void Dispose()
    {
        bufferManager.Clear();
        // note that disposing the pool before returning all of the buffers will cause a crash
    }

    public byte[] TakeBuffer(int size)
    {
        var buffer = bufferManager.TakeBuffer(size);
        trackLeakedBuffers.GetOrCreateValue(buffer).TrackAllocation();
        return buffer;
    }

    public void ReturnBuffer(byte[] buffer)
    {
        BufferTracker value;
        if(trackLeakedBuffers.TryGetValue(buffer, out value))
        {
            value.Discard();
        }
        bufferManager.ReturnBuffer(buffer);
    }
}

As you can see, utilizing ConditionalWeakTable is quite powerful, since it allows us to support a lot of really advanced scenarios in a fairly simple way.
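If you haven’t run into ConditionalWeakTable before, the key property is that it associates extra data with an object without keeping that object alive; once the key is collected, the entry and its value simply vanish. Here is a minimal sketch of that behavior (my own illustration, not from the original post):

class Metadata
{
    public string Origin;
}

var table = new ConditionalWeakTable<object, Metadata>();

var key = new object();
table.Add(key, new Metadata { Origin = "allocated in Main" });

Metadata value;
Console.WriteLine(table.TryGetValue(key, out value)); // True

// the table holds the key weakly, so once 'key' is unreachable the entry
// disappears on its own, which is why tracking buffers this way does not
// itself leak memory
key = null;
GC.Collect();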

Challenge: Stop the leaks

Given the following API, can you think of a way that would prevent memory leaks?

public interface IBufferPool
{
    byte[] TakeBuffer(int size);
    void ReturnBuffer(byte[] buffer);
}

The problem with having something like this is that forgetting to return the buffer is going to cause a memory leak. Instead of having that I would like to have the application stop if a buffer is leaked. Leaked means that no one is referencing this buffer but it wasn’t returned to the pool.

What I would really like is that when running in debug mode, leaking a buffer would stop the entire application and tell me:

  • That a buffer was leaked.
  • What was the stack trace that allocated that buffer.

A solution will be posted tomorrow.

An elegant ThreadLocal for Silverlight

One of the annoying bits about Silverlight is how much of what I consider core API isn’t there. For example, concurrent collections or thread local.

I didn’t have the time to implement concurrent collections properly, but thread local isn’t really that hard. Here is a simple implementation:

public class ThreadLocal<T>
{
    private readonly Func<T> valueCreator;

    public class Holder
    {
        public T Val;
    }

    [ThreadStatic]
    private static ConditionalWeakTable<object, Holder> _state;

    public ThreadLocal() : this(() => default(T))
    {
    }

    public ThreadLocal(Func<T> valueCreator)
    {
        this.valueCreator = valueCreator;
    }

    public T Value
    {
        get
        {
            Holder value;
            if (_state == null || _state.TryGetValue(this, out value) == false)
            {
                var val = valueCreator();
                Value = val;
                return val;
            }
            return value.Val;
        }
        set
        {
            if (_state == null)
                _state = new ConditionalWeakTable<object, Holder>();
            var holder = _state.GetOrCreateValue(this);
            holder.Val = value;
        }
    }
}

The fun part: it satisfies the entire contract, but I am willing to bet it is going to cause some head scratching.
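Here is a quick usage sketch (mine, not from the post) showing the per-thread behavior:

var counter = new ThreadLocal<int>(() => 0);

for (int i = 0; i < 3; i++)
{
    new Thread(() =>
    {
        // each thread starts from its own freshly created value
        counter.Value = counter.Value + 1;
        Console.WriteLine("Thread {0}: {1}",
            Thread.CurrentThread.ManagedThreadId,
            counter.Value); // always prints 1
    }).Start();
}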

Raven Situational Awareness

There is a whole set of features that require collaboration from a set of servers. For example, when talking about auto scale scenarios, you really want the servers to figure things out on their own, without needing administrators to hold their hands and murmur sweet nothings at 3 AM.

We needed this feature in RavenDB, Raven MQ and probably in Raven FS, so I sat down and thought about what is actually needed and whether I could package it in a re-usable form. I am on a roll for the last few days, and something that I estimated would take a week or two took me about six hours, all told.

At any rate, I realized that the important parts of this feature set are the ability to detect siblings on the same network, the ability to detect failure of those siblings, and the ability to dynamically select the master node. The code is available here: https://github.com/hibernating-rhinos/Raven.SituationalAwareness under the AGPL license. If you want to use this code commercially, please contact me for commercial licensing arrangements.

Let us see what is actually involved here:

var presence = new Presence("clusters/commerce", new Dictionary<string, string>
{
    {"RavenDB-Endpoint", new UriBuilder("http", Environment.MachineName, 8080).Uri.ToString()}
}, TimeSpan.FromSeconds(3));
presence.TopologyChanged += (sender, nodeMetadata) =>
{
    switch (nodeMetadata.ChangeType)
    {
        case TopologyChangeType.MasterSelected:
            Console.WriteLine("Master selected {0}", nodeMetadata.Uri);
            break;
        case TopologyChangeType.Discovered:
            Console.WriteLine("Found {0}", nodeMetadata.Uri);
            break;
        case TopologyChangeType.Gone:
            Console.WriteLine("Oh no, {0} is gone!", nodeMetadata.Uri);
            break;
        default:
            throw new ArgumentOutOfRangeException();
    }
};
presence.Start();

As you can see, we are talking about a single class that is exposed to your code. You need to provide the cluster name, which allows us to run multiple clusters on the same network without conflicts. (For example, in the code above, we have a set of servers for the commerce service, and another for the billing service, etc.) Each node also exposes metadata to the entire cluster. In the code above, we share our RavenDB endpoint. The TimeSpan parameter determines the heartbeat frequency for the cluster (how often it would check for failing nodes).

We have a single event that we can subscribe to, which lets us know about changes in the system topology. Discovered and Gone are pretty self-explanatory, I think. But MasterSelected is more interesting.

After automatically discovering all the siblings on the network, Raven Situational Awareness will use the Paxos algorithm to decide who should be the master. The MasterSelected event fires when a quorum of the nodes selects a master. You can then proceed with your own logic based on that. If the master fails, the nodes will convene again and the quorum will select a new master.

With the network topology detection and the master selection out of the way (and all of that with due consideration for failure conditions), the task of actually implementing a distributed server system just became significantly easier.

Why is this not thread safe?

Originally posted at 4/15/2011

Take a look at the following piece of code:

if(items.ContainsKey(key) == false)
{
    lock(items)
    {
        if(items.ContainsKey(key) == false)
        {
            items.Add(key, val);
        }
    }
}

The answer is quite simple, and it is located directly in the docs:

[image: screenshot of the Dictionary documentation, noting that instance members are not guaranteed to be thread safe]

Yes, on the face of it, this is safe code, in the sense that we will never get a DuplicateKeyException. But the implementation of ContainsKey() isn’t safe to run while Add() is also executing.

The actual behavior depends on the implementation, but it is easy enough to imagine a scenario where invariants that ContainsKey() relies on are broken for the duration of the Add() call.
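For completeness, here is a sketch of two safe alternatives; neither appeared in the original post:

// option 1: take the lock for the read as well, so that ContainsKey()
// can never observe the dictionary in the middle of a mutation
lock (items)
{
    if (items.ContainsKey(key) == false)
        items.Add(key, val);
}

// option 2 (.NET 4 and up): use a collection designed for concurrent
// access, and skip the locking entirely
// var items = new ConcurrentDictionary<string, string>();
// items.TryAdd(key, val);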

More on RavenDB structure

Originally posted at 4/14/2011

The following diagrams were generated via Visual Studio (Architecture > Generate Dependency Graph).

I have shown before the assembly dependency graph for RavenDB, it has changed a bit since then (note the Raven.Json addition), but it is still quite a nice graph:

[image: RavenDB assembly dependency graph]

The problem was that, for backward compatibility reasons, the namespaces weren’t nearly as ordered. Mostly because we moved things around in the assemblies but couldn’t change the associated namespaces.

We were going to experience a breaking change anyway, because of Raven.Json, so I took the time to clean things up properly:

[image: namespace dependency graph after the cleanup]

Yes, I admit, I am addicted to (mostly) straight lines and simple directed graphs :-)

RavenDB: Let us write our own JSON Parser, NOT

One accusation that has been leveled at me often is that I keep writing my own implementation of Xyz (where Xyz is just about anything). The main problem is that I can go overboard with that, but for the most part, I think that I have managed to strike the right balance. Wherever possible, I re-use existing code, but when I run into problems that are easier to solve by creating my own solution, I go with that.

A case in point is the JSON parser inside RavenDB. From the get go, I used Newtonsoft.Json.dll. There wasn’t much to think about; this is the default implementation from my point of view. And indeed, it has been an extremely fine choice. It is a rich library, it is available for .NET 3.5, 4.0 & Silverlight, and it opened up a lot of extensibility for RavenDB users.

Overall, I am very happy. Except… there was just one problem: with large JSON documents, the library showed some performance issues. In particular, a 3 MB JSON file took almost half a second to parse. That was… annoying. Admittedly, most documents tend to be smaller than that, but it also reflected on overall performance when batching, querying, etc. When you are querying, you are also building large JSON documents (a single document that contains a list of results, for example), so the problem was quite pervasive for us.

I set out to profile things, and discovered that the actual cost wasn’t in the JSON parsing itself, that part was quite efficient. The costly part was actually in building the JSON DOM (JObject, JArray, etc). When people usually think about JSON serialization performance, they generally think about the perf from and to .NET objects. The overriding cost in that sort of serialization is actually how fast you can call the setters on the objects. Indeed, when looking at perf metrics on the subject, most of the comparisons were concentrated on that aspect almost exclusively.

That makes sense, since for the most part, that is how people use it. But in RavenDB, we are using the JSON DOM for pretty much everything. This is how we represent a document, after all, and that idea is pretty central to a document database.

Before setting out to write our own, I looked at other options.

ServiceStack.Json - that one was problematic for three reasons:

  • It wasn’t really nearly as rich in terms of API and functionality.
  • It was focused purely on reading to and from .NET objects, with no JSON DOM supported.
  • The only input format it had was a string.

The last one deserves a bit of explanation. We cannot afford to use a JSON implementation that accepts a string as input, because that JSON object we are reading may be arbitrarily large. Using a string means that we have to allocate all of that information up front. Using a stream, however, means that we can allocate far less information and reduce our overall memory consumption.
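To make that concrete, here is a sketch of the difference using Newtonsoft.Json (JsonTextReader accepts any TextReader, so it can consume a stream incrementally; the file name is made up):

// string input: the entire 3 MB document must be in memory before
// parsing even starts
JObject fromString = JObject.Parse(File.ReadAllText("large-doc.json"));

// stream input: the parser pulls characters through a small buffer
using (var stream = File.OpenRead("large-doc.json"))
using (var reader = new JsonTextReader(new StreamReader(stream)))
{
    JObject fromStream = JObject.Load(reader);
}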

System.Json – that one had one major problem:

  • Only available on Silverlight, FAIL!

I literally didn’t even bother to try anything else with it. Other libraries we looked at had similar issues; mostly, the problem was that no JSON DOM was available.

That sucked. I was seriously not looking forward to writing my own JSON parser, especially since I would have to add all the bells & whistles of Newtonsoft.Json. :-(

Wait, I can hear you say, the project is open source, why not just fix the performance problem? Well, we have looked into that as well.

The actual problem is pretty much at the core of how the JSON DOM is implemented in the library. All of the JSON DOM types are basically linked lists, so operations on the DOM are O(N). With large documents, that really starts to hurt. We looked into what it would take to modify that, but it turned out that it would have to be either a breaking change (which pretty much killed the notion that it would be accepted by the project) or a very expensive change. That is especially true since the JSON DOM is quite rich in functionality (from dynamic support to INotifyPropertyChanged to serialization to… well, you get the point).

Then I thought about something else: can we create our own JSON DOM, but rely on Newtonsoft.Json to fill it up for us? As it turned out, we could! So we took the existing JSON DOM and stripped out everything that we weren’t using. Then we changed the linked list support to a List and a Dictionary, wrote a few adapters (RavenJTokenReader, etc.) and we were off to the races. We were able to utilize quite a large fraction of the things that Newtonsoft.Json already did, we resolved the performance problem, and we didn’t have to implement nearly as much as I feared we would.

Phew!

Now, let us look at the actual performance results. This is using a 3 MB JSON file:

  • Newtonsoft Json.NET - Reading took 413 ms
  • Using Raven.Json - Reading took 140 ms

That is quite an improvement, even if I say so myself :-)

The next stage was actually quite interesting, because it was unique to how we are using the JSON DOM in RavenDB. In order to save the parsing cost (which, even when optimized, is still significant), we cache the parsed DOM in memory. The problem with caching mutable information is that you have to return a clone of the information, not the information itself (because otherwise it would be mutated by the caller, corrupting the cached copy).

Newtonsoft.Json supports object cloning, which is excellent. Except for one problem. Cloning is also an O(N) operation. With Raven.Json, the cost is somewhat lower. But the main problem is that we still need to copy the entire large object.

In order to resolve this exact issue, we introduced a feature called snapshots to the mix. Any object can be turned into a snapshot. A snapshot is basically a read only version of the object, which we then wrap around another object which provide local mutability while preserving the immutable state of the parent object.

It is much easier to explain in code, actually:

public void Add(string key, RavenJToken value)
{
    if (isSnapshot)
        throw new InvalidOperationException("Cannot modify a snapshot, this is probably a bug");

    if (ContainsKey(key))
        throw new ArgumentException("An item with the same key has already been added: " + key);

    LocalChanges[key] = value; // we can't use Add, because LocalChanges may contain a DeletedMarker
}

public bool TryGetValue(string key, out RavenJToken value)
{
    value = null;
    RavenJToken unsafeVal;
    if (LocalChanges != null && LocalChanges.TryGetValue(key, out unsafeVal))
    {
        if (unsafeVal == DeletedMarker)
            return false;

        value = unsafeVal;
        return true;
    }

    if (parentSnapshot == null || !parentSnapshot.TryGetValue(key, out unsafeVal) || unsafeVal == DeletedMarker)
        return false;

    value = unsafeVal;

    return true;
}

If the value is in the local changes, we use that; otherwise, if the value is in the parent snapshot, we use that. We have the notion of local deletes, but that is about it. All changes happen to the LocalChanges.

What this means, in turn, is that for caching scenarios, we can very easily and effectively create a cheap copy of the item without having to copy all of the items. Whereas cloning the 3 MB JSON object in Newtonsoft.Json can take over 100 ms, with snapshots we pay that cost only once (creating a snapshot involves a clone, so the first time is actually expensive, around the same cost as in Newtonsoft.Json), and from the moment we have a snapshot, we can generate children for the snapshot at virtually no cost.
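To make the idea concrete, here is a rough sketch of how a consumer would use it; the method names here are invented for illustration, not the actual Raven.Json API:

// the cached copy: cloned and frozen exactly once
RavenJObject cached = GetParsedDocumentFor(docKey); // hypothetical helper
cached.EnsureSnapshot(); // hypothetical: marks the object as read only

// each caller gets a cheap, locally mutable child; its writes go to
// LocalChanges, while reads fall through to the frozen parent
RavenJObject copy = cached.CreateSnapshot(); // hypothetical name
copy["status"] = new RavenJValue("processed"); // 'cached' is untouched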

Overall, I am quite satisfied with it.

Oh, and in our tests runs, for large documents, we got 100% performance improvement from this single change.

I would like to be busy too, but…

Originally posted at 4/13/2011

Yes, I know, you popped this out in the middle of a debug session, after VS didn’t respond to any user input for several minutes.

[image: Visual Studio “busy” notification dialog]

Urgh!

[image: error dialog reporting a lost network connection to AYENDEPC]

AYENDEPC is the local machine! You can’t lose a network connection to yourself.

Oh well, we already know that Visual Studio is schizophrenic.

When race conditions & unbounded result sets are actually the solution – the resolution

Originally posted at 4/13/2011

We discussed the problem in detail in my last post. As a reminder:

The requirement

Starting at a given time, charge as many accounts as fast as possible. Charging each account is a separate action, we don’t have the ability to do batching.

[image: diagram of the charging requirement]

The first question to ask, however, isn’t how we can get the data out of the database as fast as possible. The first question to ask is actually:

What are the limiting factors?

In this case, we have to make a separate request for each account. That means that we have to access a remote service, and we have to figure out whether we can actually parallelize the work in any way.

Most production systems would limit the number of concurrent calls that a single customer can make (to prevent a customer with a lot of servers from crashing their own systems).

Let us say that we have 100,000 accounts to charge. And let us further say that the total time for a charge operation is on the order of 1 second. What this means is that whatever we do, we are going to be limited by the processing capabilities of the accounts provider, not by our own code.

A silly way of doing this would be something like:

var subscriptions = session.Query<Subscription>().Where(s=>s.NextCharge < DateTime.Now);
foreach(var sub in subscriptions)
{
   sub.ChargeAccount(); // 1 sec
}

This would be silly for several reasons:

  • It would take over a day to complete (100,000 requests at roughly 1 second each, executed serially).
  • It would load a lot of data into memory, much of which would only be used hours later.
  • It puts a lot of strain on our own database and network backbone.

A better alternative would be to try to parallelize this. Assuming that we can perform 5 concurrent requests, we can drop the processing time to just below six hours (100,000 seconds of work divided across 5 workers is about 20,000 seconds).

Note where the major savings are: not in how we are actually doing stuff with the code, but rather in how much concurrency we can agree on with the accounts provider. And even if they allow us, say, 25 concurrent requests, we are still much better off processing the data in chunks all the way through rather than trying to load and process it all at once.

What I would actually do in this scenario is just load the ids of the subscriptions that we need to update into a queue, and let a bunch of workers (possibly on separate machines) feed off the queue and handle as many account charges as they can possibly manage.
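Here is a minimal sketch of that shape, assuming an in-process queue for brevity (a real deployment would use a durable queue so workers can live on separate machines; GetSubscriptionIdsToCharge and ChargeAccount are placeholders):

var queue = new BlockingCollection<string>();

// producer: stream just the subscription ids into the queue,
// not the full entities
var producer = Task.Factory.StartNew(() =>
{
    foreach (var id in GetSubscriptionIdsToCharge()) // hypothetical query
        queue.Add(id);
    queue.CompleteAdding();
});

// workers: exactly as many as the accounts provider allows us
// to run concurrently
var workers = new Task[5];
for (int i = 0; i < workers.Length; i++)
{
    workers[i] = Task.Factory.StartNew(() =>
    {
        foreach (var id in queue.GetConsumingEnumerable())
            ChargeAccount(id); // the ~1 second remote call
    });
}

Task.WaitAll(workers);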

When race conditions & unbounded result sets are actually the solution

Originally posted at 4/13/2011

As much as I rail against unbounded result sets, I recently ran into the first real case where “I need all of the data now” was a valid use case.

The issue is charging micro payments from customers in a particular market. In that market, accounts are usually handled using prepaid codes. So you open an account and load some amount of money into it. Once that money runs out, there is no way to charge the account any longer. Users will occasionally re-charge their account. Now, we are talking about a recurring billing scenario, where we need to charge users some amount of money every week.

So far, seems pretty reasonable, I hope. The catch is that it is pretty routine for billing to fail for insufficient funds. The strategy to resolve this is to analyze the user’s behavior and retry at likely times when we hope to catch the account with some money in it.

The problem?

Since failures are common, and since most people behave in a similar fashion, billing cycles tend to focus on the likely times when we hope to find some money there. In other words, let us say that we noticed that on Sunday evening, most people charge their account…

What this means is that we will have to try to charge those accounts in that time period. To make things interesting, there are other parties as well, which probably also managed to figure out the times when most accounts have money in them. At that point, it becomes a race: who will be able to charge the accounts first and get the money.

The requirement

Starting at a given time, charge as many accounts as fast as possible. Charging each account is a separate action, we don’t have the ability to do batching.

[image: diagram of the charging requirement]

How would you solve this problem?

My solution, in the next post…

Find the bug: Why do I get a Null Reference Exception?

Originally posted at 4/12/2011

Can you spot the NRE hiding in the code?

if (parentObject != null)
{
    lock (parentObject)
    {
        if (parentObject != null)
        {
            properties = properties.Clone();
            parentObject[parentKey] = this;
            parentObject = null;
        }
    }
}

What were you doing with last year’s logs?

Originally posted at 4/11/2011

I was trying to take a backup of my blog to see if I can do some fancy stuff there, when I realized that the blog backup was 2 GB (!) in size. Now, I know that I post a lot, but I don’t think that I post that much.

As it turns out, the problem was with unbounded logs that started taking up most of the space in the database:

[image: database space usage, dominated by the log tables]

In general, when building applications that are meant to run over long periods of time without attention, it is better to have some way of getting rid of unimportant information automatically.

When unit testing a server, why not USE the server?

One of the interesting aspects of building RavenDB was that it opened my mind to the ways we can use the very nature of the server to expose additional information about the server’s operations.

One thing that I noticed recently is that if I need to debug a test for RavenDB, I often need to stop the current thread (the Test thread) and then literally go to the server test instance and look at what is actually going on in there.

The fun part is that this is really nice, because I can go and inspect the running test instance, see what is going on, modify things to see how they affect the behavior, etc. The only problem is that this is actually quite complex to set up manually (stop on debugger, freeze the appropriate thread, resume run, inspect server – modify stuff there, stop on debugger, thaw thread, continue run, etc.).

What occurred to me, however, is that I can codify this behavior, and end up with this:

[image: the test helper method, which runs only while a debugger is attached]

This method will only operate while a debugger is attached, but it is going to save me a lot of time. Once I am done, I need to delete the marker document:

[image: deleting the marker document to let the test continue]

This is especially important if you are running in in-memory mode, since the moment the test is over, the database is completely wiped out.
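The screenshots did not survive, but based on the description above, the helper would look roughly like this (a sketch, not the actual RavenDB test code):

public class DebugMarker
{
    public string Id { get; set; }
}

public static void WaitForUserToContinueTheTest(IDocumentStore store)
{
    if (Debugger.IsAttached == false)
        return; // no-op during normal test runs

    // drop a marker document that is visible (and deletable) from the studio
    using (var session = store.OpenSession())
    {
        session.Store(new DebugMarker { Id = "Debug/Done" });
        session.SaveChanges();
    }

    // freeze the test thread, leaving the server free to be inspected
    // and modified, until the marker document is deleted
    while (true)
    {
        using (var session = store.OpenSession())
        {
            if (session.Load<DebugMarker>("Debug/Done") == null)
                break;
        }
        Thread.Sleep(100);
    }
}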

Humane dates in RavenDB

Originally posted at 4/4/2011

One of the advantages of RavenDB is that the document format is highly human readable. Except for one thing. Can you tell me what date is represented by this value:

/Date(1224043200000+0300)/

I thought not :-)

We have recently moved toward this format instead:

2011-04-04T11:28:46.0404749+03:00

Which is both machine and human readable.

This makes figuring out what something is so much easier.
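In .NET, this is just the round-trip representation of DateTimeOffset, so no custom formatting code is needed; a quick sketch:

var now = DateTimeOffset.Now;

// "o" is the round-trip format specifier; it produces values such as
// 2011-04-04T11:28:46.0404749+03:00
string humane = now.ToString("o");

// and it parses back without losing precision or the UTC offset
DateTimeOffset parsed = DateTimeOffset.Parse(
    humane, null, DateTimeStyles.RoundtripKind);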

Code Review: Who Can Help Me

Originally posted at 3/22/2011

This time I am taking on Who Can Help Me, and I found some really interesting things there.

For one thing, we got this little guy:

[image: a service interface from the project]

I was all ready to discover INewsRepository –> NewsRepository : Repository<NewsItem>, etc. What I found, instead, was a real service:

[image: the service implementation]

That is how it is supposed to be, to be frank. There is a need to abstract something, and we write just enough code to make it work. I would argue with the implementation of this, however, because the approach to multi-threading is wrongheaded. We shouldn’t spin out a new thread pool work item just to execute the request when the FluentTwitter API already contains an async version that can do quite well for us. But the overall concept is sound. And I was relieved not to find nested repositories in my first few steps there.

Of course, I then discovered that this nice service has some friends from the left side of the blanket:

[image: data access code related to the service]

I think that I expressed my opinion about code such as this:

[image: a query by Id]

A waste of keystrokes, and highly inefficient to boot. You don’t make queries by Id in NHibernate, you use Get or Load instead.
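For context, the difference looks like this (a sketch; the entity name is hypothetical):

// wasteful: always issues a query, bypassing the session cache
var item = session.Query<NewsItem>()
    .FirstOrDefault(x => x.Id == id);

// right: Get checks the session (and second level) cache first
var item2 = session.Get<NewsItem>(id);

// or Load, which can hand back an uninitialized proxy without hitting
// the database at all, if all you need is the reference
var item3 = session.Load<NewsItem>(id);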

Overloading the infrastructure – After reviewing the code, I was surprised to see so much of it dedicated to caching:

[image: caching infrastructure code]

What surprised me more is that in the entire application there were exactly two locations where a cache was used. In both cases, it led to the same service. Implementing a much simpler solution in that service would have chopped quite a bit of code out of this project.

And then there was this:

[image: a static cache-like helper class]

Luckily, this is dead code, but I was quite amused by it, in a “shake your head in disbelief” fashion.

First, for code that is obviously meant to be used in a multi-threaded fashion, it is not thread safe. Second, it is actually a memory leak waiting to happen, more than anything else. If you call that method, your items will never be freed.

The next is a personal opinion, but I can’t help feeling that this is pretty heavyweight:

[image: a custom configuration section backed by an XSD]

Well, that is true whenever you are looking at an XSD, but in this case, we just need to expose two properties, and I think that it would have been perfectly fine to stick them into the <appSettings/> section. There are actually several places where a similar approach has been tried, but I don’t see any value in this unless you are writing a library that might require special configuration. If you are writing an app, using the default modes is usually more than enough.

Reading the controllers code wasn’t a real surprise. One thing that did bother me is the amount of mapping that is going on there, and how much of it I was simply unable to follow. For example:

[image: controller code invoking a mapper]

Which is then calling:

[image: the mapper implementation]

Which then goes into additional custom implementations and convention based ones.

The reason that this is important is that this is a prime location for Select N+1 issues, and indeed, we have several such occurrences of the problem just in this piece of code.


Performance numbers in the pub

Originally posted at 3/31/2011

I am currently sitting with 3 guys in the pub, and the discussion naturally turned to performance. I asked all three the following question: “How many CLR objects can you create in one second?”

I got the following replies:

  • 2,000 objects per second
  • 50,000 objects per second
  • 100,000 objects per second

Then I sat down to check:

class Program
{
    static void Main(string[] args)
    {
        var sp = Stopwatch.StartNew();

        int i = 0;
        while(sp.ElapsedMilliseconds < 1000)
        {
            new MyClass();
            i++;
        }
        sp.Stop();
        Console.WriteLine("Created {0} in {1}", i, sp.Elapsed);
    }
}

public class MyClass
{
    public string A;
    public int B;
    public DateTime C;
}

The result?

Created 7,715,305 in 00:00:01

The seven deadly sins for the developer (* some restrictions apply)

Originally posted at 3/30/2011
I have seen all of those in production code…
  1. throw new NullReferenceException();

  2. public class AbstractController : Controller
     // or
     // public class AbstractPage : Page
     {
         public static bool IsAdmin { get; set; }
     }

  3. public static class BlogMgr
     {
         public static void AddPostToBlog(Blog blog, Post post)
         {
             blog.Posts.Add(post);
             post.Blog = blog;
         }
     }

* I could only think of three, but I didn’t want to modify the title. Can you complete the list?


Refactoring toward frictionless & odorless code: What about transactions?

Originally posted at 3/30/2011

One thing that we haven’t done so far is manage transactions. I am strongly against automatic transactions that wrap the entire request. However, I do like automatic transactions that wrap a single action. We can implement this as:

public class NHibernateActionFilter : ActionFilterAttribute
{
    private static readonly ISessionFactory sessionFactory = BuildSessionFactory();

    private static ISessionFactory BuildSessionFactory()
    {
        return new Configuration()
            .Configure()
            .BuildSessionFactory();
    }

    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        var sessionController = filterContext.Controller as SessionController;

        if (sessionController == null)
            return;

        sessionController.Session = sessionFactory.OpenSession();
        sessionController.Session.BeginTransaction();
    }

    public override void OnActionExecuted(ActionExecutedContext filterContext)
    {
        var sessionController = filterContext.Controller as SessionController;

        if (sessionController == null)
            return;

        using (var session = sessionController.Session)
        {
            if (session == null)
                return;

            if (!session.Transaction.IsActive) 
                return;

            if (filterContext.Exception != null)
                session.Transaction.Rollback();
            else
                session.Transaction.Commit();
        }
    }
}

This one is pretty simple, except that you should note that we are checking whether the transaction is active before doing anything with it. That is important because we might have gotten here due to an error in opening the transaction; trying to commit or roll back then would raise a second error, which would mask the first.
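To actually wire this up, the filter needs to be registered; one way (a sketch, using MVC3’s global filters, matching the RegisterGlobalFilters call from the baseline post) is:

public static void RegisterGlobalFilters(GlobalFilterCollection filters)
{
    filters.Add(new HandleErrorAttribute());
    // every action on a SessionController now runs inside its own
    // session and transaction
    filters.Add(new NHibernateActionFilter());
}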


Refactoring toward frictionless & odorless code: Getting rid of globals

Originally posted at 3/30/2011

While I don’t really mind having global state, it tends to bite you on the ass eventually, so let us try to deal with this guy, shall we?

public class NHibernateActionFilter : ActionFilterAttribute
{
    private static readonly ISessionFactory sessionFactory = BuildSessionFactory();

    public static ISession CurrentSession
    {
        get { return HttpContext.Current.Items["NHibernateSession"] as ISession; }
        set { HttpContext.Current.Items["NHibernateSession"] = value; }
    }

    private static ISessionFactory BuildSessionFactory()
    {
        return new Configuration()
            .Configure()
            .BuildSessionFactory();
    }

    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        CurrentSession = sessionFactory.OpenSession();
    }

    public override void OnActionExecuted(ActionExecutedContext filterContext)
    {

        var session = CurrentSession;
        if (session != null)
        {
            session.Dispose();
        }
    }
}

One easy way to do so would be to put the session directly where we want it to be, in the controller. We already have an extension point for that: the SessionController. Instead of referencing the global session, it can just hold its own:

public class SessionController : Controller
{
    public HttpSessionStateBase HttpSession
    {
        get { return base.Session; }
    }

    public new ISession Session { get; set; }
}

Which leads us to:

public class NHibernateActionFilter : ActionFilterAttribute
{
    private static readonly ISessionFactory sessionFactory = BuildSessionFactory();

    private static ISessionFactory BuildSessionFactory()
    {
        return new Configuration()
            .Configure()
            .BuildSessionFactory();
    }

    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        var sessionController = filterContext.Controller as SessionController;

        if (sessionController == null)
            return;

        sessionController.Session = sessionFactory.OpenSession();
    }

    public override void OnActionExecuted(ActionExecutedContext filterContext)
    {
        var sessionController = filterContext.Controller as SessionController;

        if (sessionController == null)
            return;
        
        var session = sessionController.Session;
        if (session == null) 
            return;

        session.Dispose();
    }
}

Now we have absolutely no global state, and yet we have very easy access to the current session.

There are still a few things missing, can you see them?


Refactoring toward frictionless & odorless code: The case for the view model

Originally posted at 3/30/2011

Remember, we are running on an action scoped session, and this is the code that we run:

public class HomeController : SessionController
{
    public ActionResult Blog(int id)
    {
        var blog = Session.Get<Blog>(id);

        return Json(blog, JsonRequestBehavior.AllowGet);
    }
}

The error that we get is a problem when trying to serialize the Blog. Let us look again at the class diagram:

[image: class diagram for Blog, including the lazily loaded Users collection]

As you can see, we have a collection of Users, but it is lazily loaded. When the Json serializer (which I use instead of writing a view, since it makes my life easier) touches that property, it throws, because you cannot iterate on a lazily loaded collection when the session has already been closed.

One option that we have is to modify the code for Blog like so:

public class Blog
{
    public virtual int Id { get; set; }

    public virtual string Title { get; set; }

    public virtual string Subtitle { get; set; }

    public virtual bool AllowsComments { get; set; }

    public virtual DateTime CreatedAt { get; set; }

    [ScriptIgnore]
    public virtual ISet<User> Users { get; set; }

    public Blog()
    {
        Users = new HashedSet<User>();
    }
}

I don’t like it for several reasons. First, and least among them, serialization duties aren’t one of the responsibilities of the domain model. Next, and most important, is the problem that this approach is incredibly brittle. Imagine what would happen if we added a new lazily loaded property. Suddenly, unless we remembered to add [ScriptIgnore], we would break everything that tried to serialize a Blog instance.

I much rather use an approach that wouldn’t break if I breathed on it. Like the following:

public class HomeController : SessionController
{
    public ActionResult Blog(int id)
    {
        var blog = Session.Get<Blog>(id);

        return Json(new
        {
            blog.AllowsComments,
            blog.CreatedAt,
            blog.Id,
            blog.Subtitle
        }, JsonRequestBehavior.AllowGet);
    }
}

By projecting the values out into a known format, we can save a lot of pain down the road.

But wait a second, this looks quite familiar, doesn’t it? This is the view model pattern, but we arrived at it through an unusual journey.

I don’t really care if you are using anonymous objects or named classes with something like AutoMapper, but I do think that a clear boundary makes it easier to work with the application. And if you wonder why you need two models for the same data, the avoidance of accidental queries and the usage of the action scoped session are additional motivators.


Refactoring toward frictionless & odorless code: A broken home (controller)

Originally posted at 3/30/2011

Previously, our HomeController looked like this:

public class HomeController : SessionController
{
    public ActionResult Blog(int id)
    {
        var blog = Session.Get<Blog>(id);

        return Json(blog, JsonRequestBehavior.AllowGet);
    }
}

My model is defined as:

[image: class diagram for the Blog model]

Remember that in the previous post we changed the session management from request scope to action scope?

Well, that meant that we have just broken this code…

But can you see how?


Refactoring toward frictionless & odorless code: Limiting session scope

Originally posted at 3/30/2011

In the previous posts, we have looked at how we are going to set up an application using NHibernate. We set it up using:

    public MvcApplication()
    {
        BeginRequest += (sender, args) =>
        {
            CurrentSession = sessionFactory.OpenSession();
        };
        EndRequest += (o, eventArgs) =>
        {
            var session = CurrentSession;
            if (session != null)
            {
                session.Dispose();
            }
        };
    }

But this code is problematic. It is problematic because the session is open for the lifetime of the request. That is a problem, but probably not because of what you think. The #1 reason for issues like Select N+1 is people accessing lazy loaded properties in the views.

One good way of avoiding that is limiting the session scope to the action only, so that when rendering the view, the session is not available. Therefore, every attempt to lazy load will immediately throw. With ASP.Net MVC, this is very easy:

public class NHibernateActionFilter : ActionFilterAttribute
{
    private static readonly ISessionFactory sessionFactory = BuildSessionFactory();

    public static ISession CurrentSession
    {
        get { return HttpContext.Current.Items["NHibernateSession"] as ISession; }
        set { HttpContext.Current.Items["NHibernateSession"] = value; }
    }

    private static ISessionFactory BuildSessionFactory()
    {
        return new Configuration()
            .Configure()
            .BuildSessionFactory();
    }

    public override void OnActionExecuting(ActionExecutingContext filterContext)
    {
        CurrentSession = sessionFactory.OpenSession();
    }

    public override void OnActionExecuted(ActionExecutedContext filterContext)
    {

        var session = CurrentSession;
        if (session != null)
        {
            session.Dispose();
        }
    }
}

We need to modify the SessionController as well, so it would now be:

public class SessionController : Controller
{
    public HttpSessionStateBase HttpSession
    {
        get { return base.Session; }
    }

    public new ISession Session { get { return NHibernateActionFilter.CurrentSession; } }
}

Of course, we have broken the HomeController now, but we will discuss how in the next post.


Refactoring toward frictionless & odorless code: Hiding global state

Originally posted at 3/30/2011

As I mentioned in the previous post, I don’t really like the code as written, so let us see what we can do to fix that. For a start, we have this code:

public class HomeController : Controller
{
    public ActionResult Blog(int id)
    {
        var blog = MvcApplication.CurrentSession.Get<Blog>(id);

        return Json(blog, JsonRequestBehavior.AllowGet);
    }
}

I don’t like the global reference in this manner, so let us see what we can do about it. The easiest way would be to just hide it very well:

public class HomeController : SessionController
{
    public ActionResult Blog(int id)
    {
        var blog = Session.Get<Blog>(id);

        return Json(blog, JsonRequestBehavior.AllowGet);
    }
}

public class SessionController : Controller
{
    public HttpSessionStateBase HttpSession
    {
        get { return base.Session;  }
    }

    public new ISession Session
    {
        get { return MvcApplication.CurrentSession; }
    }
}

So that results in nicer code; the architecture is pretty much the same, but we have much nicer code for all of our controllers.

We are not done yet, though.


Refactoring toward frictionless & odorless code: The baseline

Originally posted at 3/30/2011

This is part of an exercise that I give in my course. The context is an MVC3 ASP.Net application. Here is how we start things:

public class MvcApplication : System.Web.HttpApplication
{
    private static readonly ISessionFactory sessionFactory = BuildSessionFactory();

    public static ISession CurrentSession
    {
        get{ return HttpContext.Current.Items["NHibernateSession"] as ISession;}
        set { HttpContext.Current.Items["NHibernateSession"] = value; }
    }

    public MvcApplication()
    {
        BeginRequest += (sender, args) =>
        {
            CurrentSession = sessionFactory.OpenSession();
        };
        EndRequest += (o, eventArgs) =>
        {
            var session = CurrentSession;
            if (session != null)
            {
                session.Dispose();
            }
        };
    }

    protected void Application_Start()
    {
        AreaRegistration.RegisterAllAreas();

        RegisterGlobalFilters(GlobalFilters.Filters);
        RegisterRoutes(RouteTable.Routes);
    }

    private static ISessionFactory BuildSessionFactory()
    {
        return new Configuration()
            .Configure()
            .BuildSessionFactory();
    }
}

And in the controller, we have something like this:

public class HomeController : SessionController
{
    public ActionResult Blog(int id)
    {
        var blog = MvcApplication.CurrentSession.Get<Blog>(id);

        return Json(blog, JsonRequestBehavior.AllowGet);
    }
}

This code is valid, it works, it follows best practices and I actually recommend using something very similar here.

It also annoyed me when I wrote it now, enough to write a series of blog posts detailing how to fix this.
