Ayende @ Rahien

It's a girl

Challenge: Find the resource leak

I am not sure if I should be embarrassed or furious about this bug, but I spent a lot of time working on issue. Basically, I had a resource leak in my app. I knew that I had it, and even though it took a while to create a reliable reproduction, I managed to narrow it down to a simple reproduction. It was obvious that the issue is leaking resources during the shutdown phase of the application as a result of a race condition.

Now, usually we don’t care about such things because shutting down will, obviously, clear all of those nasty things anyway. Unfortunately, my scenario is to start & shutdown many times in a single process, as part of a unit test. So I had to figure out what was actually going on. The maddening part is that the code should work.

Here is a close approximation of the code in question:

public class QueueActions : IDisposable
{
    UnmanagedDatabaseConnection database;
    public string Name { get; private set; }

    public class QueueActions( UnmanagedDatabaseConnectionFactory factory)
    {
         database = factory.Create();
         database.Open(()=> Name = database.ReadName());
    }

   // assume proper GC finalizer impl

    public void Dispose()
    {
          database.Dispose();
    }
}

And the code using this:

using(var factory = CreateFactory())
{
   ThreadPool.QueueUserWorkItem(()=>
   {
          using(var actions = new QueueActions(factory))
          {
               actions.Send("abc");     
          }
    });
}

Note that this isn’t remotely like the actual code, it is highly simplified

And yes, there is an intentional race condition here.

Here are two pieces of information that are important to remember about this unmanaged library:

  • The factory will not close if there are opened connections (exception)
  • Trying to open a connection after the factory was closed will fail (exception)

How many problems does the code above have?

Comments

Demis Bellot
02/10/2010 10:17 AM by
Demis Bellot

Looks like you could get both exceptions:

1) the factory could dispose before the connection is open and

2) the connection could be open and 'Send()'ing while the factory is trying to close.

Not sure why you would spawn a thread at that point though.

Rafal
02/10/2010 10:26 AM by
Rafal

One of problems is the use of FactoryFactoryFactory pattern.

Felix
02/10/2010 10:39 AM by
Felix

You can have:

An exception on closing the factory

An exception on the costructor of QueueAction

An exception on the send

An exception disposing the QueueAction

So at a first glance there is more possible exception than lines ;)

Making a singleton thread safe Factory does not solve all the problem ?

Daniel
02/10/2010 11:32 AM by
Daniel
*  The factory will not close if there are opened connections (exception)

* Trying to open a connection after the factory was closed will fail (exception)

Both of these can happen, and I don't see any other possible problems (assuming the Unmanaged* classes are thread-safe -- if they aren't, then anything might happen).

The 'send' call cannot fail since the factory won't close if there are open connections. Disposing the connection also shouldn't fail according to your description.

So I'd say this code will always do one of the following:

1) crash when opening the connection (factory.Create() call)

2) crash when closing the factory (factory.Dispose() call)

3) run successfully

Where's the resource leak?

Well obviously, if closing the factory fails with an exception, it'll stay open until the GC calls the finalizer. If that doesn't correctly handle the case of a failed close attempt, you might end up with a permanent resource leak.

Bert Van Steen
02/10/2010 12:34 PM by
Bert Van Steen

The QueueActions (and associated database connection) won't be disposed if an exception is thrown in the constructor.

It's also not a best practice to do expensive calls (like opening connections) in a constructor.

Justice
02/10/2010 12:53 PM by
Justice

@Ayende ... you gave it all away!

The code

  1. creates a factory

  2. queues a thread to

2a. create a connection

2b. close the connection

  1. closes the factory

But the factory could easily be closed before the queued thread is started or the thread attempts to create a connection, resulting in an exception on creating a connection. Additionally, the factory could easily be closed after the queued thread creates a connection, resulting in an exception on closing the factory.

The solution is either to put the creation of the factory into the queued thread, or to join the outer thread to the queued thread (or use other signalling mechanisms such as WaitHandles) so that the factory is closed in the outer thread only once the queued thread has run to completion, at least of the connection-oriented work.

Kaare Skovgaard
02/10/2010 12:55 PM by
Kaare Skovgaard

My guess would be that it is the fact that you're using the ThreadPool.QueueUserWorkItem - INSIDE your using block. It looks to me like the outer using statement is likely to dispose of the factory before the inner using block is ever ran. Now if that does not happen - seeing as this code is now running in 2 different threads - the factory.Dispose() could be called before actions.Dispose() (both by the end of the using blocks when expanded into try/finally) and that could also cast an exception.

That's just my idea anyway.

Rich
02/10/2010 02:25 PM by
Rich

If database.Open throws an Exception then the reference to the UnmanagedDatabaseConnection instance is lost... database.Dispose() will never be called as the Dispose method of QueueActions cannot be called (an exception will be thrown from the QueueActions constructor so there's no instance of QueueActions to call Dispose on).

Not really worked much with unmanaged code directly but I'm guessing that the CLR has no way to clean this up, so the UnmanagedDatabaseConnection will live on in memory after the application ends.

Michal
02/10/2010 03:15 PM by
Michal

The factory can get disposed during the QueueActions constructor. If this happens after calling

database = factory.Create();

but before

database.Open(()=> Name = database.ReadName());

.. then opening the connection will throw Exception. This will cause QueueActions.Dispose() to never be called and the database will leak.

zihotki
02/10/2010 03:37 PM by
zihotki

I think that in order to fix bug you should move using block for a factory inside of the ThreadPool.QueueUserWorkItem action. At the time of execution of an item in the ThreadPool queue the factory for this item may be already disposed or this factory will be disposed (with an exception) after the creation of a QueueActions instance but before disposing it, so there will be an opened connection.

Frank Quednau
02/10/2010 09:35 PM by
Frank Quednau

depending on what you do with the lambda in the Open call, you'd stop to have full control about the lifetime of the database and the QueueActions since they both need to be captured for the lambda to work, but I don't know if that's an issue here...

OmariO
02/11/2010 02:10 AM by
OmariO

using(var factory = CreateFactory())

{

ThreadPool.QueueUserWorkItem(()=>

{

      using(var actions = new QueueActions(factory))

      {

           actions.Send("abc");     

      }

});

}

It is possible that before the ThreadPool work item will have a chance to start execution the "main" main thread will have disposed the factory. Let start with this scenarion

  1. As there are no open connection yet disposing the factory will succeed

  2. When constructor of QueueActions class tries to create connection an exception will be thrown because the factory will have been disposed to the time

  3. As the exception is thrown in the constuctor the reference on the QueueActions is not assigned to "actions" local variable and consecuently the object won't be disposed

  4. Uncaught exception in the thread pool's thread will lead to the AppDomain unload whcih will run finalizer for QueueActions which in turn will call QueueActions.Dispose method

  5. As the factory failed to create a connection and threw exception the QueueActions object will have null value in its "database" member variable = > NullReferenceException in the finalizer thread

I would suggest you not to throw exceptions from constructors and to remove the finalizer from QueueActions class as is doesnt deal DIRECTLY with an unmanaged resource (disposing disposables in the Dispose method is enough in). Having finaliser in order to to deal with exceptional constructors... smells a bit.

But we still havent found the leak of connections as in the aforementioned scenario a connection event were not created.

Th second scenario I can see is:

  1. The main thread places the work item in ThreadPool and preempted by a thread pool thread that starts excecution of the the work item

  2. The thread pool thread in the constructor of QueueActions executes factory.Create() and preemted by main thread before the created connection assigned to "database" member

  3. The factory doesn't close because there is a created connetion and throws an exception.

  4. In the rpocess of unloading the AppDomain a thread abort is injected in the thread pool thread exactly before the assignment to "database" memeber =>

created QueueActions object is not disposed and when it finalizer is executed database == null

  1. connection leaked!

"Last but not least" sorry for my English :)

Comments have been closed on this topic.