Runtime code compilation & collectible assemblies are no go

time to read 5 min | 875 words

The problem is quite simple, I want to be able to support certain operation on Raven. In order to support those operations, the user need to be able to submit a linq query to the server. In order to allow this, we need to accept a string, compile it and run it.

So far, it is pretty simple. The problem begins when you consider that assemblies can’t be unloaded. I was very hopeful when I learned about collectible assemblies in .NET 4.0, but they focus exclusively on assemblies generated from System.Reflection.Emit, while my scenario is compiling code on the fly (so I invoke the C# compiler to generate an assembly, then use that).

Collectible assemblies doesn’t help in this case. Maybe, in C# 5.0, the compiler will use SRE, which will help, but I don’t hold much hope there. I also checked out Mono.CSharp assembly, hoping that maybe it can do what I wanted it to do, but that suffer from the memory leak as well.

So I turned to the one solution that I knew would work, generating those assemblies in another app domain, and unloading that when it became too full. I kept thinking that I can’t do that because of the slowdown with cross app domain communication, but then I figured that I am violating one of the first rules of performance: You don’t know until you measure it. So I set out to test it.

I am only interested in testing the speed of cross app domain communication, not anything else, so here is my test case:

public class RemoteTransformer : MarshalByRefObject
{
    private readonly Transformer transfomer = new Transformer();

    public JObject Transform(JObject o)
    {
        return transfomer.Transform(o);
    }
}

public class Transformer
{
    public JObject Transform(JObject o)
    {
        o["Modified"] = new JValue(true);
        return o;
    }
}

Running things in the same app domain (base line):

static void Main(string[] args)
{
    var t = new RemoteTransformer();
    
    var startNew = Stopwatch.StartNew();

    for (int i = 0; i < 100000; i++)
    {
        var jobj = new JObject(new JProperty("Hello", "There"));

        t.Transform(jobj);

    }

    Console.WriteLine(startNew.ElapsedMilliseconds);
}

This consistently gives results under 200 ms (185ms, 196ms, etc). In other words, we are talking about over 500 operations per millisecond.

What happen when we do this over AppDomain boundary? The first problem I run into was that the Json objects were serializable, but that was easy to fix. Here is the code:

 static void Main(string[] args)
 {
    var appDomain = AppDomain.CreateDomain("remote");
    var t = (RemoteTransformer)appDomain.CreateInstanceAndUnwrap(typeof(RemoteTransformer).Assembly.FullName, typeof(RemoteTransformer).FullName);
    
    var startNew = Stopwatch.StartNew();
     
     for (int i = 0; i < 100000; i++)
     {
         var jobj = new JObject(new JProperty("Hello", "There"));

         t.Transform(jobj);

     }

     Console.WriteLine(startNew.ElapsedMilliseconds);
 }

And that run close to 8 seconds, (7871 ms). Or over 40 times slower, or just about 12 operations per millisecond.

To give you some indication about the timing, this means that an operation over 1 million documents would spend about 1.3 minutes just serializing data across app domains.

That is… long, but it might be acceptable, I need to think about this more.