Runtime code compilation & collectible assemblies are a no-go
The problem is quite simple: I want to be able to support certain operations in Raven. To support those operations, the user needs to be able to submit a Linq query to the server. To allow this, we need to accept a string, compile it, and run it.
So far, so simple. The problem begins when you consider that assemblies can't be unloaded. I was very hopeful when I learned about collectible assemblies in .NET 4.0, but they focus exclusively on assemblies generated via System.Reflection.Emit, while my scenario is compiling code on the fly (I invoke the C# compiler to generate an assembly, then use that).
Collectible assemblies don't help in this case. Maybe in C# 5.0 the compiler will use SRE, which would help, but I don't hold out much hope there. I also checked out the Mono.CSharp assembly, hoping that maybe it could do what I wanted, but it suffers from the same memory leak.
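To make the leak concrete, here is a minimal sketch of the compile-a-string step, assuming the CodeDOM route (class and method names here are illustrative, not Raven's actual code). Note that even with GenerateInMemory set, the resulting assembly is loaded into the current AppDomain and can never be unloaded from it:

```csharp
using System;
using System.CodeDom.Compiler;
using Microsoft.CSharp;

public static class QueryCompiler
{
    // Compiles C# source submitted as a string into an in-memory assembly.
    // The assembly stays loaded until the hosting AppDomain is unloaded,
    // which is exactly the leak described above.
    public static System.Reflection.Assembly Compile(string source)
    {
        using (var provider = new CSharpCodeProvider())
        {
            var options = new CompilerParameters
            {
                GenerateInMemory = true,
                GenerateExecutable = false
            };
            options.ReferencedAssemblies.Add("System.Core.dll");

            CompilerResults results = provider.CompileAssemblyFromSource(options, source);
            if (results.Errors.HasErrors)
                throw new InvalidOperationException(results.Errors[0].ErrorText);

            return results.CompiledAssembly; // loaded into this AppDomain for good
        }
    }
}
```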
So I turned to the one solution that I knew would work: generating those assemblies in another app domain, and unloading that domain when it becomes too full. I kept thinking that I couldn't do that because of the slowdown of cross-app-domain communication, but then I realized I was violating one of the first rules of performance: you don't know until you measure it. So I set out to test it.
I am only interested in testing the speed of cross app domain communication, not anything else, so here is my test case:
public class RemoteTransformer : MarshalByRefObject
{
    private readonly Transformer transformer = new Transformer();

    public JObject Transform(JObject o)
    {
        return transformer.Transform(o);
    }
}

public class Transformer
{
    public JObject Transform(JObject o)
    {
        o["Modified"] = new JValue(true);
        return o;
    }
}
Running things in the same app domain (base line):
static void Main(string[] args)
{
    var t = new RemoteTransformer();
    var startNew = Stopwatch.StartNew();
    for (int i = 0; i < 100000; i++)
    {
        var jobj = new JObject(new JProperty("Hello", "There"));
        t.Transform(jobj);
    }
    Console.WriteLine(startNew.ElapsedMilliseconds);
}
This consistently gives results under 200 ms (185 ms, 196 ms, etc.). In other words, we are talking about over 500 operations per millisecond.
What happens when we do this over an AppDomain boundary? The first problem I ran into was that the Json objects were not serializable, but that was easy to fix. Here is the code:
static void Main(string[] args)
{
    var appDomain = AppDomain.CreateDomain("remote");
    var t = (RemoteTransformer)appDomain.CreateInstanceAndUnwrap(
        typeof(RemoteTransformer).Assembly.FullName,
        typeof(RemoteTransformer).FullName);
    var startNew = Stopwatch.StartNew();
    for (int i = 0; i < 100000; i++)
    {
        var jobj = new JObject(new JProperty("Hello", "There"));
        t.Transform(jobj);
    }
    Console.WriteLine(startNew.ElapsedMilliseconds);
}
And that runs in close to 8 seconds (7,871 ms). That is over 40 times slower, or just about 12 operations per millisecond.
To give you some indication of the timing: at that rate, an operation over 1 million documents would spend about 1.3 minutes just serializing data across app domains.
That is… long, but it might be acceptable. I need to think about this more.
Comments
In my experience, WCF using a NetNamedPipeBinding is much faster than remoting. The new default configuration feature in .NET 4.0 makes it pretty painless to use.
Can you do that with expressions?
like weblogs.asp.net/.../...-dynamic-query-library.aspx ...
Hi Ayende
Have you already considered the dynamic query sample that ships with VS2008? ( http://msdn.microsoft.com/en-us/library/bb397982(VS.90).aspx)
I had used it some time back; there was some restriction on it only supporting method calls on some basic types, but it was very easy to work around that :-)
It does compile expressions into a dynamic method which I think is ideal for the Raven scenario.
I also noticed a reference to NRefactory (LINQPad uses it, I think) somewhere. I haven't used it, but I assume you could walk its AST and transform it 1:1 into an expression tree & compile to a lambda?
Ajai
Looks like you might end up using (or likely writing, given your history ;)) a C# parser to generate an AST you can translate into an expression tree.
Alternatively, could you avoid repeated calls to the other app domain by making it get the documents to process, rather than sending each one over?
What happens if you try the same with two different assemblies for the Transformer and RemoteTransformer classes? I'm thinking there may be some smart optimization going on in the simple case since your loop is pretty simple.
Oh, scratch that. They're both on the same (remote) appdomain...
One obvious thing that doesn't work is batching the calls, i.e. passing a list of values to be transformed so that the interface is not so 'chatty' - a quick experiment showed that it only improved things by <10%
Here's what I had to do in a (not very) similar case:
I took the code that was static and would normally be communicating across the appdomain boundary and injected it into each appdomain when built. Then I had a slim appdomain manager that took requests and routed them to the appdomains to be worked on entirely there.
the downside to this is extra code in every appdomain, and writing the code to inject my base compiled code into them.
The tricky thing was just finding the right spot to put the boundary - making all of the work into a single cross-boundary call was the best-case scenario, but not always possible.
That's nontrivial as you need to implement the C# type system, overload resolution, etc. Even just the parsing part is extremely complex for C# (think of all the ambiguous syntax like "M(a <b,> d(7))" or near-ambiguous like "bool b = a is B?;" vs. "bool b = a is B ? c:d;")
But if you are interested, I wrote some hacky prototype of this a year ago, based on the SharpDevelop C# code-completion system. It's not that hard if you use the right components: SharpDevelop solves many of the nasty issues understanding C# code, and Linq.Expressions solved many of the nasty issues generating IL.
Of course there will be subtle differences to the actual C# semantics, but it might work OK for Ayende's usecase.
For our own use case, we settled on using csc.exe and living with the resulting memory leak.
I gave the NamedPipe suggestion a try. I'm not a WCF expert so I'm probably doing something wrong / funky with the serializer stuff. But it looks quite slow.
265 ms vs. 25,434 ms on my machine. So around 100 times slower. But then again, it could be I'm doing something stupid.
code: http://pastebin.com/mxuydvDf
Hi Ayende,
I've been using the Microsoft's Dynamic sample mentioned above for years in production. It is based on DynamicMethod and so does not generate new assemblies.
However, it does not support C# 4.0 "dynamic".
LL
I have created a Pattern Matching library that parses DSLs or a general-purpose programming language (of my creation) that looks similar to C#, for doing dynamic expressions that get built into collectible assemblies.
Theoretically you could do this with boo also once it's ported to .net 4.
http://metasharp.codeplex.com
example:
metasharp.codeplex.com/.../6635d57d84f1
Actually, there are two things going on in that sample. A DSL is parsed into an AST, then that AST is passed into this template; the template produces code that gets compiled into Linq expressions, which get compiled into a lambda expression and executed, collected, etc.
@Justin, the problem is that Raven uses C# 'dynamic' objects. LINQ expressions do not support them directly.
When I started looking at dealing with enums in the linq queries for RavenDB, I came to appreciate the difficulty of what you're trying to do. One problem, as soon as people can use arbitrary linq expressions, they want to mix in arbitrary code from some odd DLL that isn't necessarily on the server. I wonder if after you solve the one problem if you'll still hit another wall.
I wanted to mention some approaches that came to mind, in case you haven't considered them:
1) RavenDB, via MEF, supports adding extensions via DLL copy. I wonder if it would be easier to just require the client to send a DLL to the server that has their queries precompiled. This way they could include arbitrary code. The user has more work to do when they change their queries, but the raven usage model is already such that you're supposed to think about your queries early.
2) If this app domain business is to support the Map/Reduce linq expressions, and not the query expressions, I wonder if the whole indexing business could live in a separate app domain, so you're not crossing app domain boundaries so much.
@fschwiet, MEF directory catalog also leaves assemblies in memory, even if you remove them from the directory and refresh the catalog.
I believe that it does support it actually.
Expression.Dynamic(...);
Should allow you to do the equivalent of the dynamic keyword inside of a linq expression.
msdn.microsoft.com/en-us/library/dd324059.aspx
However the Dynamic sample stuff was created in .net 3.5 time frame and does not include support for a lot of the new linq expressions such as Dynamic and Block.
I tried WCF with NetNamedPipes and got 500ms vs 4000ms. Here is the code http://pastebin.com/MMg5SCup.
Try to find out what is causing the slowdown. Maybe it is serialization (write a custom serializer that is 100 times faster), or maybe it is infrastructure that you cannot control (use batching to transfer multiple objects).
LambdaExpressions can be compiled to dynamic methods (LExpression.Compile(ILGenerator)). That might help too.
You could also run the whole server in a second app domain so you do not have to do any marshalling. Then you recycle that domain periodically from the main domain, which acts as a coordinator.
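A minimal sketch of the recycling idea the commenter describes, under the assumption that the leaked assemblies all live in the secondary domain (the Worker type and the threshold here are illustrative; unloading the domain frees everything loaded into it):

```csharp
using System;

// Illustrative worker type; it must be MarshalByRefObject so calls
// proxy across the AppDomain boundary instead of copying the object.
public class Worker : MarshalByRefObject
{
    public string Ping()
    {
        return "pong from " + AppDomain.CurrentDomain.FriendlyName;
    }
}

public static class DomainRecycler
{
    private static AppDomain workerDomain;
    private static int useCount;
    private const int RecycleThreshold = 128; // assumption: tune per workload

    public static Worker GetWorker()
    {
        if (workerDomain == null || useCount >= RecycleThreshold)
        {
            if (workerDomain != null)
                AppDomain.Unload(workerDomain); // frees every assembly loaded there
            workerDomain = AppDomain.CreateDomain("worker");
            useCount = 0;
        }
        useCount++;
        return (Worker)workerDomain.CreateInstanceAndUnwrap(
            typeof(Worker).Assembly.FullName,
            typeof(Worker).FullName);
    }
}
```

The trade-off is the one measured in the post: every call to the worker now pays the cross-domain marshalling cost, so the boundary should sit around a large unit of work, not a per-document call.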
It may sound a bit hackish, but with Mono.Cecil it could be possible to extract the IL code you create by doing the compile you do today. You could then feed that IL into a method of your dynamic assembly, which makes it eligible for collection.
@fschwiet - I think the general idea for Raven is that an index created as a string should be simple. I believe this was a design decision to make people use simple indexes.
There is the option of compiled indexes; see the Event-Sourcing sample at github.com/.../Raven.Bundles.Sample.EventSourci....
Although the enum case you're seeing is somewhere in between, you shouldn't have to write a compiled index just to get enums to work.
Assuming dynamic method is what we are after, and speaking of hackish, here is one crazy link: blogs.msdn.com/.../...odinfo-to-dynamicmethod.aspx
Probably a silly question, but why do you kick off a compiler instance instead of using SRE?
Nice use of the 2nd app domain. It's similar to some of the impressive stuff that Second Life is doing with Mono to speed up their scripts:
http://www.youtube.com/watch?v=QGneU76KuSY
Since you're spending so much time serializing, you might want to consider a different serializer. @marcgravell's protobuf-net is about 8-9x faster than Microsoft's BinaryFormatter. Otherwise, if you still want to use JSON, you should be able to get a perf boost with my JsonSerializer, which is around 3x quicker than the other JSON serializers out there at the moment: http://www.servicestack.net/mythz_blog/?p=344
Could you serialize the expression trees that you generated in another appdomain? Should be good enough if you don't need to run them directly.
Now you are solving a problem that would not exist if you chose a different technology for your product. An interpreted query / data manipulation language would be much better.
The binary serializer in .NET is very slow. You should implement your own serializer (if you can), which is actually pretty simple: implement ISerializable, then do the work in GetObjectData and in the deserialization constructor. Your own serialization code should not branch out to the .NET one; it should simply create a byte[] that you add to the info object. This is very fast (and also compact, most of the time).
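A sketch of the pattern this comment describes, with a hypothetical document type (the field layout and names are illustrative): the whole object is packed by hand into a single byte[], so the formatter never walks the object graph.

```csharp
using System;
using System.Runtime.Serialization;
using System.Text;

// Hypothetical example type: custom-serializes itself into one byte[]
// instead of letting BinaryFormatter reflect over each field.
[Serializable]
public class FastDocument : ISerializable
{
    public int Id;
    public string Body;

    public FastDocument() { }

    // Pack all fields into a single buffer and hand it to the formatter.
    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        byte[] bodyBytes = Encoding.UTF8.GetBytes(Body ?? "");
        var buffer = new byte[4 + bodyBytes.Length];
        BitConverter.GetBytes(Id).CopyTo(buffer, 0);
        bodyBytes.CopyTo(buffer, 4);
        info.AddValue("d", buffer);
    }

    // Matching deserialization constructor: unpack the same buffer.
    protected FastDocument(SerializationInfo info, StreamingContext context)
    {
        var buffer = (byte[])info.GetValue("d", typeof(byte[]));
        Id = BitConverter.ToInt32(buffer, 0);
        Body = Encoding.UTF8.GetString(buffer, 4, buffer.Length - 4);
    }
}
```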
See Simon Hewitt's work:
www.codeproject.com/kb/dotnet/FastSerializer.aspx
This might shave off milliseconds per call, so in the end it will help greatly.
I've never looked at Raven code, but what are JObject and JProperty? Are they MarshalByRef objects? If they are not, changing them may provide the speed you need.
John,
Making them MBRO would actually increase the timing, since you would have a lot more cross app domain chatter.
How about using the IronPython interpreter from C# and passing in the Linq query as a Python script?