My Passover Project: Introducing Rattlesnake.CLR
Okay, after spending quite a lot of time digging through the leveldb codebase, and with several years of working with RavenDB, I can say with confidence that the CLR makes it extremely hard to build high performance server side systems.
Mostly, the issues are related to GC and memory. In particular, not having any way to control memory allocation and/or the GC means that we can’t optimize those scenarios in any meaningful way. At the same time, I do not want to go back to the unmanaged world. As mentioned, I just came back from a very deep dive into a non trivial C++ codebase, and while I consider that codebase a really good one, it is no pleasure to always be thinking about all the stuff that the CLR just takes off your hands.
Therefore, I decided that I’m going to do something about it. And Rattlesnake.CLR was born:
The major features of Rattlesnake.CLR include explicit memory management when required. Let us say that we know that we are going to need some amount of memory for a while, and then all of it can be thrown away. This is extremely common in scenarios such as a web request: pretty much all the memory that you allocate while processing a web request can be safely freed as soon as the request is done. In RavenDB’s case, the memory we consume during indexing can be freed immediately when we stop indexing. Right now this is a painful process of making sure that we allocate within the same gen0 and hoping that it won’t be too expensive, or that we won’t get a complete halt of the entire server while it is releasing memory. It also makes it really hard to do things like limit the amount of memory your code uses.
Another requirement that I have is that Rattlesnake.CLR should be able to execute existing .NET assemblies without any additional steps, since I don’t fancy doing ports of stuff that already exists.
In order to handle this scenario with the given constraints, we have:
1: var heap = Heap.Create(HeapOptions.None,
2:     1024 * 1024,
3:     512 * 1024 * 1024);
4:
5: using (MemoryAllocations.AllocateFrom(heap))
6: {
7:     var sb = new StringBuilder();
8:     for (var i = 0; i < 100; i++)
9:         sb.AppendLine(i.ToString());
10:     Console.WriteLine(sb.ToString());
11: }
12:
13: heap.Destroy();
Everything allocated within the using statement comes from our own heap. In line 13, we destroy all of that memory in one fell swoop.
There are a few notes about this that we probably should address:
- By default, memory allocated by this form is not subject to any form of GC. The idea is that this whole heap is getting released immediately.
- Note the last two parameters to Heap.Create. The first is the initial size of the heap, and the second is the max size. We now have a real way to actually limit the amount of memory a piece of code will use. This is really important in server applications, where avoiding paging is critical.
- For that matter, we can now figure out how much memory a particular piece of code uses, and allocate our resources accordingly.
- You can use multiple heaps at the same time, although only one can be installed as the default allocation at a given point in time.
There is an explicit heap.GarbageCollect() method that will do GC only on that heap, and which you can schedule at your own convenience. You can have two heaps, and allocate from one while you are GCing the other. And yes, that means that GCs using this method will not stop the process!
Memory allocated on the heap is obviously only valid as long as the heap is valid. That means that once the heap is destroyed, you can’t access any of the objects that were created there. This has implications for things like caches. We provide the MemoryAllocations.AllocateOnGlobalHeap<T>(args) method to force an allocation onto the global heap instead, if you want this memory to be always available and subject to GC.
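The "allocate for a while, then throw it all away, with a hard cap" idea can be roughly approximated for raw byte buffers on the stock CLR. The following is only a sketch under that assumption and is not how Rattlesnake.CLR works: `BumpArena`, `Allocate`, `Reset`, and `ReservedBytes` are made-up names, and the chunks are ordinary managed arrays that the normal GC reclaims after `Reset`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only: not part of Rattlesnake.CLR or the BCL.
public sealed class BumpArena
{
    private readonly List<byte[]> _chunks = new List<byte[]>();
    private readonly int _chunkSize;
    private readonly long _maxSize;
    private int _offsetInChunk;

    // Total bytes currently reserved by the arena (whole chunks).
    public long ReservedBytes { get; private set; }

    public BumpArena(int chunkSize, long maxSize)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        if (maxSize < chunkSize) throw new ArgumentOutOfRangeException(nameof(maxSize));
        _chunkSize = chunkSize;
        _maxSize = maxSize;
    }

    // Hands out a slice by bumping an offset; no per-allocation bookkeeping.
    public Span<byte> Allocate(int size)
    {
        if (size <= 0 || size > _chunkSize)
            throw new ArgumentOutOfRangeException(nameof(size));

        // Start a new chunk when the current one cannot fit the request.
        if (_chunks.Count == 0 || _offsetInChunk > _chunkSize - size)
        {
            if (ReservedBytes + _chunkSize > _maxSize)
                throw new InvalidOperationException("Arena size limit exceeded.");
            _chunks.Add(new byte[_chunkSize]);
            ReservedBytes += _chunkSize;
            _offsetInChunk = 0;
        }

        var slice = new Span<byte>(_chunks[_chunks.Count - 1], _offsetInChunk, size);
        _offsetInChunk += size;
        return slice;
    }

    // Drops every chunk at once; the arrays become garbage for the regular GC.
    public void Reset()
    {
        _chunks.Clear();
        ReservedBytes = 0;
        _offsetInChunk = 0;
    }
}
```

With `new BumpArena(1024, 2048)`, a third chunk is refused with an exception, which is the kind of limit the max-size parameter above is meant to give you. Unlike the heaps described in this post, this only covers byte buffers, not arbitrary objects.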
This is early days yet, but we already see some really interesting performance improvements!
How does this work?
While an early experiment with Rattlesnake.CLR was based on the Mono runtime, I quickly decided that I wanted to keep using the MS CLR. Now, in order to handle this I had to do some unnatural things (to say the least), but I think that I even managed to make this a supported option. Essentially, we are using the CLR Hosting API for this. In particular:
- ICLRGCManager
- IHostMalloc
- IHostMemoryManager
You can use Rattlesnake.CLR like this:
.\Rattlesnake.exe Raven.Server.exe
Just for fun, we also allow placing limits on the default heap, so you can be sure that you aren’t allocating too much there.
.\Rattlesnake.exe Raven.Server.exe --max-default-heap-size=256MB
We are still running some tests, but this is looking really good.
Comments
This idea makes me think a bit about region based memory management.
What happens with references from the standard heap or stack to objects inside those special heaps? You need to keep track of all references from outside such a special heap to objects inside such a special heap for being able to garbage collect those special heaps. And if that is the case, I wonder why you do not need to stop the process...? Also, what happens with references from outside the special heap to objects inside a special heap that gets destroyed?
Ruben, can you talk more about region based memory management? References to memory in the heap are invalidated (and the user is responsible for them) when the heap is destroyed. Dereferencing those pointers causes an exception.
Maybe it's just my dirty mind, but the... ahem... 'head' and.... 'tongue' of the logo looks like... well, a jizz cock!!
Traditionally, for memory management in VMs like the CLR and the JVM, there are 2 important areas that are used for memory allocations: on the one hand the stack, and on the other hand the heap.
The stack is used for putting the parameters of a method call. Each time you do a method call, a stack frame is created on the stack, and the parameters for the method are put on the stack. The mechanism is very simple: you enter a method, a new stack frame is created; you exit the method and the stack frame is freed. The allocation is directly linked to the scope in the code: the method parameter itself is only used in the method, and can be freed when you exit the method.
The heap is used for storing objects. Any C# (or Java) object that is created, is put on the heap. If you would look at the heap in terms of objects, you'd see a complex object graph with objects linked to each other through references. The lifetime of an object in the heap is not clear. The advantage of automatic memory management, and thus also of garbage collection, is that the software itself automatically determines the lifetime of an object.
Most garbage collectors, however, can only "approach" the lifetime of an object. Most of them are based on the fact that if an object is not reachable through any reference (direct or indirect), then the life of the object is finished, since it can no longer be used from the program, and the memory that is occupied by it can be freed and reused for another object. The only way that a garbage collector can determine that an object is no longer reachable is by following all existing references (starting from the "roots") and checking that none of those refer to that object. This is a costly process.
Now, what if we could use static program analysis and determine that the lifetime of a certain object is actually tied to a specific method? Very simple example: the method creates an object, but does not actually use it. The static analysis can determine that no reference to the object is passed anywhere else and that the lifetime of the object is tied to the method (this is called escape analysis). In that case, the object can be allocated inside the stack frame and can be freed when the stack frame is released. (This is actually implemented in the JVM, not sure about the CLR.)
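On the C# side, a programmer can ask for this stack-frame lifetime by hand with `stackalloc`. The sketch below shows that manual form only, not automatic escape analysis by the JIT; `StackScratch` and `SumOfSquares` are made-up names for illustration.

```csharp
using System;

public static class StackScratch
{
    // Sums the squares of 0..count-1 using a scratch buffer that lives in
    // this method's stack frame and is gone when the method returns.
    public static int SumOfSquares(int count)
    {
        // Keep the buffer small: stack space is limited.
        if (count < 1 || count > 256)
            throw new ArgumentOutOfRangeException(nameof(count));

        // No heap allocation here, so nothing for the GC to trace or free.
        Span<int> scratch = stackalloc int[count];
        for (var i = 0; i < count; i++)
            scratch[i] = i * i;

        var total = 0;
        foreach (var value in scratch)
            total += value;
        return total;
    }
}
```

The compiler refuses to let `scratch` escape the method (it cannot be stored in a field or returned), which is the same guarantee escape analysis has to prove on its own.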
Region based memory management does something similar. The idea behind it is that it is possible through static analysis to determine the lifetime of several objects. Instead of allocating those objects on the heap, it seems more performant to allocate objects with the same lifetime inside a piece of memory, called a region. Through static analysis, we can automatically determine when that piece of memory has to be allocated, and when it can be freed (just like what happens with a stack frame). This piece of memory is automatically allocated and released, but the handling of it is a lot cheaper than with garbage collection.
The heaps you describe could be seen as regions, where the control of the lifetime of the region is left to the programmer. If you implement this, you would need a write barrier because every reference that is written from a stack frame or from the heap to a "region" needs to be recorded. You say that references to a "region" are invalidated... this is entirely possible, but you need the write barrier to mark those references after a "region" is invalidated. The good thing though is that current VMs already have mechanisms for write barriers, since write barriers are also needed for generational garbage collection (for references from old generations to younger generations). However, if you would actually need to invalidate a reference, that would mean that the developer made a mistake and did not determine the lifetime of the "region" correctly....
Ruben, I am not sure what you mean by write barrier. The one that I am familiar with is basically the CPU write barrier, used for multi threaded access. When I am talking about invalidating the references to a released heap, I am talking about literally something that would throw "InvalidAccessException", (possibly crashing the entire app). Not something that would go and change existing references.
This idea strongly reminds me of the Sun's Java Real-Time System (http://www.oracle.com/technetwork/java/javase/tech/index-jsp-139921.html) and the corresponding "memory management" part of the JSR-1 spec.
Daniel, I lack knowledge / expertise for this.
Uvw, Yes, scoped memory is pretty much what I am talking about here.
Is this an April fools joke!!!!!! HOLY COW!!!!!! If not, then that is awesome, if it is a joke... MAN YOU GOT ME GOOD!!!
This looks promising. At our company we have a large server application that eats up over a gig of memory. Concerned sys admins complain about the amount of memory being consumed. We've tried to alleviate the problem by forcing GC.Collect() at certain times when we know large amounts of memory can be released, but we cannot simply do this after each service call. We also have a very similar problem in our UI, which is another behemoth with hundreds of screens and many tabs. We essentially end up calling GC.Collect every time a user closes a tab, in order to prevent the memory usage from ratcheting up (users open and close tabs way faster than the GC keeps up). I'm pretty sure that these are not good practices, but what choices do we have?
We could really utilize Rattlesnake for these issues.
+1 on this being an April Fools joke, and at the same time I won't be much surprised if it is not.
Well, I happen to strongly agree with the opening statement from this article, which is one of the reasons I am assuming this is not a joke. If it is a joke, I will be pretty embarrassed :)
The great thing about Ayende: projects so crazy, they just might be real. :-)
I'm pretty sure there's no April Fools day in Israel, and I didn't see any Ayende April Fools from 2012, so I'm going to say this is legit!
Ayende, but you can only invalidate references if you know the original location of the references to the released heap? And you have to invalidate them when you release that heap? If you allocate a new heap in the same location of the released heap... the old references would possibly still be valid but point to something new...
Dan, explicitly triggering a garbage collection is almost always a bad idea. It depends on what exactly the GC.Collect call does, but according to MSDN it triggers an immediate collection of all generations, which is pretty costly. The more memory you give a garbage collector, the less frequently it will be forced to collect and the less time (overall) will be spent on garbage collection.
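For situations like Dan's, the framework does expose a couple of gentler knobs than a bare `GC.Collect()`. This is a hedged sketch, not a recommendation: `GcHints`, `CollectYoungOnly`, and `WithLowLatency` are made-up helper names, while `GC.Collect(int, GCCollectionMode)` and `GCSettings.LatencyMode` are the real framework APIs. Whether either one helps a given workload is something to measure.

```csharp
using System;
using System.Runtime;

public static class GcHints
{
    // Collects generation 0 only, and lets the runtime decide whether
    // now is actually a good time to do it.
    public static void CollectYoungOnly()
    {
        GC.Collect(0, GCCollectionMode.Optimized);
    }

    // Runs a piece of work while asking the GC to avoid blocking
    // full collections where it can, then restores the previous mode.
    public static T WithLowLatency<T>(Func<T> work)
    {
        var previous = GCSettings.LatencyMode;
        try
        {
            GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;
            return work();
        }
        finally
        {
            GCSettings.LatencyMode = previous;
        }
    }
}
```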
Ruben, Good point on the new heap being in the same location. That would have to be double indirect, I guess.
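One way to picture the double indirection: code holds a handle rather than a direct reference, and every dereference checks that the region is still the one the handle was issued for. The sketch below is a toy model on ordinary managed objects, not Rattlesnake.CLR's mechanism; `Region`, `RegionHandle<T>`, and the epoch counter are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy model of a destroyable region with checked handles.
public sealed class Region
{
    private readonly List<object> _objects = new List<object>();
    private int _epoch; // bumped every time the region is destroyed

    public RegionHandle<T> Add<T>(T value) where T : class
    {
        _objects.Add(value);
        return new RegionHandle<T>(this, _objects.Count - 1, _epoch);
    }

    // Releases everything at once and invalidates all outstanding handles.
    public void Destroy()
    {
        _objects.Clear();
        _epoch++;
    }

    internal T Resolve<T>(int index, int epoch) where T : class
    {
        // A stale handle may point at a slot that has since been reused,
        // so the epoch check is what keeps it from seeing the new object.
        if (epoch != _epoch)
            throw new InvalidOperationException("Handle refers to a destroyed region.");
        return (T)_objects[index];
    }
}

public readonly struct RegionHandle<T> where T : class
{
    private readonly Region _region;
    private readonly int _index;
    private readonly int _epoch;

    internal RegionHandle(Region region, int index, int epoch)
    {
        _region = region;
        _index = index;
        _epoch = epoch;
    }

    // Every access goes through the region: that is the second indirection.
    public T Value => _region.Resolve<T>(_index, _epoch);
}
```

If the region is destroyed and then reused, a new object can land in slot 0 again, yet an old handle to slot 0 throws instead of silently reading the new object.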
It would have been hilarious if you could have gotten Maoni or Patrick to corroborate your little ruse on their blogs.
Ayende, well.. that or a write barrier.. :) With the current generational collector in the CLR there must already be a write barrier that records references from the older generations to the newer generations. When collecting only the newer generations, the objects inside those newer generations move around and as a consequence, references to those objects originating in the older generations must also be adapted. That mechanism is already in place in the CLR. But in your case, double indirect might be easier in fact. In the end it comes down to the same thing: you must keep track of references from outside to objects inside those heaps.
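To make the GC sense of "write barrier" concrete: it is a small piece of bookkeeping that runs on every reference store. The following is a toy model only, with made-up names (`Node`, `ToyHeap`, `WriteChild`, `RememberedSet`); the real CLR does this inside JIT-generated code with a card table, not with a hash set.

```csharp
using System;
using System.Collections.Generic;

public sealed class Node
{
    public readonly int Generation; // toy model: 0 = young, 2 = old
    public Node Child;

    public Node(int generation) { Generation = generation; }
}

public sealed class ToyHeap
{
    // Old objects known to point at younger ones. A collection of only the
    // young generation treats these as extra roots instead of scanning
    // the whole old generation.
    public HashSet<Node> RememberedSet { get; } = new HashSet<Node>();

    // The "write barrier": every reference store goes through here.
    public void WriteChild(Node owner, Node child)
    {
        owner.Child = child;
        if (child != null && owner.Generation > child.Generation)
            RememberedSet.Add(owner);
    }
}
```

The same recording step is what would let a runtime find outside references into a special heap, which is the tracking requirement discussed in this thread.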
No GitHub link? It's got to be an April Fools joke.
A very interesting idea indeed. I never thought that there should/could be an interface for allocating memory for a GC, and if we have those interfaces, we can do something clever with them.
Where's the link to Github btw :)
If this is a joke, then you should also implement a RattleSnake C# language feature that allows a placement new operator similar to C++ on your custom memory heap regions..
var sb = new (heap1) StringBuilder();
Ajai
This reminds me of the AutoReleasePool in objective-C