The hidden costs of allocations
I mentioned in a pervious post that we have a code quality gateway to make sure that all logging statements are wrapped in an if statement, to reduce the number of allocations when the user has logging turned off. This is done because logging can be expensive, and is often turned off, so there is no point in paying a penalty for stuff that isn’t working.
There seems to be some confusion about why this is done. Let us assume that we have the following logging code:
void Debug(string msg) { if(IsDebugEnabled) Console.WriteLine(msg); } void Debug(string format, params object[] args) { if(IsDebugEnabled) Console.WriteLine(format, args); } void Debug(Func<string> generateMsg) { if(IsDebugEnabled) Console.WriteLine(generateMsg()); }
Now, the obvious bad example would be to use:
Debug("Hello "+ user.Name);
Since that is going to allocate a new string, and this will happen regardless of whatever logging is enabled or not. On high frequency call sites, this can end up allocating a lot of useless stuff.
So we will move to this mode:
Debug("Hello {0}", user.Name);
And we saved the allocation, right? Except that this actually generate this code, do you see the allocation now?
Debug("Hello {0}", new[] { user.Name });
So let us introduce a better option, shall we? We’ll add a few common overloads without the use of params.
void Debug(string format, object p1); void Debug(string format, object p1, object p2); void Debug(string format, object p1, object p2, object p3);
And now we saved the allocation. Unless…
int requestNumber = ...; Debug("Request # {0}", requestNumber);
Do you see the allocation now? We pass an int to a object, which require us to do boxing, which is an allocation .
So let us try using the lambda method, this way nothing is executed!
Debug(() => return "Hello " + user.Name);
Except… this is actually translated to:
var loggerMsg = new LoggerMessage(user); Func<string> func = new Func<string>(loggerMsg.Write); Debug(func);
Here are all the allocations.
There is another issue with logging via lambdas, consider the following code:
void Index(JsonDocument[] docs) { var batch = new IndexBatchStats(); database.Stats.Add(batch);// long lived batch.Completed += () => database.Stats.IncrmentCompletedBatches(batch); Log(() => "Indexing " + docs.Length + " documents"); }
You might notice that we have two lambdas here. C# is optimizing the number of types generated, and will generally output a single type for all the lifted member in all the lambdas in the method. This means that we have:
void Index(JsonDocument[] docs) { var batch = new IndexBatchStats(); database.Stats.Add(batch);// long lived var args = new { database, batch, docs }; // all lifted members batch.Completed += (args) => args.database.Stats.IncrmentCompletedBatches(args.batch); Log((args) => "Indexing " + args.docs.Length + " documents"); }
As you can see, we have a long lived lambda, which we think is using only other long lived objects (database & batch), but is actually holding a reference to the docs, which are VERY large.
Except for the last issue, which require moving the logging to a separate method to avoid this optimization, all of the issues outlined above can be handled by explicitly calling IsDebugEnabled in the calling code.
And this is why we require it.
Comments
Thanks for clarification - i actually thought params just pretends to put argumetns in an array, but it actually does ;) BTW after all these rounds of low-level optimizations, don't you think CLR is only standing in the way (at least for the crucial low-level stuff)?
Rafal, There are similar tradeoffs in pretty much any language / platform. Those kind of things aren't too hard for us to deal with. The things that are painful is that we don't get a better way to control memory. For example, I would really love this: https://github.com/dotnet/coreclr/issues/1235
You seem to say it in the last sentence but would it not be better to wrap all your debugcalls into an if debugEnabled? Solves all those gotchas.
int requestNumber = ...; if (Debug.IsEnabled) { Debug("Request # {0}", requestNumber); }
// Ryan
Whoah, formatting got screwed up ...
// Ryan
What about generics with overloading?<br /> Instead of
use
?<br /> As far as I know, code will be bigger (because e.g. primitive data types like int, double need their own generated code), but this should avoid allocations for boxing...
I wonder if this is a good usecase for ConditionalAttribute (https://msdn.microsoft.com/en-us/library/aa664622(v=vs.71).aspx)? It seems that it gives you all performance benefits of if (isDebug) and also is much more easier/less cumbersome to use.
On the other hand enabling/disabling of logging must be done at compile time (and you need full rebuild).
What about Debug("Request # {0}", requestNumber.ToString());
Ryan, Yes, that is a continuation of the previous post, which said that we had a check for this sort of things
Jan, You can do it with generics, yes, but it has its own costs, as you noted. It also doesn't help if the things you need are more complex.
If I need to compute something to log it, for example
meow, Conditional is great, but re-compiling is not something that we can allow
Have you considered Fody for doing this work instead of manually adding log level checks to code?
If you want a way to pick up these "hidden allocations" whilst writing/reading code, you should check out the "Roslyn Clr Heap Allocation Analyzer" https://github.com/mjsabby/RoslynClrHeapAllocationAnalyzer.
I know you can profile you code and see them that way, but sometimes it's easier to see the issues when you are writing the code.
Stan, Yes, we did consider that, but Fody doesn't help for the more complex stuff. We have logging stuff that takes multiple statements to prepare
Matt, We use a R# addon that does the same
Or use a real logging library that short circuits and early exits any logging calls if the log level is lower than the desired line. It basically does what Ryan Health said.
Kurt, Um... nope. What we are showing here is the cost of calling the log library, not the cost of the log library itself
I know what you meant by this post. For instance, Boost::Logging short circuits any calling costs.
For example, in pseudo-code:
logger.set_level(info); logger.debug_msg( "sleep for 3 seconds %d", sleep(3) ); // this returns instantly, doesn't sleep
Kurt, Is this done during compilation, or at runtime? If the later, how does debug_msg declared?
Good write up! I don't quite follow the section below however. </br> </br> var loggerMsg = new LoggerMessage(user);</br> Func<string> func = new Func<string>(loggerMsg.Write);</br> Debug(() => return "Hello " + user.Name);</br> </br> Why would C# initialize a function that it won't be using anywhere? I'm a bit uncertain why this LoggerMessage class is being created too.</br> </br> Surely it would do something like this instead?</br> Func<string> func = () => "Hello " + user.Name;</br> Debug(func);</br>
Nathan, That was a typo on my part, I fixed it in the post
Comment preview