Reducing the cost of getting a stack trace


I am trying to find ways to reduce the cost of the stack trace used in NH Prof. Access to the stack trace is extremely valuable, but capturing it carries a significant cost, so we need a better way of handling this. I decided to run a couple of experiments.

All experiments were run 5,000 times, on a stack trace of 7 levels.

  • new StackTrace(true) - ~600ms
  • new StackTrace(false) - ~150ms

So right there, we have a huge cost saving, but let us continue a bit.

  • throwing exception - ~400ms

That is not so good, I have to say.
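For anyone who wants to repeat the measurements, here is a minimal sketch of a harness along these lines. The names StackTraceBench, AtDepth and Measure are made up for illustration, the post does not show the original benchmark code, and the absolute numbers will vary by machine, runtime and build configuration.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public static class StackTraceBench
{
    public const int Iterations = 5000; // same count as in the post
    public const int Depth = 7;         // same stack depth as in the post

    // Recurses Depth times before invoking the action, so the action runs
    // with a predictable number of extra frames above it.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void AtDepth(int remaining, Action action)
    {
        if (remaining <= 0)
        {
            action();
            return;
        }
        AtDepth(remaining - 1, action);
        GC.KeepAlive(action); // keeps the recursive call out of tail position
    }

    // Runs the action Iterations times at the chosen depth and
    // returns the total elapsed time in milliseconds.
    public static long Measure(Action action)
    {
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
            AtDepth(Depth, action);
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        Console.WriteLine("new StackTrace(true):  {0}ms",
            Measure(() => new StackTrace(true)));
        Console.WriteLine("new StackTrace(false): {0}ms",
            Measure(() => new StackTrace(false)));
        Console.WriteLine("throwing exception:    {0}ms", Measure(() =>
        {
            try { throw new InvalidOperationException(); }
            catch (InvalidOperationException e) { GC.KeepAlive(e.StackTrace); }
        }));
    }
}
```

The first call to each variant includes JIT time, so a warm-up pass before measuring would give steadier numbers.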

Well, when in doubt, cheat!

Using Reflector and some _really_ nasty stuff, I came up with this:

// Needs: System, System.Diagnostics, System.Reflection, System.Reflection.Emit, System.Threading

// Both of these are internal to mscorlib, so we have to look them up by name.
var stackFrameHelperType = typeof(object).Assembly.GetType("System.Diagnostics.StackFrameHelper");
var GetStackFramesInternal = Type.GetType("System.Diagnostics.StackTrace, mscorlib").GetMethod("GetStackFramesInternal", BindingFlags.Static | BindingFlags.NonPublic);
 
// Owned by StackTrace, with visibility checks skipped, so the IL can touch the internals.
var method = new DynamicMethod("GetStackTraceFast", typeof(object), new Type[0], typeof(StackTrace), true);
 
var generator = method.GetILGenerator();
generator.DeclareLocal(stackFrameHelperType);
// local0 = new StackFrameHelper(false, null)
generator.Emit(OpCodes.Ldc_I4_0);
generator.Emit(OpCodes.Ldnull);
generator.Emit(OpCodes.Newobj, stackFrameHelperType.GetConstructor(new[] { typeof(bool), typeof(Thread) }));
generator.Emit(OpCodes.Stloc_0);
// StackTrace.GetStackFramesInternal(local0, 0, null)
generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ldc_I4_0);
generator.Emit(OpCodes.Ldnull);
generator.Emit(OpCodes.Call, GetStackFramesInternal);
// return local0
generator.Emit(OpCodes.Ldloc_0);
generator.Emit(OpCodes.Ret);
// getTheStackTrace is a Func<object> declared elsewhere (not shown here)
getTheStackTrace = (Func<object>)method.CreateDelegate(typeof(Func<object>));

Calling getTheStackTrace 5,000 times at a depth of 7 frames takes… 54ms. And now that is a horse of a different color indeed.

And the best part is, I can use the StackFrameHelper as a key into cached stack traces.
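To make the caching idea a bit more concrete, here is a rough sketch of what such a cache might look like. It is not the NH Prof implementation. It assumes the captured frames can be reduced to an array of numeric identifiers (a long[] here); how to get that out of a StackFrameHelper is not shown in this post, and FrameKeyComparer and StackTraceCache are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Compares keys by content, so two captures of the same call path
// land on the same cache entry. The long[] key is a stand-in (assumption).
public sealed class FrameKeyComparer : IEqualityComparer<long[]>
{
    public bool Equals(long[] x, long[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null || x.Length != y.Length) return false;
        for (int i = 0; i < x.Length; i++)
            if (x[i] != y[i]) return false;
        return true;
    }

    public int GetHashCode(long[] key)
    {
        unchecked
        {
            int hash = 17;
            foreach (long id in key)
                hash = hash * 31 + id.GetHashCode();
            return hash;
        }
    }
}

// Resolves the expensive, human-readable trace only on a cache miss.
// Not thread-safe; a real profiler would need locking or a concurrent map.
public sealed class StackTraceCache
{
    private readonly Dictionary<long[], string> cache =
        new Dictionary<long[], string>(new FrameKeyComparer());

    public int Count { get { return cache.Count; } }

    public string GetOrAdd(long[] key, Func<string> resolve)
    {
        string text;
        if (!cache.TryGetValue(key, out text))
        {
            text = resolve();
            cache[key] = text;
        }
        return text;
    }
}
```

The point of the design is that the cheap capture happens on every call, while the costly step of turning frames into readable text happens once per distinct call path.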

And yes, I am aware that if anyone from the CLR team is reading this, a ninja team will be dispatched to… discuss with me the notion of supported operations vs. unsupported operations.