High performance field clobbering

Nov 03 2016

We are using a particular library in a not-so-standard way, and to get a 10x performance benefit we have to reuse a particular class from it. This class isn’t meant to be reused, but looking at its code, it is clearly possible to do so. All we need is to set the _started field back to false, and the instance can be reused.

So far, so good. Except that the field is private. We can’t just implement our own copy of the class, because the field lives in the base class that we extend in order to plug our extension into the system. We could try submitting a patch, but this is a popular library, and tying ourselves to a particular version would suck. The code also hasn’t changed since January 2012, so it is pretty stable. And yes, we are aware of the risks in doing this: it is unsupported, etc.

Now that we decided to do it, the question is how. I created the following epic class:
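A minimal stand-in for such a class might look like the sketch below. Only the names PrivateRyan (which appears in the comments) and _started come from this post; the Start method and its behavior are assumptions made purely for illustration.

```csharp
using System;

// Hypothetical stand-in for the library class. Only the names PrivateRyan
// and _started come from the post; everything else is assumed.
public class PrivateRyan
{
    private bool _started;

    // Assumed behavior: an instance refuses to be started a second time.
    public void Start()
    {
        if (_started)
            throw new InvalidOperationException("This instance was already started.");
        _started = true;
    }
}
```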

And here is the simplest option to handle it:
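As a rough sketch of the simplest approach, assuming the hypothetical PrivateRyan class (the IsStarted property is added here only so the effect can be observed), plain reflection looks the field up and sets it on every call:

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in: _started begins as true to simulate a used instance.
public class PrivateRyan
{
    private bool _started = true;
    public bool IsStarted => _started; // assumed helper, for observation only
}

public static class ReflectionReset
{
    public static void Reset(PrivateRyan instance)
    {
        // The field is looked up by name on every call, which is simple
        // but makes each call pay for the metadata search.
        FieldInfo field = typeof(PrivateRyan).GetField(
            "_started", BindingFlags.Instance | BindingFlags.NonPublic);
        field.SetValue(instance, false);
    }
}
```

One thing to watch for when the field is declared on a base class: GetField does not return private fields of base types when called on the derived type, so the lookup has to go through the declaring type.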

And that gives us:

Can we do better?

What happens if we cache the field lookup?
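A sketch of the cached variant, under the same assumptions (hypothetical PrivateRyan class, IsStarted added only for observation): the FieldInfo is resolved once and kept in a static field.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in: _started begins as true to simulate a used instance.
public class PrivateRyan
{
    private bool _started = true;
    public bool IsStarted => _started; // assumed helper, for observation only
}

public static class CachedReflectionReset
{
    // Resolved once, when the type is first used.
    private static readonly FieldInfo StartedField = typeof(PrivateRyan).GetField(
        "_started", BindingFlags.Instance | BindingFlags.NonPublic);

    public static void Reset(PrivateRyan instance)
    {
        // No name lookup here, but SetValue still boxes the bool and
        // validates its arguments on every call.
        StartedField.SetValue(instance, false);
    }
}
```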

That is a significant improvement, right?

But that is still quite high for me. Can we do better still?

Let us try some dynamic code generation. In this case, we can’t use the much easier Expression class to do it, and have to go with direct IL generation, which gives:
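A minimal sketch of the IL generation approach, assuming the hypothetical PrivateRyan class again. The EmittedReset and SetStarted names are made up for illustration; the DynamicMethod is associated with PrivateRyan and created with skipVisibility set to true so the emitted code may touch the private field.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical stand-in: _started begins as true to simulate a used instance.
public class PrivateRyan
{
    private bool _started = true;
    public bool IsStarted => _started; // assumed helper, for observation only
}

public static class EmittedReset
{
    // Built once; calling it costs a delegate invocation plus a field store.
    public static readonly Action<PrivateRyan, bool> SetStarted = Build();

    private static Action<PrivateRyan, bool> Build()
    {
        FieldInfo field = typeof(PrivateRyan).GetField(
            "_started", BindingFlags.Instance | BindingFlags.NonPublic);

        var method = new DynamicMethod(
            "SetStarted",
            typeof(void),
            new[] { typeof(PrivateRyan), typeof(bool) },
            typeof(PrivateRyan), // owner type
            true);               // skipVisibility

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);      // push the instance
        il.Emit(OpCodes.Ldarg_1);      // push the new value
        il.Emit(OpCodes.Stfld, field); // instance._started = value
        il.Emit(OpCodes.Ret);

        return (Action<PrivateRyan, bool>)method.CreateDelegate(
            typeof(Action<PrivateRyan, bool>));
    }
}
```

Calling EmittedReset.SetStarted(instance, false) then resets the instance without any reflection on the hot path.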

And the benchmark result?

That is pretty awesome. For comparison, I also benchmarked a static delegate and a direct set.
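Those two baselines could be sketched as follows. This is only an assumption about their shape: PublicRyan and Baselines are hypothetical names, and the field is made public because a direct set needs ordinary access to it.

```csharp
using System;

// Hypothetical class with an accessible field, used only for the baselines.
public class PublicRyan
{
    public bool Started = true;
}

public static class Baselines
{
    private static void Set(PublicRyan instance, bool value)
    {
        instance.Started = value;
    }

    // Baseline 1: a delegate that points at an ordinary static method.
    public static readonly Action<PublicRyan, bool> StaticDelegate = Set;

    // Baseline 2: a plain field assignment with no delegate involved.
    public static void DirectSet(PublicRyan instance)
    {
        instance.Started = false;
    }
}
```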

And those give me:

But I think that 2.5 ns is fast enough for me here :)

Comments

03 Nov 2016
11:12 AM

Uri

Is the famous library named Lucene? If so, I know there are some efforts to get 4.8 out. Thanks a lot for sharing, this is very useful.

03 Nov 2016
18:48 PM

Christophe

Is there a performance advantage in using DynamicMethod, vs building a simple Expression tree and generating the lambda via Expression.Lambda<Action<PrivateRyan, bool>>(...).Compile()? (Not talking about the initial cost of generating the thing, but calling the resulting lambda at runtime.)

I usually use Expression Trees for dynamic code, because they are simpler to maintain than having to understand the equivalent IL code.

04 Nov 2016
12:48 PM

Oren Eini

Christophe, You can do that, but it has limits, in particular, you typically cannot avoid visibility, and just setting a field cannot be done in an expression.

04 Nov 2016
14:47 PM

Christophe

Are you sure? I use them all the time to access non-public fields, methods, ctors, and even to deserialize structs/classes with auto-properties where I need to access the private backing fields.

For example, here is one of my helper methods, specifically to build "setters" for Fields (ignore the fact that the action takes object for the instance and params, it would work the same for typed args)

[CanBeNull]
public static Action<object, object> CompileSetter([NotNull] this FieldInfo field)
{
    Contract.NotNull(field, nameof(field));
    if (field.IsInitOnly) return null; // readonly
    
    // "void func(object instance, object value) { ((instance.TYPE)instance).Field = (value.TYPE)value; }"
    var prmInstance = Expression.Parameter(typeof(object), "instance");
    var prmValue = Expression.Parameter(typeof(object), "value");
    
    var body = Expression.Assign(Expression.Field(prmInstance.CastFromObject(field.DeclaringType), field), prmValue.CastFromObject(field.FieldType));
    return Expression.Lambda<Action<object, object>>(body, prmInstance, prmValue).Compile();
}
        
public static Expression CastFromObject([NotNull] this Expression expr, [NotNull] Type targetType)
{
    return expr.Type == targetType ? expr : targetType.IsClass ? Expression.TypeAs(expr, targetType) : targetType.IsValueType ? Expression.Unbox(expr, targetType) : Expression.Convert(expr, targetType);
}

06 Nov 2016
11:03 AM

Oren Eini

Christophe, You are correct, thanks! Funnily enough, your version is faster:

Method	Median	StdDev
CodeGeneration	2.4276 ns	0.0977 ns
CodeGeneration2	1.2444 ns	0.0786 ns

Not quite sure why, though.

06 Nov 2016
11:13 AM

Christophe

Indeed, that is weird, because I thought that Compile() was just using an ExpressionTree Visitor that would Emit(..) the appropriate IL ? Maybe the JIT is doing magic under the covers?

I also tried using Roslyn for codegen at runtime, because sometimes generating a string with the C# code that does what I want is a bit easier than cobbling together expression trees, especially when you need to build larger methods (like ones that deserialize all the fields of a struct in one go, instead of calling lots of smaller methods for each field). The startup cost of Roslyn is a lot higher, but if you are already using it for other things (like Razor views), it becomes "free". The only downside is that you can't do fancy stuff like calling into private/internal members.

01 Dec 2016
20:52 PM

Michael

A bit late, but I can explain why his code is faster. You see, Christophe's code is closed over an object (in this case the empty closure object). You can get the same result by creating a dynamic method that closes over an empty object. Delegate invocations are modelled as instance calls, so if you aren't closing over anything, the call has to go through a thunk that shuffles your parameters.
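What Michael describes could be sketched roughly like this, assuming the hypothetical PrivateRyan class from the post. The dynamic method takes an extra leading object parameter, and the delegate is bound to a dummy object through the CreateDelegate overload that accepts a target. The ClosedEmittedReset name is made up for illustration, and the sketch makes no claim about the resulting timings.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical stand-in: _started begins as true to simulate a used instance.
public class PrivateRyan
{
    private bool _started = true;
    public bool IsStarted => _started; // assumed helper, for observation only
}

public static class ClosedEmittedReset
{
    public static readonly Action<PrivateRyan, bool> SetStarted = Build();

    private static Action<PrivateRyan, bool> Build()
    {
        FieldInfo field = typeof(PrivateRyan).GetField(
            "_started", BindingFlags.Instance | BindingFlags.NonPublic);

        // The first parameter is the bound "closure" object; it is never read.
        var method = new DynamicMethod(
            "SetStartedClosed",
            typeof(void),
            new[] { typeof(object), typeof(PrivateRyan), typeof(bool) },
            typeof(PrivateRyan), // owner type
            true);               // skipVisibility

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_1);      // the PrivateRyan instance
        il.Emit(OpCodes.Ldarg_2);      // the new value
        il.Emit(OpCodes.Stfld, field); // instance._started = value
        il.Emit(OpCodes.Ret);

        // Binding a target makes this a closed delegate over the dummy object.
        return (Action<PrivateRyan, bool>)method.CreateDelegate(
            typeof(Action<PrivateRyan, bool>), new object());
    }
}
```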

Oren Eini

CEO of RavenDB