Note: This post was written by Federico.
Recently at CoreFX there has been a proposal to deal with the typical case of everyone writing their own hash combining logic. I really like the framework to push that kind of functionality because hashing is one of those things that unless you really know what you are doing, it can backfire pretty bad (and even if you know, you can step over the cliff too). But this post is not about the issue itself, but it showcases a few little details that happen at JIT land (and that we usually abuse, of course J). This is the issue: https://github.com/dotnet/corefx/issues/8034#issuecomment-260733285
Let me illustrate with some examples.
ValueTuple is a new type that is being introduced that is necessary for some of the new tuples functionality in C# 7. Because hashing is important, of course they implemented the ability to combine hashes. Now, let’s suppose that we take the actual code hashing code that is being used for ValueTuple and use constants to call it.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
internal static class HashHelpers { public static int Combine(int h1, int h2) { // The jit optimizes this to use the ROL instruction on x86 // Related GitHub pull request: dotnet/coreclr#1830 uint shift5 = ((uint)h1 << 5) | ((uint)h1 >> 27); return ((int)shift5 + h1) ^ h2; } } [MethodImpl(MethodImplOptions.NoInlining)] public static int TryStaticCall() { return HashHelpers.Combine(10202, 2003); } [MethodImpl(MethodImplOptions.NoInlining)] public static int TryValueTuple() { return ValueTuple.Create(10202, 2003).GetHashCode(); }
Now under an optimizing compiler chances are that there shouldn't be any difference, but in reality there is.
This is the actual machine code for ValueTuple:
So now what can be seen here? First we are creating an struct in the stack, then we are calling the actual hash code.
Now compare it with the use of HashHelper.Combine which for all purposes it could be the actual implementation of Hash.Combine
I know!!! How cool is that???
But let’s not stop there... let’s use actual parameters:
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
[MethodImpl(MethodImplOptions.NoInlining)] public static int TryStaticCall(int h1, int h2) { return HashHelpers.Combine(h1, h2); } [MethodImpl(MethodImplOptions.NoInlining)] public static int TryValueTuple(int h1, int h2) { return ValueTuple.Create(h1, h2).GetHashCode(); } static unsafe void Main(string[] args) { var g = new Random(); int h1 = g.Next(); int h2 = g.Next(); Console.WriteLine(TryStaticCall(h1, h2)); Console.WriteLine(TryValueTuple(h1, h2)); }
We use random values here to force the JIT to treat them as parameters and not being able to eliminate the code and convert it yet again into a constant.
The good thing, this is extremely stable. But let’s compare it with the alternative:
The question that you may be asking yourselves is: Does that scale?. So, let’s go overboard...
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
internal static class HashHelpers { public static int Combine(int h1, int h2) { // The jit optimizes this to use the ROL instruction on x86 // Related GitHub pull request: dotnet/coreclr#1830 uint shift5 = ((uint)h1 << 5) | ((uint)h1 >> 27); return ((int)shift5 + h1) ^ h2; } public static int Combine(int h1, int h2, int h3, int h4) { return Combine(Combine(h1, h2), Combine(h3, h4)); } } [MethodImpl(MethodImplOptions.NoInlining)] public static int TryStaticCall(int h1, int h2, int h3, int h4) { return HashHelpers.Combine(h1, h2, h3, h4); }
And the result is pretty illustrative:
The takeaway of the analysis is simple: even if the holding type is a struct, it doesn't mean it is free :)