﻿<?xml version="1.0" encoding="utf-8"?><rss version="2.0"><channel><title>Ayende @ Rahien</title><link>http://ayende.com</link><description>Ayende @ Rahien</description><copyright>Copyright (C) Ayende Rahien  2004 - 2021 (c) 2026</copyright><ttl>60</ttl><item><title>configurator commented on When mini benchmarks are important</title><description>Ayende, May we know which hash function you're using now that you changed it?
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment29</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment29</guid><pubDate>Fri, 01 Jan 2010 00:48:38 GMT</pubDate></item><item><title>Ayende Rahien commented on When mini benchmarks are important</title><description>Jose,
  
There are reasons to violate this rule, in this case, ByteBuffer is a Fly Weight, which I use to reduce memory consumption
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment28</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment28</guid><pubDate>Thu, 31 Dec 2009 14:08:46 GMT</pubDate></item><item><title>Patrick Smacchia commented on When mini benchmarks are important</title><description>On a side note Dennis, to avoid dead code, instead of 
  
  
case 6: YYY
  
case 7: XXX
  
default: return false; //dead code
  
  
I usually do
  
  
case 6: YYY
  
default:
  
  Debug.Assert(length == 7);
  
  XXX
  
  break;
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment27</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment27</guid><pubDate>Thu, 31 Dec 2009 13:42:29 GMT</pubDate></item><item><title>Jos&amp;#233; Romaniello commented on When mini benchmarks are important</title><description>Sorry, but  the hashcode should be constant for the live of the object. This is a very basic guideline. Am I missing something?
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment26</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment26</guid><pubDate>Thu, 31 Dec 2009 11:28:30 GMT</pubDate></item><item><title>Dennis commented on When mini benchmarks are important</title><description>I'm not sure how smart the JIT is...
  
If it is really intelligent, it can use the repl cmpb instruction in this case, which would make it pretty much as fast as your memory is.
  
But even without that, the 32 byte version will allow the CPU to fill all the ALUs (assuming 4 ALUs with 2op/clock) with useful stuff to do. The 8-version has some overhead in loop accounting.
  
For my benchmarking the 32byte one is about 20% faster in the TRUE case for 38 values, your luck may vary depending on your input. But I would say that in general for longer strings the 32byte will win.
  
And the x64 version is alot faster than the x32 version (for obvious reasons, but may also be since the x64 JIT is optimized for performance rather than quick compilation).
  
Using int*s instead of long*s on the 32bit is not any faster.
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment25</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment25</guid><pubDate>Thu, 31 Dec 2009 07:37:17 GMT</pubDate></item><item><title>Dennis commented on When mini benchmarks are important</title><description>Another quick optimization due to your getHashcode optimization (and since they use the same implementation) is to compare those two if they have a current valid precomputed value. But it is only useful if the common case is that they are NOT equal.
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment24</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment24</guid><pubDate>Thu, 31 Dec 2009 07:29:34 GMT</pubDate></item><item><title>Ayende Rahien commented on When mini benchmarks are important</title><description>Dennis,
  
That is a nice impl, although I 'll admit it took me a moment to understand it.
  
Wouldn't it be just as fast to do?
  
  
[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
  
private static unsafe bool UnSafeArrayEquals(byte[] strA, byte[] strB)
  
{
  
	int length = strA.Length;
  
	if (length != strB.Length)
  
	   return false;
  
  
	fixed (byte* str = strA, str2 = strB)
  
	{
  
	   long* chPtr3 = (long*)str;
  
	   long* chPtr4 = (long*)str2;
  
	   while (length &gt;= 8)
  
	   {
  
		  if (*chPtr3++ != *chPtr4++)
  
		  {
  
			 return false;
  
		  }
  
		  length -= 8;
  
	   }
  
	   switch (length)
  
	   {
  
		  case 0: return true;
  
		  case 1: return *(byte*)chPtr3 != *(byte*)chPtr4;
  
		  case 2: return *(short*)chPtr3 != *(short*)chPtr4;
  
		  case 3: return *(short*)chPtr3 != *(short*)chPtr4 &amp;&amp; *(byte*)(((short*)chPtr3) + 1) != *(byte*)(((short*)chPtr4) + 1);
  
		  case 4: return *(int*)chPtr3 != *(int*)chPtr4;
  
		  case 5: return *(int*)chPtr3 != *(int*)chPtr4 &amp;&amp; *(byte*)(((int*)chPtr3) + 1) != *(byte*)(((int*)chPtr4) + 1);
  
		  case 6: return *(int*)chPtr3 != *(int*)chPtr4 &amp;&amp; *(short*)(((int*)chPtr3) + 1) != *(short*)(((int*)chPtr4) + 1);
  
		  case 7: return *(int*)chPtr3 != *(int*)chPtr4 &amp;&amp; *(short*)(((int*)chPtr3) + 1) != *(short*)(((int*)chPtr4) + 1) &amp;&amp; *(((byte*)chPtr3) + 6) != *(((byte*)chPtr4) + 6);
  
		  default: return false; //dead code
  
	   }
  
	}
  
}
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment23</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment23</guid><pubDate>Thu, 31 Dec 2009 07:25:52 GMT</pubDate></item><item><title>Dennis commented on When mini benchmarks are important</title><description>In that case you can get it about double as fast with unsafe code (or 5% of total runtime according to your profile):
  
  
      [ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
  
      private static unsafe bool UnSafeArrayEquals(byte[] strA, byte[] strB)
  
      {
  
         int length = strA.Length;
  
         if (length != strB.Length)
  
            return false;
  
  
         fixed (byte* str = strA, str2 = strB)
  
         {
  
            long* chPtr3 = (long*)str;
  
            long* chPtr4 = (long*)str2;
  
            while (length &gt;= 32)
  
            {
  
               if ((*chPtr3 != *chPtr4) ||
  
                  (*(chPtr3 + 1) != *(chPtr4 + 1)) ||
  
                  (*(chPtr3 + 2) != *(chPtr4 + 2)) ||
  
                  (*(chPtr3 + 3) != *(chPtr4 + 3)))
  
               {
  
                  return false;
  
               }
  
               chPtr3 += 4;
  
               chPtr4 += 4;
  
               length -= 32;
  
            }
  
            while (length &gt;= 8)
  
            {
  
               if (*chPtr3 != *chPtr4)
  
               {
  
                  return false;
  
               }
  
               chPtr3 += 1;
  
               chPtr4 += 1;
  
               length -= 8;
  
            }
  
            switch (length)
  
            {
  
               case 0: return true;
  
               case 1: return *(byte*)chPtr3 != *(byte*)chPtr4;
  
               case 2: return *(short*)chPtr3 != *(short*)chPtr4;
  
               case 3: return *(short*)chPtr3 != *(short*)chPtr4 &amp;&amp; *(byte*)(((short*)chPtr3) + 1) != *(byte*)(((short*)chPtr4) + 1);
  
               case 4: return *(int*)chPtr3 != *(int*)chPtr4;
  
               case 5: return *(int*)chPtr3 != *(int*)chPtr4 &amp;&amp; *(byte*)(((int*)chPtr3) + 1) != *(byte*)(((int*)chPtr4) + 1);
  
               case 6: return *(int*)chPtr3 != *(int*)chPtr4 &amp;&amp; *(short*)(((int*)chPtr3) + 1) != *(short*)(((int*)chPtr4) + 1);
  
               case 7: return *(int*)chPtr3 != *(int*)chPtr4 &amp;&amp; *(short*)(((int*)chPtr3) + 1) != *(short*)(((int*)chPtr4) + 1) &amp;&amp; *(((byte*)chPtr3) + 6) != *(((byte*)chPtr4) + 6);
  
               default: return false; //dead code
  
            }
  
         }
  
      }
  
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment22</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment22</guid><pubDate>Thu, 31 Dec 2009 07:19:36 GMT</pubDate></item><item><title>Ayende Rahien commented on When mini benchmarks are important</title><description>Andrew,
  
Agreed, and it was changed.
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment21</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment21</guid><pubDate>Thu, 31 Dec 2009 07:06:08 GMT</pubDate></item><item><title>Ayende Rahien commented on When mini benchmarks are important</title><description>Dennis
  
I ended up doing something very much like yours, and byteBuffer.Buffer is an array of bytes.
  
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment20</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment20</guid><pubDate>Thu, 31 Dec 2009 07:05:58 GMT</pubDate></item><item><title>Ayende Rahien commented on When mini benchmarks are important</title><description>Configurator,
  
I implemented something similar, but note that I can't do it in StringEqaulsToBuffer, for the simple reason that the byte buffer has an offset that would kill that optimization
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment19</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment19</guid><pubDate>Thu, 31 Dec 2009 06:57:00 GMT</pubDate></item><item><title>Ayende Rahien commented on When mini benchmarks are important</title><description>Greg, Rafal &amp; Daniel,
  
I was able to fix the hash function on both locations, thanks.
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment18</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment18</guid><pubDate>Thu, 31 Dec 2009 06:50:44 GMT</pubDate></item><item><title>Dennis commented on When mini benchmarks are important</title><description>private static bool StringEqaulsToBuffer(ByteString byteString, ByteBuffer byteBuffer)
  
{
  
    var strLen = byteString.Length;
  
    if(strLen != byteBuffer.Length)
  
        return false;
  
   var bytestr = byteString.bytes;
  
   var bufferstr = byteBuffer.Buffer;
  
   var offset = byteBuffer.Offset;
  
    for (int i = 0; i &lt; strLen; offset++, i++)
  
    {
  
        if(bytestr[i] != bufferstr[offset])
  
            return false;
  
    }
  
    return true;
  
}
  
  
But how is byteBuffer.Buffer[] implemented? That might allow for further optimization
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment17</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment17</guid><pubDate>Thu, 31 Dec 2009 03:44:48 GMT</pubDate></item><item><title>Andrew commented on When mini benchmarks are important</title><description>RE: Custom has function, its probably also worth checking how even the distribution is between buckets in hashtables/dictionaries etc.
  
  
Perhaps there's a performance problem somewhere because the hashcode is too similar for many strings etc.
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment16</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment16</guid><pubDate>Thu, 31 Dec 2009 03:18:41 GMT</pubDate></item><item><title>configurator commented on When mini benchmarks are important</title><description>@Alex: This example is the actual implementation of string.GetHashCode(), so I'm not the person to ask questions about it. I will note though that the enumeration is not reverse; only 'i' is reversed, and that is to prevent accessing the Length property multiple times. The pointer is incremented in each step though, so the enumeration is actually forward.
  
  
Note that this is an implementation for a char array. A similar one for bytes (This should be checked before using it, it's notepad code):
  
  
public override unsafe int GetHashCode()
  
{
  
fixed (byte* str = ((byte*) Buffer))
  
{
  
byte* bytePtr = str;
  
int num = 0x15051505;
  
int num2 = num;
  
int* numPtr = (int*) bytePtr;
  
for (int i = Buffer.Length; i &gt; 0; i -= 8)
  
{
  
num = (((num &lt;&lt; 5) + num) + (num &gt;&gt; 0x1b)) ^ numPtr[0];
  
if (i &lt;= 4)
  
{
  
break;
  
}
  
num2 = (((num2 &lt;&lt; 5) + num2) + (num2 &gt;&gt; 0x1b)) ^ numPtr[1];
  
numPtr += 2;
  
}
  
return (num + (num2 * 0x5d588b65));
  
}
  
  
What's changed? 'char' changed to 'byte', 'i -= 4' changed to 'i -= 8' (because a char is two bytes), and '(i &lt;= 2)' changed to '(i &lt;= 4)'. I think that's pretty much it and this should work for byte arrays.
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment15</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment15</guid><pubDate>Wed, 30 Dec 2009 22:22:13 GMT</pubDate></item><item><title>tobi commented on When mini benchmarks are important</title><description>Your hash function is really bad because it only includes the last 4 bytes. If that is enough for you you can optimize this method to be a lot faster^^
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment14</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment14</guid><pubDate>Wed, 30 Dec 2009 20:20:30 GMT</pubDate></item><item><title>Alex Yakunin commented on When mini benchmarks are important</title><description>Hmm... Just noticed he wrote GetHashCode example - so it's not clear why he uses reverse enumeration in this case. But what I wrote above is still intact for comparison.
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment13</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment13</guid><pubDate>Wed, 30 Dec 2009 20:18:31 GMT</pubDate></item><item><title>Alex Yakunin commented on When mini benchmarks are important</title><description>Nice you started to care about such benchmarks ;)
  
  
For exactly this case the optimization suggested by configurator must be ideal:
  
- Strings are more frequently differ in the end, so better to compare them in reverse order. If you'd take a look at built-in string comparers in .NET, you'll find they use this assumption.
  
- It's really better to use pointers accessing integer values, especially if you're going to use above assumption (JIT compiler does not recognize backward array enumeration + ~ it is anyway a bit faster then byte-by-byte comparison).
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment12</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment12</guid><pubDate>Wed, 30 Dec 2009 20:15:05 GMT</pubDate></item><item><title>firefly commented on When mini benchmarks are important</title><description>Greg is right, most of the time the JIT compiler is really good at inlining property accessor for a loop. 
  
I also agree with the commenters, C and Assembly programmers are extremely good at bit manipulation so I would study their code to see how they do it. Perhaps do a search on bit masking, bit shifting and bitwise operation will get you to the right place. Something similar to what configurator is doing. Though his is specific so you might want to come up with your own.
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment11</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment11</guid><pubDate>Wed, 30 Dec 2009 19:27:39 GMT</pubDate></item><item><title>configurator commented on When mini benchmarks are important</title><description>I agree with Daniel and Rafal. This hash function is horrible and will cost you. Here's a good, fast and tested hash function - it will give different results on 32-bit and 64-bit machines but that doesn't matter. Hope you don't mind it's unsafe.
  
  
public override unsafe int GetHashCode()
  
{
  
    fixed (char* str = ((char*) Buffer))
  
    {
  
        char* chPtr = str;
  
        int num = 0x15051505;
  
        int num2 = num;
  
        int* numPtr = (int*) chPtr;
  
        for (int i = Buffer.Length; i &gt; 0; i -= 4)
  
        {
  
            num = (((num &lt;&lt; 5) + num) + (num &gt;&gt; 0x1b)) ^ numPtr[0];
  
            if (i &lt;= 2)
  
            {
  
                break;
  
            }
  
            num2 = (((num2 &lt;&lt; 5) + num2) + (num2 &gt;&gt; 0x1b)) ^ numPtr[1];
  
            numPtr += 2;
  
        }
  
        return (num + (num2 * 0x5d588b65));
  
    }
  
}
  
  
  
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment10</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment10</guid><pubDate>Wed, 30 Dec 2009 16:29:30 GMT</pubDate></item><item><title>configurator commented on When mini benchmarks are important</title><description>The JIT has optimizations for array access in a loop when the loop variable is bound to the array size (no bounds checking). If ByteBuffer.Buffer is actually an array, and if the entire buffer is used commonly (Offset = 0, Length = buffer.Length), you could try:
  
  
private static bool StringEqaulsToBuffer(ByteString byteString, ByteBuffer byteBuffer)
  
{
  
    var buffer = byteBuffer.Buffer;
  
    if(byteString.Length != buffer.Length)
  
        return false;
  
    for (int i = 0; i &lt; buffer.Length; i++)
  
    {
  
        if(byteString.bytes[i] != buffer[i])
  
            return false;
  
    }
  
    return true;
  
}
  
  
You'd have to make the Offset-using version a special case though, so if you use it often this will be no optimization at all...
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment9</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment9</guid><pubDate>Wed, 30 Dec 2009 16:20:11 GMT</pubDate></item><item><title>Rafal commented on When mini benchmarks are important</title><description>Bad hash function will make hashtable operations more costly, so probably its better to fix the function than try to run the bad version faster
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment8</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment8</guid><pubDate>Wed, 30 Dec 2009 15:51:44 GMT</pubDate></item><item><title>Daniel commented on When mini benchmarks are important</title><description>If you have to use that stupid hash function, surely you can implement it by looking only at the last 4 characters? That would save that loop which can get expensive for long strings.
  
  
In fact, if we're talking x86 assembly, GetHashCode() for Length&gt;=4 can be implemented in a few instructions
  
// given EBX=Buffer+Offset and ECX=Length
  
MOV EAX, EBX[ECX-4]
  
BSWAP EAX // convert little endian to big endian
  
RET
  
  
I don't know if there's a way to make the .NET JIT output something like that, I would try something like
  
public override int GetHashCode()
  
{
  
    if (Length &gt;= 4) {
  
      return IPAddress.HostToNetworkOrder(BitConverter.ToInt32(Buffer, Offset + Length - 4));
  
    } else {
  
       // your code
  
   }
  
}
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment7</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment7</guid><pubDate>Wed, 30 Dec 2009 14:29:29 GMT</pubDate></item><item><title>Ayende Rahien commented on When mini benchmarks are important</title><description>Greg
  
I haven't considered the effect of running under the profiler, you are right. But the perf boost was noticable even without the profiler
  
you are correct about the hash func as well, in my defence, I am matching the behaviour of a third party
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment6</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment6</guid><pubDate>Wed, 30 Dec 2009 12:35:55 GMT</pubDate></item><item><title>Ayende Rahien commented on When mini benchmarks are important</title><description>Andrews
  
yes I fond that ot after I did the profiling and dht want to redo it
  
and I recalc immediately
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment5</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment5</guid><pubDate>Wed, 30 Dec 2009 12:27:59 GMT</pubDate></item><item><title>Patrick Smacchia commented on When mini benchmarks are important</title><description>Some others tips on Micro optimization I found btw: 
  
  
[codebetter.com/.../...to-increase-performance.aspx](http://codebetter.com/blogs/patricksmacchia/archive/2009/04/19/micro-optimization-tips-to-increase-performance.aspx)</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment4</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment4</guid><pubDate>Wed, 30 Dec 2009 11:51:20 GMT</pubDate></item><item><title>Patrick Smacchia commented on When mini benchmarks are important</title><description>Micro-optimizations are needed I think:
  
[codebetter.com/.../...need-micro-optimization.aspx](http://codebetter.com/blogs/patricksmacchia/archive/2009/04/19/do-we-need-micro-optimization.aspx)  
  
I wrote also concerning optimization on for loop:
  
[codebetter.com/.../...e-net-code-performances.aspx](http://codebetter.com/blogs/patricksmacchia/archive/2008/11/19/an-easy-and-efficient-way-to-improve-net-code-performances.aspx)  
  
Btw, you can still have a win on the first example by creating a temporary reference to byteBuffer.Buffer (if it is a reference type):
  
  
    for (int i = 0; i &lt; strLen; i++) {
  
        if(byteString.bytes[i] != ********byteBuffer.Buffer*********[byteBuffer.Offset+i])
  
            return false;
  
    }
  
  
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment3</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment3</guid><pubDate>Wed, 30 Dec 2009 11:50:00 GMT</pubDate></item><item><title>Greg Beech commented on When mini benchmarks are important</title><description>Are you sure the first problem with get_Length/get_Item isn't caused by running the code under the profiler (a Heisenbug, in effect)? In optimised code running normally I'd expect to see the property accessor methods inlined by the CLR. Often profilers disable inlining to get more 'accurate' results which can lead to false positives like this when the method call overhead becomes significant; you may want to check that your profiler settings and ensure that it was not disabling method inlining.
  
  
In addition, your GetHashCode function doesn't appear to provide a true hash. As you're simply left-shifting all the previous data by 8, you're actually just discarding all but the last four bytes of data. I'd consider replacing the aggregation term with a multiply/xor combination, e.g. "(ret * 23) ^ Buffer[i]", which will give you a true hash.
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment2</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment2</guid><pubDate>Wed, 30 Dec 2009 11:32:26 GMT</pubDate></item><item><title>Andrew commented on When mini benchmarks are important</title><description>(Nitpick: Equals is spelled wrong in StringEqualsToBuffer)
  
  
With ResetHash, are you recalculating the hash immediately, or simply clearing the hash value, allowing it to be calculated again when needed, lazily?
  
  
These posts are interesting, thanks for making them!
</description><link>http://ayende.com/4344/when-mini-benchmarks-are-important#comment1</link><guid>http://ayende.com/4344/when-mini-benchmarks-are-important#comment1</guid><pubDate>Wed, 30 Dec 2009 10:09:40 GMT</pubDate></item></channel></rss>