Ayende @ Rahien

It's a girl

Relying on hash code implementation is BAD – part II

To be truthful, I never thought that I would have a following for this post 4 years later, but I run into that today.

The following is a part of an integration test for NH Prof:

Assert.AreEqual(47, alerts[new StatementAlert(new NHProfDispatcher())
{
	Title = "SELECT N+1"
}]);

I am reviewing all our tests now, and I nearly choked on that one. I mean, who was stupid enough to write code like this?  I mean, yes, I can understand what it is doing, sort of, but only because I have a dawning sense of horror when looking at it.

I immediately decided that the miscreant that wrote that piece of code should be publically humiliated and  chewed on by a large dog.

SVN Blame is a wonderful thing, isn’t it?

image

Hm… there is a problem here.

Actually, there are a couple of problems here. One is that we have a pretty clear indication that we have a historic artifact here. Just look at the number of version that are shown in just this small blame window. This is good enough reason to start doing full fledged ancestory inspection. The test has started life as:

[TestFixture]
public class AggregatedAlerts:IntegrationTestBase
{
	[Test]
	public void Can_get_aggregated_alerts_from_model()
	{
		ExecuteScenarioInDifferentAppDomain<Scenarios.ExecutingTooManyQueries>();

		var alerts = observer.Model.Sessions[1].AggregatedAlerts;
		Assert.AreEqual(47, alerts["SELECT N+1"]);
		Assert.AreEqual(21, alerts["Too many database calls per session"]);
	}
}

Which I think is reasonable enough. Unfortunately, it looks like somewhere along the way, someone had taken the big hammer approach to this. The code now looks like this:

Assert.AreEqual(47, alerts.First(x => x.Key.Title == "SELECT N+1"));

Now this is readable.

Oh, for the nitpickers, using hash code evaluation as the basis of any sort of logic is wrong. That is the point of this post. It is a non obvious side affect that will byte* you in the ass.

* intentional misspelling

Comments

a
04/27/2009 03:14 PM by
a

btw, this is not relying on a particular hash code but on the fact that gethc and equals are properly implemented, so that:

a.eq(b) => a.gethc() == b.gethc()

a.gethc() != b.gethc() => !a.eq(b)

and that eq and gethc are taking into account all relevant fields. however an "action" kind of object (without knowing exactly what your statementaction does) is probably not the right kind of hash code provider. gethc should only be implemented for value object imo. i also think it is a mistake in the framework design that every object has equals and gehashcode methods. that is clearly something that not every object can and should support.

Ayende Rahien
04/27/2009 03:19 PM by
Ayende Rahien

a,

Not really.

If you don't have that, you can't write hash table or dictionary.

Bill Pierce
04/27/2009 08:07 PM by
Bill Pierce

It is not obvious from your posts, excluding the readability of the code, what the non obvious side affects are of using GetHashCode in this way. Can you enlighten me?

Ayende Rahien
04/28/2009 04:31 AM by
Ayende Rahien

Bill,

My main concerns arereadability and sensability

Nik
04/28/2009 08:50 AM by
Nik

Agreed, it doesn't read well that alerts["SELECT N+1"] means get the alert with the title "SELECT N+1" especially with the one below it "alerts["Too many database calls per session"]" which looks like your using the [] to set a message not select a title.

Though in my eyes, it's more of a case of bad titling. :P But then using more code like titles to reference a full alert would probably read just as cryptic.

James
04/28/2009 12:16 PM by
James

String.GetHashCode() returns different values when running in a 32-bit or 64-bit image.

Now, if you serialized the value somewhere in a file format...

Yes, it happened. No comments please ;)

Ayende Rahien
04/28/2009 05:05 PM by
Ayende Rahien

James,

That is actually something that you should be worried about in a mixed 32 & 64 bits env.

Most distributed caches does the naive thing and use GetHashCode to get the hashcode for the key lookup.

Granted, there aren't that many mixed bits mode, but it is important to know

Comments have been closed on this topic.