Benchmark cheating, a how to


I already talked about the benchmark in general, but I want to focus for a bit on showing how you can carefully craft a benchmark to say exactly what you want.

Case in point, Alex Yakunin has claimed that:

NH runs a simple query fetching 1 instance 100+ times slower when ~ 10K instances are already fetched into the Session

I am assuming that he is basing his numbers on this test:

[Image: source code of the QueryTest benchmark]

And you know what, he is right. NHibernate will get progressively slower as more items are loaded. This is by design.

Huh? NHibernate is slower by design? WTF?!

Well, it is all related to the way you build the test. And this test appears specifically designed to make NHibernate look bad. Let me go back a bit and explain. NHibernate's underlying premise is that you live in the OO world, and we take care of everything we can in the RDBMS world. One of the ways we are doing that is by automatically flushing the session if a query is performed that may be affected by changes in memory.

Huh? That still doesn’t make sense. It is much simpler to understand in code.

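// s is an open NHibernate session
using (var tx = s.BeginTransaction())
{
	var foo = s.Get<Foo>(1);
	foo.Name = "ayende"; // changed in memory only, nothing written to the database yet
	// Running the query triggers an automatic flush first, so it sees the new name.
	var list = s.CreateQuery("from Foo f where f.Name = 'ayende'").List<Foo>();
	Assert.Contains(list, foo);
	tx.Commit();
}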

The test above will pass. When the time comes to execute this query, NHibernate will check all the loaded instances that may be affected by this query, and flush any changes to the database.

NHibernate is smart enough to check only the loaded instances that may be affected, so in general, this is a very quick operation, and it means that you don’t have to worry about the current state of operations. NHibernate manages everything for you.

However, the QueryTest above exploits this very feature to show that NHibernate is slow. All queries are made against the same class, which forces NHibernate to perform a dirty check on every single loaded object before each query. With more objects loaded into memory, that dirty check is going to take longer.
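To make that cost concrete, here is a toy sketch. It is not NHibernate code: ToyEntity, ToySession and ToyBenchmark are made-up names, and the model simply assumes that every loaded instance is compared against its loaded snapshot before each query. All it does is count those comparisons.

```csharp
using System;
using System.Collections.Generic;

// Toy model only; this is NOT how NHibernate is implemented.
public class ToyEntity
{
    public int Id;
    public string Name;
    public string LoadedName; // snapshot taken when the entity was loaded
}

public class ToySession
{
    private readonly List<ToyEntity> loaded = new List<ToyEntity>();

    // Number of snapshot comparisons performed so far.
    public long DirtyChecks { get; private set; }

    // Simulates a query that returns a single instance of the same class.
    public ToyEntity QueryById(int id)
    {
        AutoFlush(); // the automatic flush runs before the query executes
        var entity = new ToyEntity { Id = id, Name = "n" + id, LoadedName = "n" + id };
        loaded.Add(entity);
        return entity;
    }

    // Compares every loaded instance against its snapshot.
    private void AutoFlush()
    {
        foreach (var e in loaded)
        {
            DirtyChecks++;
            if (e.Name != e.LoadedName)
                e.LoadedName = e.Name; // pretend the change was written to the database
        }
    }
}

public static class ToyBenchmark
{
    // Runs `queries` single-instance queries in one session and
    // returns the total number of dirty checks they caused.
    public static long CountDirtyChecks(int queries)
    {
        var session = new ToySession();
        for (int i = 1; i <= queries; i++)
            session.QueryById(i);
        return session.DirtyChecks;
    }

    public static void Main()
    {
        Console.WriteLine(CountDirtyChecks(100));   // prints 4950
        Console.WriteLine(CountDirtyChecks(10000)); // prints 49995000
    }
}
```

In this model the i-th query checks the i - 1 instances loaded before it, so n queries cost n(n - 1)/2 checks in total. Each individual query gets slower the longer the loop runs, which is the shape of the result the benchmark reports.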

Now, there is no real-world scenario where code like that benchmark would ever be written (except maybe as a bug), but the results of the “test” are being used to say that NHibernate is slow.

Of course you can show that NHibernate is slow if you build a test specifically for that.

Oh, and a hint, session.FlushMode = FlushMode.Commit; will make NHibernate skip the automatic flush on query, meaning that we will not perform any dirty checks on queries.
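For completeness, here is roughly where that line would go. This is only a sketch: it assumes a mapped Foo entity with an Id property and an already configured ISessionFactory, the helper name QueryManyTimes is made up, and it is not the code from the benchmark in question.

```csharp
using NHibernate;

// Assumed entity; its mapping is not shown here.
public class Foo
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

public static class FlushModeSketch
{
    // Hypothetical helper: runs `count` single-instance queries in one session.
    public static void QueryManyTimes(ISessionFactory factory, int count)
    {
        using (ISession s = factory.OpenSession())
        using (ITransaction tx = s.BeginTransaction())
        {
            // Flush only when the transaction commits, so queries
            // no longer trigger an automatic flush.
            s.FlushMode = FlushMode.Commit;

            for (int i = 1; i <= count; i++)
            {
                s.CreateQuery("from Foo f where f.Id = :id")
                    .SetParameter("id", i)
                    .UniqueResult<Foo>();
            }

            tx.Commit();
        }
    }
}
```

The trade-off is that queries no longer see unflushed in-memory changes, so a test that renames foo and then queries for it by name within the same transaction is no longer guaranteed to pass.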

But that is not really relevant. Tight loop benchmarks for frameworks as complex as OR/Ms are always going to lie. The only way to really create a benchmark is to create a full-blown application with several backend implementations. Even that is still a bad idea, as there are still plenty of ways to cheat. All you have to do is look at the PetShop issues from 2002/3 to figure that one out.