Working with high level tools: A Performance Perspective


The performance question was raised in the Castle forums; hammett has posted a blog entry about it, but I have my own two cents to add. The question was about a performance test pitting NHibernate against ADO.Net, which resulted in NHibernate being quite a bit slower.

Just to note, at the moment I'm talking about performance in terms of milliseconds; I'll talk about performance in terms of days in a bit.

A test between ADO.Net and NHibernate is going to be meaningless for the most part. Sure, you can issue a SELECT to get all the data in ADO.Net, and it will always be faster than NHibernate (which eventually issues the same SELECT, using the same ADO.Net provider, etc.), but that is not a good way to test performance.

But let us start doing things that are a little more real life. Let us display a page that requires data from several tables, including a complex three-table join. Using NHibernate, I just fetch the data, and I am done. Using ADO.Net, I need to actually build the queries, either as inline strings or as stored procs, then make the call with all the messy parameters, etc. This is not going to be fun, I assure you.
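To make that concrete, here is a minimal sketch of both approaches. The table names, entity names, and variables (connectionString, departmentId, session) are invented for illustration; the NHibernate side uses the core ISession/IQuery API:

    using System.Collections;
    using System.Data.SqlClient;

    // ADO.Net: build the SQL, wire up the parameters, walk the reader by hand.
    using (SqlConnection conn = new SqlConnection(connectionString))
    using (SqlCommand cmd = conn.CreateCommand())
    {
        cmd.CommandText =
            @"SELECT e.Id, e.Name, d.Name, p.Title
              FROM Employees e
              JOIN Departments d ON e.DepartmentId = d.Id
              JOIN Positions p ON e.PositionId = p.Id
              WHERE d.Id = @deptId";
        cmd.Parameters.Add(new SqlParameter("@deptId", departmentId));
        conn.Open();
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                // ... translate each column, by position, into your objects
            }
        }
    }

    // NHibernate: state what you want, and the engine builds the SQL.
    IList employees = session
        .CreateQuery("from Employee e where e.Department.Id = :deptId")
        .SetInt32("deptId", departmentId)
        .List();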

NHibernate.dll is currently 92,679 lines of code, and not for nothing. NHibernate provides a large number of services that build on top of the persistence engine alone. There is a lot going on in there that is responsible for outputting the optimal SQL statement for each scenario, for avoiding loading duplicate information, etc.

In a typical application, the number of different queries can grow very quickly. Are you going to put the same amount of expertise into each of those? How do you handle variable queries (think of search pages with several conditions)?
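Variable queries are exactly where the criteria API shines, because conditions compose instead of being concatenated into SQL strings. A sketch, assuming a mapped Employee class and invented search inputs (name, minSalary, hiredAfter):

    using System.Collections;
    using NHibernate;
    using NHibernate.Expression;  // renamed to NHibernate.Criterion in later versions

    // Add only the conditions the user actually filled in on the search page.
    ICriteria criteria = session.CreateCriteria(typeof(Employee));
    if (name != null && name.Length > 0)
        criteria.Add(Expression.Like("Name", "%" + name + "%"));
    if (minSalary != null)
        criteria.Add(Expression.Ge("Salary", minSalary));
    if (hiredAfter != null)
        criteria.Add(Expression.Ge("HiredAt", hiredAfter));
    IList matches = criteria.List();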

Now let us talk about performance in days and weeks.

But still, you have a SQL guru on your team, and you feel confident that you can beat NHibernate's engine any time of day. That will only hold water until I throw a couple of hundred users at your application. The fastest way to get information from the database is not to go to the database in the first place.

Enabling caching in NHibernate is a matter of adding a line to the configuration file. How long is it going to take to build a caching system and integrate it into your application? (Even assuming that you are using a pre-built system, like the ASP.Net Cache or the Caching Application Block, you still need to make the calls in all the right places.) Can your caching scale to a web farm scenario?
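For reference, this is roughly what it looks like. The provider here is the SysCache provider from the NHibernate.Caches package, and the exact property names vary a little between NHibernate versions, so treat this as a sketch:

    <!-- in the NHibernate configuration: pick a second level cache provider -->
    <property name="hibernate.cache.provider_class">
        NHibernate.Caches.SysCache.SysCacheProvider, NHibernate.Caches.SysCache
    </property>

    <!-- in the class mapping: mark the class as cacheable -->
    <class name="Employee" table="Employees">
        <cache usage="read-write" />
        ...
    </class>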

Have you thought about concurrency? Transactions? Paging? Thread safety? Lazy loading?
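Take paging as one example: with NHibernate it is two extra calls on the query, and the engine emits the right paging SQL for whatever database you are running against (pageIndex and pageSize are invented here):

    // Fetch one page of results; NHibernate translates this into the
    // database's native paging construct (TOP, LIMIT, ROWNUM, ...).
    IList page = session
        .CreateQuery("from Employee e order by e.Name")
        .SetFirstResult(pageIndex * pageSize)
        .SetMaxResults(pageSize)
        .List();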

I was asked to estimate how much it would cost to build an ORM; the baseline features are listed here, and the estimate was three months for a team of three people just for an alpha release. That is nine man-months slower. And yes, you are going to need those features.

And now let us talk about the resulting code.

Pure ADO.Net code is ugly, repetitive, and very easy to get wrong. Very few applications use the pure ADO.Net approach; they usually wrap it in something else, either their own framework or something like the Data Access Block. Either way, you are working directly against the data model in the database, which is usually different from the object model in the application. Doing the translation manually is not fun, even in the simple scenarios, and it means that you have to keep several models in your head as you work.
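Even the "simple scenario" of translating a row into an object looks something like this hypothetical snippet, repeated for every query in the application, with the column order and types kept in your head:

    // Manual translation from the data model to the object model.
    Employee employee = new Employee();
    employee.Id = reader.GetInt32(0);
    employee.Name = reader.GetString(1);
    employee.DepartmentName = reader.GetString(2);
    if (reader.IsDBNull(3) == false)
        employee.Title = reader.GetString(3);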

Using NHibernate, you work at a much higher level of abstraction, which gives you the chance to do things completely differently. For instance, I have a piece of code that needs to select a list of employees based on some variables; those variables include some bit twiddling (don't ask) and business logic. I'm using NHibernate to select the employees, and then filtering the rest in the business layer. The code is a mere 10 lines that do some pretty complex work.
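The actual code isn't something I can show here, but in spirit it is something like this sketch (the status value and the rules object with its IsEligible method are invented):

    // Let NHibernate narrow the set, then apply the business logic in memory.
    IList candidates = session
        .CreateQuery("from Employee e where e.Status = :status")
        .SetString("status", "Active")
        .List();
    ArrayList selected = new ArrayList();
    foreach (Employee employee in candidates)
    {
        // the bit twiddling and business rules live here (hypothetical)
        if (rules.IsEligible(employee))
            selected.Add(employee);
    }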

Conclusions:

Hand-crafted assembly is always going to be faster than compiler-generated assembly!

Or is it? The above statement was proven false a long time ago. Going up the abstraction chain gives you more options for optimization, not fewer. But, and this is important, in order to see this, you need to view it from a high level perspective.

I'm pretty sure that there is someone capable of writing a for loop in assembly that is better than the one a compiler will generate (only for that specific scenario, otherwise it is not a good compiler :-) ), but that comparison is useless. You need to look at the whole program to see the performance, not just at a single for loop. In fact, Whole Program Optimization is a big thing in C++, and it produces some fantastic results.

The same is true for the comparison between ADO.Net and NHibernate. Take a look at the overall scenario, and you'll see that NHibernate usually performs as well as or better than hand-crafted ADO.Net code.