I opened up a section in the wiki for Rhino Commons, mostly just describing what it is at the moment.
Joel wrote about Ruby's performance, and DHH replied with a post showing how he outsourced the performance-intensive functions. To note, my only experience with Ruby is writing a few Watir tests, so I can't really say anything about Ruby's performance first hand. I agree with DHH that this is a good thing, but I wonder how to handle this in situations where the performance-critical part is something that is core to the business logic.
I'm not talking about general stuff like image resizing, encryption or Bayesian filtering (which I think you are crazy to write your own of for production). What I am talking about is an application that is not mostly data entry and pre-calculated reports (a good example of which is a bug tracking system).
Let us assume a package delivery application, which lets the user choose the route by which to send their packages. Off the top of my head, you need to calculate the cost (time & money) of moving the package along each route while taking into account service level agreements, legal responsibilities, contract issues, past history, validity dates, etc. This gets complex very fast, and the amount of data that you need to consider is fairly big, especially if you need to consider business rules like (if a customer sent more than 10 packages a month for the last 3 months, give 4% off, etc).
You can do this on the backend, pre-calculating the most common (or all) routes and their costs, but it may very well be that you simply have too many parameters to do this pre-calculation (or are prevented from doing so for business reasons).
Assuming that I had a web application in Ruby on Rails, and I wanted to make the choose-a-route page work, how would I go about building it? This is mainly a CPU-bound task, with a limited amount of data to fetch and process, but I can't easily drop down to C for this task. This is a task that involves quite a bit of business logic (just finding out whether a contract is valid or not may be a complex process, for instance), which I would have to duplicate in C (I may be able to hand the data to the C program from Ruby in a usable form, but I doubt it) in order to gain the necessary performance.
So, given this scenario (and, of course, assuming that doing this in Ruby is not performant enough), what options do I have?
Take a guess, what is going to be the result of the following code?
Enum one = DayOfWeek.Sunday;
Enum two = DayOfWeek.Sunday;
Assert.IsTrue(one == two);
Update: Tomas Restrepo hit the nail on the head in the comments.
System.Enum is a reference type. The first line actually translates to:
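Roughly speaking (a sketch of the explicit equivalent; the compiler emits a box instruction):

// DayOfWeek.Sunday is a value; assigning it to an Enum reference
// boxes it into a new object on the heap
Enum one = (Enum)(object)DayOfWeek.Sunday;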
There is boxing done here, and then the == is doing reference equality. I actually learned about this from Tomas a while ago, when discussing weird-ass questions.
This is a freakish issue because Enum inherits from ValueType, so you would expect it to keep value type semantics. Check the comments for Tomas' explanation.
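To get value semantics back, compare with Equals(), or unbox to the concrete enum type first; a quick sketch:

Assert.IsTrue(one.Equals(two));                  // Equals() compares the underlying values
Assert.IsTrue((DayOfWeek)one == (DayOfWeek)two); // unboxing restores value type semantics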
After last night's post about the performance benefits of SqlCommandSet, I decided to give the ADO.Net team a headache, and release the results in a reusable form.
The relevant code can be found here, as part of Rhino Commons. Besides exposing the batching functionality, it is a very elegant (if I do say so myself) way of exposing functionality that the original author decided to mark private / internal.
I really liked the declaration of this as well:
[ThereBeDragons("Not supported by Microsoft, but has major performance boost")]
public class SqlCommandSet : IDisposable
The usage is very simple:
SqlCommandSet commandSet = new SqlCommandSet();
commandSet.Connection = connection;
for (int i = 0; i < iterations; i++)
{
    // Append just queues the command; nothing is sent to the server yet
    SqlCommand cmd = CreateCommand(connection);
    commandSet.Append(cmd);
}
// a single round trip to the server executes the whole batch
int totalRowCount = commandSet.ExecuteNonQuery();
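CreateCommand() above just needs to return a parameterized SqlCommand; a hypothetical version, for completeness:

private static SqlCommand CreateCommand(SqlConnection connection)
{
    // any parameterized command works; the batch takes care of the parameters
    SqlCommand cmd = connection.CreateCommand();
    cmd.CommandText = "INSERT INTO [Blogs] ([blog_name]) VALUES (@name)";
    cmd.Parameters.AddWithValue("@name", "foo");
    return cmd;
}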
As a note, I spiked a little test of adding this capability to NHibernate, and it seems to be mostly working; I got 4 (out of 694) tests failing because of this. I didn't check performance yet.
I have ranted before about the annoying trend from Microsoft of welding the hood shut in most of the interesting places. One particularly painful piece is the command batching implementation in .Net 2.0 for SQL Server. This is extremely annoying mainly because the benefits of the implementation go to those who are using DataSets (ahem, not me), but are not available to anyone outside of Microsoft. (See topic: OR/M, NHibernate, etc).
Today, I decided to actually check what the performance difference is all about. In order to do this, I opened the (wonderful, amazing) Reflector and started digging. To my surprise, I found that the batching implementation seems to be centralized around a single class, System.Data.SqlClient.SqlCommandSet (which is internal, of course, to prevent it from being, you know, useful).
Since the class, and all its methods, are internal to System.Data, I had to use Reflection to pry them out into the open. I noticed that the cost of reflection was fairly high, so I converted the test to use delegates, which significantly improved performance (a sketch of the delegate trick follows the note below). The query I ran was a very simple query:
INSERT INTO [Test].[dbo].[Blogs] ([blog_name]) VALUES (@name)

With @name = 'foo' as the parameter value. The table is simply Id (identity), Blog_Name (nvarchar(50)).
Note: Before each test, I truncated the table, to make sure it is not the additional data that is causing any slowdown.
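For reference, here is roughly what the delegate-based access looks like (a sketch of the technique only; the type name comes from Reflector, and this is obviously not a supported API):

using System;
using System.Data.SqlClient;
using System.Reflection;

class SqlCommandSetWrapper
{
    // delegate types matching the internal instance methods we want to call
    delegate void AppendCommand(SqlCommand command);
    delegate int ExecuteNonQueryCommand();

    AppendCommand append;
    ExecuteNonQueryCommand executeNonQuery;

    public SqlCommandSetWrapper()
    {
        // grab the internal type out of System.Data and instantiate it
        Type type = typeof(SqlConnection).Assembly
            .GetType("System.Data.SqlClient.SqlCommandSet");
        object instance = Activator.CreateInstance(type, true);

        // bind the delegates once; after this, each call avoids the
        // per-invocation cost of MethodInfo.Invoke()
        append = (AppendCommand)Delegate.CreateDelegate(
            typeof(AppendCommand), instance, type.GetMethod("Append",
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic));
        executeNonQuery = (ExecuteNonQueryCommand)Delegate.CreateDelegate(
            typeof(ExecuteNonQueryCommand), instance, type.GetMethod("ExecuteNonQuery",
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic));
    }
}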
The Results:

The X axis is the number of inserts made; the Y axis is the number of ticks that the operation took. As you can see, there is quite a performance difference, even for small batch sizes: there is a significant difference between batching and not batching, and the reflection / delegate calls are not a big cost in this scenario.
Here is the cost of a smaller batch:

This shows a significant improvement even for more real-world loads, even when we use Reflection.
I just may take advantage of this to implement a BatchingBatcher for NHibernate; it looks like it could be a good performance benefit. Although this will probably not affect SELECT performance, which is usually the bigger issue.
You can get the code here: BatchingPerfTest.txt
I am happy to announce Rhino Mocks 2.9.1, which came after quite a dry spell...
The changes are:
- Added Message() to Method Options, allowing a more structured way to add intent to expectations, described in more detail here.
- Added operator && and || overloading to constraints
- Added Is.Matching<T>(Predicate<T> pred) constraint (a combined example follows this list)
- Better error message for trying to mock a non-virtual method call.
- Fixed stupid issue with BackToRecordAll()
- Removed outdated documentation
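For instance, the new bits compose along these lines (a sketch only; IView and its Ask method are hypothetical, and the exact chaining may differ slightly from the shipped API):

public interface IView
{
    bool Ask(string question);
}

// ...

MockRepository mocks = new MockRepository();
IView view = (IView)mocks.CreateMock(typeof(IView));
Expect.Call(view.Ask(null))
    .Message("Should confirm with the user before deleting")
    .Constraints(Is.NotNull() && Is.Matching<string>(delegate(string s)
    {
        return s.StartsWith("Delete"); // matches any "Delete ..." question
    }))
    .Return(true);
mocks.ReplayAll();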
I'm not sure how to explain it, but I actually got all of this done between 04:30 and 05:45 (in the morning).
I also moved all the documentation to the wiki, and at the moment, the greatest help of all would be to add stuff to it.
As usual, the changes can be found here
I got this page when I looked at the logs of an application that was running for about four hours...

Somehow I don't think that the logs will be very useful in this form...
- An AppDomain may hold zero or more instances of HttpApplication
- A new instance of HttpApplication may be created (for reasons that I have not been able to figure out), and a set of all the http modules will be created along with it.
- An instance of HttpApplication may be disposed (with its associated http modules) at any time.
- A request will always be served by a fully initialized HttpApplication.
Or, more correctly, you can't rely on unsetting a static variable in Dispose() and setting it in Init() (or Application_Start and Application_End), since they may be called several times.
This means that you either have to do ref counting (bad), or implement application-level variables in web scenarios (or just forgo cleaning up resources when the HttpApplication is going down).
I chose the second way, but it is not very clean, in my opinion; check out the changes that I had to make:
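The actual changes are in Rhino Commons; a simplified sketch of their shape (GlobalState and the key name are made up for illustration):

using System.Web;

public static class GlobalState
{
    const string Key = "Container.Instance";
    static object nonWebInstance; // only used outside a web context

    public static object Instance
    {
        get
        {
            // in a web context, application state gives a single well-defined
            // lifetime, no matter how many HttpApplication instances come and go
            if (HttpContext.Current != null)
                return HttpContext.Current.Application[Key];
            return nonWebInstance;
        }
        set
        {
            if (HttpContext.Current != null)
                HttpContext.Current.Application[Key] = value;
            else
                nonWebInstance = value;
        }
    }
}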
Basically, this means that when running under Web context, it will use the current application as a global variable, and when running under non-web context, it will use a static variable.

This is what an interface looks like after about two weeks of work and seventeen refactorings. At one point, it looked like this:
interface IHaveMultipleOccurrences<T> : IHaveNameAndId, IHaveValidityRange<T> where T : IHaveValidityRange<T>
But the fear of the compiler puking on me kept the design a little simpler.
<httpModules>
    <add name="UnitOfWorkModule"
         type="Rhino.Commons.HttpModules.UnitOfWorkModule, Rhino.Commons"/>
</httpModules>

But, if I put a break point on the Init() method in the Http Module, it is called twice! I ran it many times, and the "secret" seems to be multiple requests at the application's start. It looks like several HttpApplication instances are created, and this is what is causing the issue.
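A quick way to see this in action (a diagnostic sketch, not the actual UnitOfWorkModule):

using System.Threading;
using System.Web;

public class CountingModule : IHttpModule
{
    // shared across every HttpApplication instance in the AppDomain
    static int instances;

    public void Init(HttpApplication context)
    {
        // with concurrent requests at startup, this traces more than one Init,
        // one per HttpApplication instance that ASP.Net decides to create
        int count = Interlocked.Increment(ref instances);
        System.Diagnostics.Trace.WriteLine("Init #" + count);
    }

    public void Dispose()
    {
        System.Diagnostics.Trace.WriteLine(
            "Dispose, remaining: " + Interlocked.Decrement(ref instances));
    }
}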
This may be related to the previous error, since my state is app-domain global. If the ASP.Net runtime initiates more than a single HttpApplication per app-domain, and cleans it up afterward, it may cause the issues that I have seen, of disposing an HttpModule which holds global state and then not calling Init() again.
At the moment, I am going to assume that this is the case, which means that my life is even more complicated than before.
At the moment, I am using a static variable to hold the container instance, but this static variable is shared among all the HttpApplication instances. This probably means that I need to keep track of it in an Application level variable, but that gets into complicated code in non-web scenarios.
