Ayende @ Rahien

It's a girl

re: Are you smart enough to do without TDD

Daniel has posted a reply to my post, titling it:  Are you smart enough to do without TDD. I more or less expected to get responses like that, which was why I was hesitant to  post it. Contrary to popular opinion, I don’t really enjoy being controversial.

There are two main points that I object to in his post:

You see, Ayende appears to say that if you're smart enough, you'll just know what code to write, just like that. Ergo, if you don't know, maybe you're not that smart and hence you would need this technique for losers called Test Driven Design/Development.

That is not what I said, please don’t put words in my mouth. What I said was: “The idea behind TDD is to use the tests to drive the design. Well, in this case, I don’t have any design to drive.” Combine this with my concepts & features architecture, where the main tenet is: “A feature creation may not involve any design activity,” and it should be clear why TDD simply doesn’t work for my scenario.

And his attack on Rhino Mocks:

Moq vs Rhino Mocks: he [Ayende, it seems] read the (useless IMO) literature on mocks vs stubs vs fakes, had apparently a clear idea of what to do, and came up with Rhino's awkward, user unfriendly and hard to learn API with a myriad of concepts and options, and a record-replay-driven API (ok, I'm sure it was not his original idea, but certainly it's his impl.) which two years ago seemed to him to stand at the core of mocking. Nowadays not only he learned what I've been saying all along, that "dynamic, strict, partial and stub... No one cares", but also is planning to remove the record / playback API too.

This is just full of misinformation. Let me see how:

  • Rhino Mocks is 5 years old.
  • Rhino Mocks came out for .NET 1.0.
  • Rhino Mocks actually predates most of the mocks vs. stubs debate.

I keep Rhino Mocks updated as new concepts and syntax options come along. Yes, AAA is easier, but AAA relies on having the syntax options that we have in C# 3.0. Rhino Mocks didn’t start from there, it started a lot earlier, and it is a testament to its flexibility that I was able to adapt it to every change along the way.
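To show the difference, here is a rough sketch of what the AAA style looks like with Rhino Mocks 3.5 and xUnit.net; the IEmailSender and OrderProcessor types are invented for the example, they are not part of Rhino Mocks or of any project mentioned here:

using Rhino.Mocks;
using Xunit;

// Illustrative only: IEmailSender and OrderProcessor are made up for this sketch.
public interface IEmailSender
{
    void Send(string to, string subject);
}

public class OrderProcessor
{
    private readonly IEmailSender emailSender;

    public OrderProcessor(IEmailSender emailSender)
    {
        this.emailSender = emailSender;
    }

    public void Confirm(string customerEmail)
    {
        emailSender.Send(customerEmail, "Order Confirmation");
    }
}

public class AaaStyleExample
{
    [Fact]
    public void WillSendConfirmationEmail()
    {
        // Arrange: no record/replay, just generate the mock
        var sender = MockRepository.GenerateMock<IEmailSender>();
        var processor = new OrderProcessor(sender);

        // Act
        processor.Confirm("customer@example.com");

        // Assert: lambda based verification, which is why AAA leans on C# 3.0
        sender.AssertWasCalled(x => x.Send("customer@example.com", "Order Confirmation"));
    }
}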

Oh, and Rhino Mocks was developed with TDD, fully. Still is, for that matter. So I find it annoying that someone attacks it on these grounds without really understanding how it worked.

Duct tape programmers

Sometimes I read something and I just know that responding off the cuff would be a big mistake. Joel’s latest essay, duct tape programmers, is one such case.

In many ways, I feel that this and this say it all:

image

Sometimes I feel like Joel is on a quest to eradicate good design practices.

Let us start from where I do agree with him. Yes, some people have a tendency to overcomplicate things and code themselves into a corner. Yes, you should keep an eye on your deadlines and deliver.

But to go from there to disparage good practices? To actually encourage brute force hacking?

I think that Joel’s dream developer is the guy that keeps copy/pasting stuff he finds on the web until it looks like it is working. At the very least, it will make sure that his bug tracking system is used.

And the examples that he gives?

Here’s what Zawinski says about Netscape: “It was decisions like not using C++ and not using threads that made us ship the product on time.”

image

Oh, wait, let me see. Netscape is the company that:

  • Routinely shipped a browser that kept crashing
  • Wasn’t able to compete with IE
  • Got their source code into a bad enough shape that they had to rewrite it from scratch and lose 5 – 6 YEARS doing so
  • Collapsed

Yep, sounds like this duct tape notion really worked out for them, no?

Here is the deal, every tool and approach can be overused.

But that is part of being a professional, you have to know how to balance things. I am not sure what bee got in Joel’s bonnet, but it sure seems to cause him to have a knee-jerk reaction whenever good design principles are discussed.

Shipping software is easy, you can cobble together something that sort of works and you have a shipping product. People will even buy it from you. All you have to do is look around and see it.

The hard part is to keep releasing software, and with duct tape, your software will be taken away from you by software protective services.

image

Don’t, just don’t.

Scenario Driven Tests

I originally titled this blog post: Separate the scenario under test from the asserts. I intentionally use the terminology scenario under test, instead of calling it class or method under test.

One of the main problems with unit testing is that we are torn between competing forces. One is the usual drive for abstraction and eradication of duplication, the second is clarity of the test itself. Karl Seguin does a good job covering that conflict.

I am dealing with the issue by the simple expedient of forbidding anything but asserts in the test method. And no, I don’t mean something like BDD, where the code under test is being set up in the constructor or the context initialization method.

I tend to divide my tests code into four distinct parts:

  • Scenario under test
  • Scenario executer
  • Test model, representing the state of the application
  • Test code itself, asserting the result of a specific scenario on the test model

The problem is that a single scenario in the application may very well have multiple things that we want to actually test. Let us take the example of authenticating a user: there are several things that happen during the process of authentication, such as the actual authentication, updating the last login date, resetting bad login attempts, updating usage statistics, etc.

I am going to write the code to test all of those scenarios first, and then discuss the roles of each item in the list. I think it will be clearer to discuss it when you have the code in front of you.

We will start with the scenarios:

public class LoginSuccessfully : IScenario
{
    public void Execute(ScenarioContext context)
    {
        context.Login("my-user", "swordfish is a bad password");
    }
}

public class TryLoginWithBadPasswordTwice : IScenario
{
    public void Execute(ScenarioContext context)
    {
        context.Login("my-user", "bad pass");
        context.Login("my-user", "bad pass");
    }
}

public class TryLoginWithBadPasswordTwiceThenTryWithRealPassword : IScenario
{
    public void Execute(ScenarioContext context)
    {
        context.Login("my-user", "bad pass");
        context.Login("my-user", "bad pass");
        context.Login("my-user", "swordfish is a bad password");
    }
}

And a few tests that would show the common usage:

public class AuthenticationTests : ScenarioTests
{
    [Fact]
    public void WillUpdateLoginDateOnSuccessfulLogin()
    {
        ExecuteScenario<LoginSuccessfully>();

        Assert.Equal(CurrentTime, model.CurrentUser.LastLogin);
    }

    [Fact]
    public void WillNotUpdateLoginDateOnFailedLogin()
    {
        ExecuteScenario<TryLoginWithBadPasswordTwice>();

        Assert.NotEqual(CurrentTime, model.CurrentUser.LastLogin);
    }

    [Fact]
    public void WillUpdateBadLoginCountOnFailedLogin()
    {
        ExecuteScenario<TryLoginWithBadPasswordTwice>();

        Assert.Equal(2, model.CurrentUser.BadLoginCount);
    }

    [Fact]
    public void CanSuccessfullyLoginAfterTwoFailedAttempts()
    {
        ExecuteScenario<TryLoginWithBadPasswordTwiceThenTryWithRealPassword>();

        Assert.True(model.CurrentUser.IsAuthenticated);
    }
}

As you can see, each of the tests is pretty short and to the point, and there is a clear distinction between the scenario being executed and what we are asserting about it.

Each scenario represents some action in the system whose behavior we want to verify. Those are usually written with the help of a scenario context (or something of the like) which gives the scenario access to the application services required to perform its work. An alternative to the scenario context is to use a container in the tests and supply the application service implementations from there.

The executer (the ExecuteScenario<TScenario>() method) is responsible for setting up the environment for the scenario, executing the scenario, and cleaning up afterward. It is also responsible for any updates necessary to get the test model up to date.

The test model represents the state of the application after the scenario was executed. It is meant for the tests to be able to assert against. In many cases, you can use the actual model from the application, but there are cases where you would want to augment that with test specific items, to allow easier testing.

And the tests, well, the tests simply execute a scenario and assert on the result.

By abstracting the execution of a scenario into the executer (which rarely changes) and providing an easy way of building scenarios, you can get very rapid feedback into the test cycle while maintaining testing at a high level.
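To make the division concrete, here is a minimal sketch of what the executer and test model could look like. This is not the actual NH Prof test infrastructure; the ScenarioContext here is a deliberately dumb stand-in that fakes the login behavior inline, where the real one would wire in the actual application services:

using System;

public interface IScenario
{
    void Execute(ScenarioContext context);
}

// Deliberately simplified test model, just enough for the asserts above.
public class TestModel
{
    public UserModel CurrentUser { get; set; }
}

public class UserModel
{
    public DateTime LastLogin { get; set; }
    public int BadLoginCount { get; set; }
    public bool IsAuthenticated { get; set; }
}

// Stand-in scenario context; a real one would expose the application services
// instead of faking the authentication logic here.
public class ScenarioContext
{
    private readonly DateTime currentTime;

    public TestModel Model { get; private set; }

    public ScenarioContext(DateTime currentTime)
    {
        this.currentTime = currentTime;
        Model = new TestModel { CurrentUser = new UserModel() };
    }

    public void Login(string user, string password)
    {
        if (password == "swordfish is a bad password")
        {
            Model.CurrentUser.IsAuthenticated = true;
            Model.CurrentUser.LastLogin = currentTime;
            Model.CurrentUser.BadLoginCount = 0;
        }
        else
        {
            Model.CurrentUser.IsAuthenticated = false;
            Model.CurrentUser.BadLoginCount++;
        }
    }
}

public abstract class ScenarioTests
{
    protected TestModel model;
    protected DateTime CurrentTime;

    // Sets up the environment, executes the scenario, and exposes the resulting
    // test model so the test methods contain nothing but asserts.
    protected void ExecuteScenario<TScenario>() where TScenario : IScenario, new()
    {
        CurrentTime = DateTime.Now;
        var context = new ScenarioContext(CurrentTime);
        new TScenario().Execute(context);
        model = context.Model;
    }
}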

Also, relating to my previous post, note what we are testing here isn’t a single class. We are testing the system behavior in a given scenario. Note also that we usually want to assert on various aspects of a single scenario as well (such as in the WillNotUpdateLoginDateOnFailedLogin and WillUpdateBadLoginCountOnFailedLogin tests).

Even tests have got to justify themselves

Let us get a few things out of the way:

  • I am not using TDD.
  • I am not using BDD.
  • I am not using Test After.
  • I am not ignoring testing.

I considered not posting this post, because of the likely response, but it is something that I think is worth at least discussing. The event that made me decide to post this is the following bug:

public bool IsValid
{
    get { return string.IsNullOrEmpty(Url); }
}

As you can probably guess, I have an inverted conditional here. The real logic is that the filter is valid if the Url is not empty, not the other way around.
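The corrected version is simply the negation:

public bool IsValid
{
    get { return !string.IsNullOrEmpty(Url); }
}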

When I found the bug, I briefly considered writing a test for it, but it struck me as a bad decision. This is code that I don’t see any value in testing. It is too stupid to test, because I won’t have any ROI from the tests.  And yes, I am saying that after seeing that the first time I wrote the code it had a bug.

The idea behind TDD is to use the tests to drive the design. Well, in this case, I don’t have any design to drive. In recent years, I have moved away from the tenets of TDD toward a more system oriented testing system.

I don’t care about testing a specific class, I want to test the entire system as a whole. I may switch some parts of the infrastructure (for example, change the DB to an in-memory one) for perf’s sake, but I usually try to test an entire component at a time.

My components may be as small as a single class or as big as the entire NH Prof sans the actual UI pieces.  I have posted in the past, showing how I implement features for NH Prof, including the full source code for the relevant sections. Please visit the link, it will probably make more sense to you afterward. It is usually faster, easier and more convenient to write a system test than to try to figure out how to write a unit test for the code.

Now, let us look at why people are writing tests:

  • Higher quality code
  • Safe from regressions
  • Drive design

Well, as I said, I really like tests, but my method of designing software is no longer tied to a particular class. I have the design of the class handed to me by a higher authority (the concept), so that is out. Regressions are handled quite nicely using the tests that I do write.

What about the parts when I am doing design, when I am working on a new concept?

Well, there are two problems here:

  • I usually try several things before I settle down on a final design. During this bit of churn, it is going to take longer to do things with tests.
  • After I have a design finalized, it is still easier to write a system level test than write unit tests for the particular implementation.

As a matter of fact, in many cases, I don’t really care about the implementation details of a feature, I just want to know that the feature works. As a good example, let us take a look at this test:

public class CanGetDurationOfQueries : IntegrationTestBase
{
    [Fact]
    public void QueriesSpecifyTheirDuration()
    {
        ExecuteScenarioInDifferentAppDomain<SelectBlogByIdUsingCriteria>();

        var first = model.RecentStatements
            .ExcludeTransactions()
            .First();

        Assert.NotNull(first.DurationViewModel.Inner.Value);
    }
}

NH Prof went through three different ways of measuring the duration of a query. The test didn’t need to change. I have a lot of tests that work in the same manner. Specifying the final intent, rather than specifying each individual step.

There are some parts in which I would use Test First, usually parts that I have a high degree of uncertainty about. The “show rows from query” feature in NH Prof was developed using Test First, because I had absolutely no idea how to approach it.

But most of the time, I have a pretty good idea where I am and where I am going, and writing unit tests for every miniscule change is (for lack of a better phrase) hurting my style.

Just about any feature in NH Prof is covered in tests, and we are confident enough in our test coverage to release on every single commit.

But I think that even a test has got to justify its existence, and in many cases, I see people writing tests that have no real meaning. They duplicate the logic in a single class or method. But that isn’t what I usually care about. I don’t care about what a method or a class does.

I care about what the overall behavior is. And I shaped my tests to allow me to assert just that. I’ll admit that NH Prof is somewhat of a special case, since you have a more or less central location from which you can navigate to everything else. In most systems, you don’t have something like that.

But the same principle remains: if you set up your test environment so you are testing the system, it is going to be much easier to test the system. It isn’t a circular argument. Let us take a simple example of an online shop and wanting to test the “email on order confirmed” feature.

One way of doing this would be to write a test saying that when the OrderConfirmed message arrives, a SendEmail message is sent. And another to verify that the SendEmail message actually sends an email.

I would rather write something like this, however:

[Fact]
public void WillSendEmailOnOrderConfirmation()
{
    // setup the system using an in memory bus
    // load all endpoints and activate them
    // execute the given scenario
    ExecuteScenario<BuyProductUsingValidCreditCard>();

    var confirmation = model.EmailSender.EmailsToSend
        .FirstOrDefault(x => x.Subject.Contains("Order Confirmation"));
    Assert.NotNull(confirmation);
}

I don’t care about implementation, I just care about what I want to assert.

But I think that I am getting side tracked to another subject, so I’ll stop here and post about separating asserts from their scenarios at another time.

Distributed Source Control

I had a short discussion with Steve Bohlen about distributed source control, and how it differs from centralized source control. After using Git for a while, I can tell you that there are several things that I am not really willing to give up.

  • Fast commits
  • Local history
  • Easy merging

To be sure, a centralized SCM will have commits, history and merging. But something like Git takes it to a whole new level. Looking at how it changed my workflow is startling. There is no delay to committing, so I can commit every minute or so. I could do it with SVN, but it would take 30 seconds to a minute to execute, blocking my work, so I use bigger commits with SVN.

Having local history means that I can deal with a lot of small commits, because diffing a file from two commits ago is as fast as diffing the local copy. I tend to browse around in the history quite a lot, especially when I am doing stuff like code reviews, or trying to look at how I did something three weeks ago.

Merging is another thing that DVCS excels at. Not so much because of better merge algorithms (although that is a part of it), but simply because having all the information locally makes the merge process so much faster.

All in all, it ends up being a much easier process to work with. It takes time to get used to it, though.

And given those requirements, Fast commits, Local history, Easy merging, you pretty much end up with a distributed solution. Even with a DVCS, you still have the master repository, but just the fact that you have full local history frees you from having to manage all of that.

Pair design sessions

It isn’t just pair programming that is really useful. I had a problem that I found horrendously complicated to resolve, and I got on the phone with the rest of the team, trying to explain to them what I wanted and how I wanted to achieve that.

Luckily for me, they were too polite to tell me that I was being stupid and that I should stop whining. Instead, they were able to guide me toward an elegant solution in about fifteen minutes, until at some point I had to say: “I really don’t understand why I thought this was hard.”

Getting feedback is important, be it on code or design.

ayende.com move process completed

The server is now hosted at GoGrid, it took longer than I anticipated because I also moved it to EC2 to test that (post about this is already in the queue, and will show up in about 2 weeks).

Commenting is now enabled, and it all should just work. Please let me know if something is broken.

I wrote it, and it was horrible

I needed to handle some task recently, so I sat down and did just that. It took me most of a day to get things working.

The problem was that it was a horrible implementation, and it was what you might call fragile:

image

I don’t usually have the patience to tolerate horrible code, so after I was done, I just reverted everything, without bothering to stash my work somewhere where I could retrieve it later. That sort of code is best kept lost.

Time lost: ~12 hours.

I talked with other team members about how to resolve the problem and they made me realize that there isn’t a great deal of difficulty in implementing that and that I am just being an idiot, as usual. With that insight, I spent maybe two hours in rebuilding the same functionality in a much more robust manner.

I could also reuse all my understanding on how things should behave, now that I knew all the places that needed to be touched.

Overall, it took me about 14 hours (spread over three days) to implement the feature. Scrapping everything and starting from scratch really paid off: I invested about 15% of the original development time, and I got a robust, working solution.

Trying to fix the previous implementation would have taken me significantly longer, and would have resulted in a fragile bit of code that would likely need to be touched often.

Areas of pain vs. frequency of change

A few weeks ago I had to touch a part of NH Prof that is awkward to handle. It isn’t bad, it just isn’t as smooth to work with as the rest of NH Prof.

I had the choice of spending more time there, making this easier to work with, or just dealing with the pain and making my change. Before touching anything, I looked at the commit log. The last commit that touched this piece of NH Prof was a long time ago.

That gave validity to my decision to just deal with the pain of changing it, because it wouldn’t be worthwhile to spend more time on this part of the application. Noticing areas of pain and fixing them is important, but I am willing to accept areas of pain in places that I need to touch twice yearly.

ayende.com is moving servers – some interruption may result

Well, the blog has grown a bit too large for my current host, and I decided that I need to move it elsewhere.

In order to make the move easier, I am disabling commenting site-wide. I’ll try to make this as fast as possible.


A thread static instance?

Without running this code, what would you expect this to do?

public class Strange
{
    [ThreadStatic]
    public /* static */ int Value;
}



var s1 = new Strange { Value = 1 };
var s2 = new Strange { Value = 2 };

Console.WriteLine(s1.Value);
Console.WriteLine(s2.Value);

ThreadPool.QueueUserWorkItem(state =>
{
    Console.WriteLine(s1.Value);
    Console.WriteLine(s2.Value);
});

Can you guess?

Microsoft Courier

There are some things that I just don’t get. The new MS Courier is one such thing. Check the link, this is basically a book size iPhone/tablet.

Go check out the video, you’ll notice something that will kill this device.

It uses a pen, to write!

Leaving aside the fact that no OCR program has yet been able to figure out what I am writing (including the one in my brain), using a pen is annoying.

I write about three to four times as fast using a keyboard as using a pen (and I can use both hands). And I don’t write, using archaic pen & paper, much at all. That affects my writing’s readability when I am using a pen, of course, leading to a feedback cycle.

This pretty much turns me off of the device completely.

Things you never want to hear from a support person dealing with a production server crash

I got all of the following during a recent support call to deal with a hard server crash:

  • I have never seen something like that before.
  • Our internal tools crash when they connect to your servers.
  • We are probably not going to recover this server.
  • You do have a backup, right?

Thankfully, it turned out to be something stupid that was fixed very quickly. But it sure was scary.


NH Prof stress testing

Just ran NH Prof through a stress test to see how well it handles scenarios where you just throw data at it.

I ran into a few interesting bugs (mostly related to threading), which were luckily simple enough to handle.

The end result was:

image

Which is pretty impressive, I think. Just to give you an idea, we used to test NH Prof against the NHibernate integration test suite, which isn’t even half of what I threw at it now.

There is a small problem, however: the profiled application is capable of outputting data faster than NH Prof can process it. That means that unless there are some rest periods, NH Prof has somewhat of a problem here, because it will never catch up.

I’ll post some more about how we can solve this.

How to lead a convoy to safety

image

I recently ran into a convoy situation in NH Prof. Under sustained heavy load (not a realistic scenario for NH Prof), something very annoying would happen.

Messages would stream in from the profiled application faster than NH Prof could process them.

The term that I use for this is Convoy. It is generally bad news. With NH Prof specifically, it meant that it would consume larger and larger amounts of memory, as messages waiting to be processed queued up faster than NH Prof could handle them.

NH Prof uses the following abstraction to handle queuing:

public interface IQueue<T>
{
    void Enqueue(T o);
    T Dequeue();
    bool IsEmpty { get; }
}

Now, there are a few things that we can do to avoid having a convoy. The simplest solution is to put some threshold on the queue and just start dropping messages if we reach it. NH Prof is actually designed to handle such things as an interrupted message stream, but I don’t think that this would be a nice thing to do.
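To make the threshold option concrete anyway, a bounded queue that starts dropping once it reaches its limit could look something like the following sketch; it is not the actual NH Prof queue, just an illustration against the IQueue<T> abstraction above:

using System.Collections.Generic;

// A sketch of the threshold-and-drop idea, not the actual NH Prof implementation.
// Once the queue holds maxSize items, new messages are simply dropped, which the
// profiler would then see as an interrupted message stream.
public class BoundedDroppingQueue<T> : IQueue<T>
{
    private readonly Queue<T> inner = new Queue<T>();
    private readonly object locker = new object();
    private readonly int maxSize;

    public BoundedDroppingQueue(int maxSize)
    {
        this.maxSize = maxSize;
    }

    public void Enqueue(T o)
    {
        lock (locker)
        {
            if (inner.Count >= maxSize)
                return; // drop the message instead of letting memory grow unbounded
            inner.Enqueue(o);
        }
    }

    public T Dequeue()
    {
        lock (locker)
        {
            return inner.Dequeue();
        }
    }

    public bool IsEmpty
    {
        get
        {
            lock (locker)
            {
                return inner.Count == 0;
            }
        }
    }
}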

Another alternative would be to write everything to disk, so we don’t have memory pressure and can handle much larger queue sizes. The problem is, of course, that this requires something very subtle. T now must be serializable, and not just T, but everything that T references.

Oh, Joy!

This is one of the cases where just providing the abstraction is not going to be enough, providing an alternative implementation means having to touch a lot of other code as well.

Open Source development model

As someone who does a lot of Open Source stuff, I find myself in an interesting position in the CodePlex Foundation mailing list. I am the one who keeps talking about letting things die on the vine if they aren’t successful on their own.

I am going to try to put a lot of discussion into a single (hopefully) coherent post. Most of the points that I am going to bring up are from the point of view of an OSS project that got traction already (has multiple committers, community, outside contribution).

One of the oft repeated themes of the conversation in the CPF mailing list is that the aim is to encourage OSS adoption and contributions to OSS in businesses and corporations.

That sounds nice, but I don’t really get why.

From the business side: if a business doesn't want to use OSS, then it is at a competitive disadvantage compared to its competitors that do make use of it, since OSS projects tend to make great infrastructure and provide a high quality base to work from. If you choose to develop things in house, it is going to cost you a lot, and you are likely going to end up with an inferior quality solution.

This is not to disparage someone’s effort, but an OSS project that has traction behind it is likely to have a lot more eyes & attention on it than a one off solution. The Java side has demonstrated that quite clearly.

Even in the .Net world, I can tell you that I am aware of Fortune 50 companies making use of things like NHibernate or Castle. They can most certainly fund building a project of similar size, but it doesn’t make economic sense to do so.

From the project side, if you have enough traction, you don't generally worry about OSS-fearing businesses. It is their loss, not the project's.

It would be more accurate to say that the project won't feel any pain if a business decides not to use it. Remember that unlike commercial software, OSS projects don't really have an incentive to "sell" more & more.

There is the incentive to grow bigger (for ego reasons, if nothing else), get more people involved, add more features, etc. But unless there is some business model behind it (and in the .NET world, there are very few projects with a business model behind them), growing the project usually means problems for the project team.

As a simple example, Rhino Mocks mailing list has an average of 140 messages per month. I had to scale down my own involvement in the mailing list because it took too much of my time. The NHibernate Users mailing list is crazy, averaging in a thousand messages a month this year alone.

That is even assuming that I want traction for a project, which isn’t always the case. As a good example, I have a lot of stuff that I put out as one-use only solutions. Rhino Igloo is a good example of that, a WebForms MVC framework that we needed for a single project. I built it as OSS, we get contributions for it once in a while. But if it gets to be *very* active, I am going to find myself in a problem, because I don't really want to maintain it anymore.

But in general, for most projects I do want to have more contributors. In the CPF mailing list the issue of getting contributions from companies was brought up as problematic. I disagree, since I don't find that the problems that were brought up (getting corporate and legal sign-off for contributing work, getting people to adopt OSS for commercial uses) have any relevance whatsoever to getting more contributors. By far, most contributions that we get for the projects I am involved in are from people making commercial use of them.

But usually, I don’t really care about adoption.  I have 15 - 20 OSS projects that I have either founded or am a member of, in exactly one of them I cared about adoption (Rhino Mocks), and that was 5 years ago, mainly because I thought it would give me some credentials when I was looking for a job (and it did).

For all the rest, I am working on those because I need them to solve a problem. I get the benefit that other people are going to look at them and contribute if they feel like it, but mostly, I am working on OSS to solve a problem, the number of users in a project isn't something that I really care about.

There were three scenarios that were discussed in the mailing list that I want to address in particular.

A company would like to pay you 5 times your normal rates, but they have a “no OSS” policy, thus losing the contract.

I have to say that this scenario never happened to me.  Oh, I had to talk with the business a lot of times. It is easy to show them why OSS is the safer choice.

Today, that is fairly easy. I can point out stats like this: http://www.ohloh.net/p/nhibernate and show that trying to build something like NH is going to cost you on the order of 130 years and ~15 million dollars. I can tell them that going with MS data access methods is a good way to throw good money at upgrading their data access methodology every two years. I can point them to a whole host of people making good use of it.

I got lots of arguments to use. And they tend to work, quite well, in fact. I may need to talk to the lawyers, but that has generally been a pretty straightforward deal.  So no, I don't lose clients because of a no-OSS rule.

Besides, you know what, if they are willing to pay me 5 times my normal rate, I am going to be very explicit about making my preferences known and explaining the benefits. Afterward, they are the client; if they want to pay me gobs of money, I am not going to complain, even if I am going to use NIH as the root namespace.

Corporate developers have a problem getting permission to use OSS projects in their product or project.

I have seen it happen a few times in the past, but it is growing rarer now. The main problem was never legal, it was the .NET culture more than anything else. The acceptance of OSS as a viable source of software had more to do with team leads and architects accepting that than any legal team putting hurdles in the path.

Oh, you still need to talk to legal before, but you are going to do that when bringing a 3rd party component anyway. (You do make sure to run any commercial legal agreements through the legal department, right? You need to know that there aren’t hooks involved there).

Corporate developers have a problem getting permission to contribute to OSS projects.

Once OSS is adopted, I have never run into an issue where legal stopped the contribution of a patch. There are damn good reasons for the business to want this, after all.  To that manager, I am going to say: "look, we can maintain it ourselves, or we can give it to the project and they maintain/fix/debug/improve it; we get great credit and we gain a lot for work we would have done anyway."

A few final thoughts, OSS projects are a dime a dozen. In the .Net space alone there are thousands. Most of them, I have to say, aren’t really that interesting. Out of those thousands of projects, there are going to be a few that would get traction, attract additional committers, outside contributions and a community.

I think it would be safe to say that there are around fifty such projects in the .Net space. There is nothing particularly noble or important in an OSS project that requires special treatment. If it gets enough attention, it will live on. If it doesn’t, who cares (except maybe the author)?

The CodePlex Foundation, however it may end up as, is going to be dealing with the top fifty or so projects, simply because trying to reach the long tail of OSS projects is a futile task. I mentioned what I think would be good ways of helping the top projects (resources, coaching, money).

Another approach would be to turn it around, the CPF can focus on building a viable business model for OSS projects. A healthy OSS project is one that makes money for the people who contribute to it. It may be directly or indirectly, but if it isn’t going to do that, it isn’t going to live long. A labor of love would keep one or two committers working on a project, but it wouldn’t generally sustain a team.

Finally, something that I think seems to get lost in all the noise around it, Open Source projects are about the code. I hear a lot about legal issues, making business adopt OSS, etc. I don’t see discussion about the main thing.

Show me the code!

I knew there were reasons for those multi threading warning lists

One of the warning signs for bad multi threaded code is calling code that you don’t own while holding a lock. I just discovered a bug in NH Prof that is related to just that issue.

But I wasn’t really stupid, and the story of this bug is a pretty interesting one.

Let us look at what happened. I have a deadlock in the application, where two threads are waiting for one another. In the UI thread, I have:

image

And on a background thread, I have:

image

This is actually code that happens in two separate classes. And it sounded very strange, until I walked up the stack and found:

image

All nice and tidy, and it looks good. Except that Sessions.Add invokes SessionsCollectionChanged, which performs a blocking UI call, while the UI thread is trying to enter a lock that is currently being held.
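The screenshots don't reproduce well here, so here is a rough reconstruction of the pattern; this is not the actual NH Prof code, just the shape of the deadlock:

using System;
using System.Collections.ObjectModel;
using System.Collections.Specialized;
using System.Windows.Threading;

// A rough reconstruction of the deadlock pattern, not the actual NH Prof code.
public class SessionsHolder
{
    private readonly object locker = new object();
    private readonly ObservableCollection<SessionModel> sessions = new ObservableCollection<SessionModel>();
    private readonly Dispatcher uiDispatcher;

    public SessionsHolder(Dispatcher uiDispatcher)
    {
        this.uiDispatcher = uiDispatcher;
        sessions.CollectionChanged += SessionsCollectionChanged;
    }

    // Runs on a background thread
    public void AddSession(SessionModel session)
    {
        lock (locker)
        {
            // Calling code we don't own while holding the lock:
            // Add raises CollectionChanged, and the handler blocks on the UI thread.
            sessions.Add(session);
        }
    }

    private void SessionsCollectionChanged(object sender, NotifyCollectionChangedEventArgs e)
    {
        // Synchronous dispatch: does not return until the UI thread runs RefreshView
        uiDispatcher.Invoke(new Action(RefreshView));
    }

    // Meanwhile, the UI thread is doing something like this, so it never gets
    // around to servicing the Invoke above, and both threads wait forever.
    public void ReadSessionsOnUiThread()
    {
        lock (locker)
        {
            // read the sessions for display
        }
    }

    private void RefreshView()
    {
        // update the view from the sessions collection
    }
}

public class SessionModel { }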

Like I said, this is a classic threading issue, but it is pretty hard to see, unless you can actually get the stack trace.

Two strikes, and you are out

I don’t have a lot of patience for repeated bugs. I just got a bug report that turned out to be the result of the same feature as a previous bug I fixed.

I could have fixed the bug. But I didn’t bother. A repeated bug in the same area for the same reason usually indicates a fragile design. In this particular case, the feature wasn’t important, so I just ripped it all out, root & branch.

If it was important, I would have still ripped the whole thing apart, and then I would rebuild it from scratch, using a different design.

Fragile designs are one of the worst enemies that you can have, they will keep dropping things in your lap until you finally fix them once and for all. And I find that usually just starting from scratch on a feature implementation is the best way of doing that.

Optimizing Windsor

Recently we got a bug report about the performance of Windsor when registering a large number of components (thousands). I decided to sit down and investigate this, and found out something that was troublesome.

Internally, registering a component would trigger a check for all registered components that are waiting for a dependency. If you had a lot of components that were waiting for a dependency, registering a new component degenerated into an O(N^2) operation, where N was the number of components with waiting dependencies.

Luckily, there was no real requirement for an O(N^2) operation, and I was able to change that to an O(N) operation.

Huge optimization win, right?

In numbers, we are talking about 9.2 seconds to register 500 components with no matching dependencies. After the optimization, we dropped that to 500 milliseconds. And when we are talking about larger numbers of components, this is still a problem.

After optimization, registering 5,000 components with no matching dependencies took 44.5 seconds. That is better than before (where no one had the patience to try to figure out the number), but I think we can improve on it.

The problem is that we are still paying that O(N) cost for each registration. Now, to support systems that already use Windsor, we can’t really change the way Windsor handles registrations by default, so I came up with the following syntax, which safely changes the way Windsor handles registration:

var kernel = new DefaultKernel();
using (kernel.OptimizeDependencyResolution())
{
    for (int i = 0; i < 500; i++)
    {
        kernel.AddComponent("key" + i, typeof(string), typeof(string));
    }
}

Using this method, registering 5,000 components drops down to 2.5 seconds.

I then spent additional time finding all the other nooks and crannies where optimizations hid, dropping the performance down to 1.4 seconds.

Now, I have to say that this is not a linear performance improvement. Registering 20,000 components will take about 25 seconds. This is not a scenario that we worry overly much about.

The best thing about the non linear curve is that for a 1,000 components, which is what we do care about, registration takes 240 milliseconds. Most applications don’t get to have a thousand components, anyway.

There are also other improvements made in the overall runtime performance of Windsor, but those would be very hard to notice outside of a tight loop.

More NHibernate commercial support options

The NHibernate scene is growing bigger.

I am starting to see NHibernate related jobs crossing my mailbox, NHibernate 2.1 crossed 20,000 downloads a while ago, and the mailing list is very active.

iMeta, who sponsored the development of the AST based parser and is currently sponsoring the Linq for NHibernate effort, is now also offering Commercial Support for NHibernate.

This is a separate offer from mine, and seems to be targeting a different audience.

From my point of view, the more the merrier, it simply shows that the market for NHibernate is growing, fast.

NH Prof: The beta mistake

One of the things that I should have done from the get go of the beta program was to force people to upgrade when their version is too old.

Why would I want to do that? Because people continue to use old versions, and old versions have old bugs.

Take a look at this error report:

image

This version is almost a hundred builds old, and the error reported was fixed long ago. I also get errors from people using older versions than that.

When I built the error reporting feature, I was concerned about user privacy, so I don’t have any information about the user that can be used to identify him. I am reduced to this public plea: Just upgrade NH Prof already!

Find the differences: The optimization that changed behavior

I was thinking about ways of optimizing NHibernate Search’s behavior, and I ran into a bug in my proposed solution. It is an interesting one, so I thought it would make a good post.

Right now NHibernate Search behavior is similar to this:

public IList<T> Search<T>(string query)
{
    var results = new List<T>();
    foreach (var idAndClass in DoLuceneSearch(query))
    {
        var result = (T)session.Get(idAndClass.ClassName, idAndClass.Id);
        if (result != null)
            results.Add(result);
    }
    return results;
}

This isn’t the actual code, but it shows how NHibernate Search works. It also shows the problem that I thought about fixing. The way it is implemented now, NHibernate Search will create a SELECT N+1 query.

Now, the optimization is simply:

public IList<T> Search<T>(string query)
{
    return session
        .CreateCriteria<T>()
        .Add(Restrictions.In("id", DoLuceneSearch(query).Select(x => x.Id).ToArray()))
        .List<T>();
}

There are at least two major differences between the behavior of the two versions, can you find them?

Challenge: NH Prof Exporting Reports

One of the things that I am working on right now is exporting NH Prof reports. This isn’t an earth shattering feature, the idea is to take a report like this:

image

And give you the ability to save it in a format that you can send to your DBA, or maybe just put it away for later perusal.

I started with the simplest thing that could possibly work, exporting the reports to XML. Using Linq to XML has been a true pleasure, and got me this:

<?xml version="1.0" encoding="utf-8"?>
<unique-queries xmlns="http://nhprof.com/reports/2009/09">
  <query average-duration="0.1" count="10" is-cached="false" is-ddl="false">
    <duration database-only="0" nhibernate="0" />
    <short-sql>SELECT ... FROM Comments comments0_ WHERE comments0_.PostId = 1</short-sql>
    <formatted-sql>SELECT comments0_.PostId as PostId1_,
       comments0_.Id as Id1_,
       comments0_.Id as Id5_0_,
       comments0_.Name as Name5_0_,
       comments0_.Email as Email5_0_,
       comments0_.HomePage as HomePage5_0_,
       comments0_.Ip as Ip5_0_,
       comments0_.Text as Text5_0_,
       comments0_.PostId as PostId5_0_
FROM Comments comments0_
WHERE comments0_.PostId = 1 /* @p0 */</formatted-sql>
  </query>
</unique-queries>

That is pretty easy to do, but then I hit a wall. This is good, but it isn’t nearly enough. I can imagine that most of NH Prof users will want access to the raw data in this format, so they can slice & dice it as they see fit, but I believe that I also need to provide a more readable form.

I started to look at how I can generate PDFs from this, and I ran into a problem. I don’t have any experience doing such things, and while I don’t doubt that I could learn it, I don’t think that what is very likely to be a one-off requirement is worth spending the time to really learn the topic.

Therefore, I decided to try something a little bit different. I uploaded four sample reports here. I want to see if someone can provide me with a solution to convert those to a nice PDF format.

I am going to give away one NH Prof license to the first one that can show me a solution that I can use with NH Prof.

The selected solution will also be featured in this blog, of course.

The small print: Just to be clear, some variation on the selected solution is going to end up within NH Prof. I don’t think it is a big task for someone who knows what they are doing, and I believe that the NH Prof license offer is adequate compensation for that.

Append Only Models with NHibernate

I mentioned that using an append only model isn’t just an infrastructure choice, it has a big effect on your API and your general approach to working with your model.

Let us look at a typical example of changing the marital status of an employee:

// transaction is opened before method by the infrastructure
public void Consume(ChangeMaritalStatus msg)
{
    var emp = Session.Get<Employee>(msg.EmployeeId);
    emp.ChangeMaritalStatus(msg.NewMaritalStatus, msg.MaybeNewSurname);
}
/*
transaction is committed here and changes are flushed to the database
by the infrastructure
*/

As you can see, using this approach, NHibernate will issue an update statement. This is a typical model for using NHibernate.

Next, let us look at the same action, using an append only model:

// transaction is opened before method by the infrastructure
public void Consume(ChangeMaritalStatus msg)
{
    var emp = Session.GetLatest<Employee>(msg.EmployeeId);
    var newEmpVersion = emp.ChangeMaritalStatus(msg.NewMaritalStatus, msg.MaybeNewSurname);
    Session.Save(newEmpVersion);
}
/*
transaction is committed here and changes are flushed to the database
by the infrastructure
*/

Notice what is going on in here. We changed two major things: first, we moved from using Get<Employee>(), an NHibernate method, to using GetLatest<Employee>(), which is an extension method. Second, where before we relied on NHibernate’s change tracking to do the deed for us, now we get a new version from the method and we save it explicitly.

If this reminds you of the functional model, that is accurate: in the append only model, data truly may not change; any update is just a copy of the old data plus whatever changes you wish to make.

The GetLatest<TEntity> implementation is going to depend on how you actually manage the data in the database. I would usually recommend something like the following table structure:

CREATE TABLE Employees
(
    Id bigint not null,
    Version int not null,
    MaritalStatus int not null,
    /*
    other columns
    */
    PreviousVersion int null,
    PRIMARY KEY (Id, Version),
    FOREIGN KEY (Id, PreviousVersion) REFERENCES Employees(Id, Version)
)

Some people prefer to use dates for marking the differences between versions, but there are some significant advantages for using a version number. You have no fear of collisions or inaccuracy, which may happen with dates. It is also easier to read something like Employee # 12312/4 rather than Employee # 12312 / 2008-08-01T10:04:49.

You can just turn all queries in the database to something like:

SELECT * FROM Employees this
WHERE Id = 12312 AND Version = (
    SELECT max(Version) FROM Employees version
    WHERE version.Id = this.Id
)

But this has some problems, it forces a scan of at least part of the table (although it will be at least an index scan, if the DBA is worth anything at all) which is a lot of work. We can probably get far better results by using:

CREATE TABLE LatestVersions
(
    Type nvarchar(255) NOT NULL,
    Id bigint NOT NULL,
    Version int NOT NULL,
    PRIMARY KEY (Type, Id)
)

SELECT * FROM Employees
WHERE Id = 12312 and Version = (
    SELECT Version FROM LatestVersions
    WHERE Type = 'Employees' AND LatestVersions.Id = Employees.Id
)

I haven’t run any comparison tests on this, but this is likely to be faster. Regardless, I consider it more elegant.

The GetLatest<TEntity> implementation simply uses HQL or the Criteria API to make NHibernate perform this query.
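For example, a Criteria based implementation using the max(Version) subquery from the first option might look roughly like this; it is a sketch, not the actual implementation, and it assumes Id and Version are mapped properties as in the table above:

using NHibernate;
using NHibernate.Criterion;

// A sketch of GetLatest<TEntity> using the max(Version) subquery approach.
public static class SessionExtensions
{
    public static TEntity GetLatest<TEntity>(this ISession session, long id)
        where TEntity : class
    {
        // Correlated subquery: the highest version for this entity's Id
        var latestVersion = DetachedCriteria.For<TEntity>("latest")
            .SetProjection(Projections.Max("Version"))
            .Add(Restrictions.EqProperty("latest.Id", "current.Id"));

        return session.CreateCriteria<TEntity>("current")
            .Add(Restrictions.Eq("Id", id))
            .Add(Subqueries.PropertyEq("Version", latestVersion))
            .UniqueResult<TEntity>();
    }
}

Keeping it as an extension method means the call site still reads like a normal session call, which is what the Consume method above relies on.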

I usually also install a pre update event listener that just throws when it encounters an update (which is not allowed in the system).
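Such a listener is tiny; a sketch, assuming the standard NHibernate event listener interface, would be:

using System;
using NHibernate.Event;

// A sketch of a pre update listener that vetoes updates entirely, since the
// append only model never updates existing rows.
public class ForbidUpdatesEventListener : IPreUpdateEventListener
{
    public bool OnPreUpdate(PreUpdateEvent @event)
    {
        throw new InvalidOperationException(
            "Updates are not allowed in an append only model; save a new version instead.");
    }
}

Registering it is a matter of adding it to the configuration's pre-update event listeners before building the session factory.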

And yes, beyond these two changes, everything else just flows. You never update, you always get by latest, and that is about it. Well, almost: you also need to make sure that NHibernate creates the primary key properly, by incrementing the version and updating the LatestVersions table. This can be done using an IIdentifierGenerator.

There is a side topic that I am not going to discuss right now relating to how you are going to handle the reporting model. This type of model is great for a lot of reasons, versioning, auditing, etc. What is it not great at is reading data.

The problem is that you generally have a model that doesn’t work that well for reading from the database or for showing on the screen. What usually happens is that there is a separate process that is responsible for taking this data and turning it into a more palatable form for reporting purposes (and remember that I consider the “show me the details” screen of employee #12312 to be reporting as well).

In particular, queries on this type of model are awkward, slow and tend to be annoying in general. I heartily recommend using a separate reporting model.