Ayende @ Rahien

It's a girl

Working around SQL Server bugs

Okay, I mentioned that I am working on fairly complex piece of query code, something that is completely off the charts, even for me. As such, I have completely test driven that. How do I Unit Test a complex query? By using in memory database (in this case, SQLite). NHibernate makes it very easy to switch databases, and I never gave it much thought after making it work for the first time.

Until today, where it was time to do an additional tweak to my query. It worked perfectly on the tests, running against SQLite, but failed when running the application against SQL Server. Here is the query that demonstrate the problem:

select address from Address address, City city left join fetch address.Street
             where city.Name = address.City

The generated query looked like:

select [fields list] from Addresses addresses0_, Cities cities1_ left outer join Streets street2_ on addresses0_.Street = street2_.Id  
            where...

This fails which unknown "addresses0_.Street" error, I run into the problem before, and I looks like a bug in SQL Server, or a nasty (appernatly undocumented) surprise.

This was my critical query, I had to run it. In desperation, I opened NHibernate source, and started debugging into it, in order to see how the query is being composed. Haven't looked very closlely at this area in a while, and things have changed somewhat, but I had that sinking feeling that it was one of those really big changes that I am really not ready to make close to the time that I want to go home.

Then I did something that I was sure would fail, I changed the query to:

select address from Address address left join fetch address.Street , City city
             where city.Name = address.City

It worked, produced SQL that would make SQL Server happy, SQLite was able to process this SQL without any issues either. Problem solved, and will hopefully remain in memory for the next time that I need it.

Oh, and if you didn't notice, today was a jumpy day. I am going to do some deep end stuff to relax.

Comments

Marvin
05/29/2007 08:51 PM by
Marvin

It's good that you bring this up.

Unfortunately I can't use an in-memory DB for unit tests at this time because we have old software that is wired to a crappy SQLBase database.

We would like to migrate to SQL Server, which is one reason why we need more unit tests.

One algorithm I'd like to test goes like this (it's actually something else, but you'll get the idea):

A customer has a birthday (DateTime) and an age (int).

We want to update the customer's age.

So the algorithm goes like this:

  • Load a given customer.

  • If he doesn't have a birthday, send a message to the boss (create an entry in the MESSAGES table), telling him that an employee has forgotten to ask the customer for his birthday ;-). Abort the operation.

  • Calculate the new age.

  • If the new age differs from the old age, send a message to the developer (again, an entry in the MESSAGES table), telling him his code is crap ;-).

  • Finally, if his birthday was within last week, send him a gift, based upon his last order (evaluate last order, create new order).

As you can see, there are quite a lot of tables involved.

At least:

CUSTOMERS

EMPLOYEES

MESSAGES

ORDERS

PRODUCTS

ADDRESSES

Futhermore, there are lots of methods involved (if StoreCustomer() fails, the whole operation will fail), which all will have to be tested.

How do you do that?

I could of course use an empty database and insert all objects I will need (including his last order) and delete everything again once I'm done.

But as there are many columns etc., I'd prefer a different approach.

Like configuring the database using the GUI of our application and then using that database for all unit tests, restoring it after each test.

But how can you tell the unit test which customer to use in an elegant way? I can configure the customer ID in the application settings, but I still don't think that's the best approach.

And (the actual reason why we've decided to add MANY more unit tests):

We recently discovered a horrible bug in NullSafeGet in one of our custom NHibernate datatypes. NullSafeSet on the other hand worked correctly, so data was destroyed.

I then wrote a unit test that loaded, stored and loaded data and compared the values. But even then you can never be sure you've screwed something up (columns in the wrong order in the mapping file etc.).

How would you write a unit test for such a case?

Luke Breuer
05/29/2007 09:03 PM by
Luke Breuer

Don't you prefer to perform all joins with join clauses using the "on" keyword? Or is this not an option with NHibernate? Would you mind posting the generated TSQL from your working solution? I should think the where clause would cancel out the effect of the "left" in the join, but I'm not sure of the rules for odd constructs like you've posted.

Ayende Rahien
05/29/2007 09:05 PM by
Ayende Rahien

I would probably start by testing without a DB at all, testing just the interaction described above, using mocks.

But, if you have NHibernate running than it should be very easy to get it running on an in memory db.

Ayende Rahien
05/29/2007 09:20 PM by
Ayende Rahien

Would you mind posting the generated TSQL from your working solution?

That is three pages worth, I would rather not :-)

Roy Osherove
05/30/2007 02:29 AM by
Roy Osherove

That's one of the reasons why I don't support the notion of writing "unit tests" against a database, or the idea of mocking or replacing a DB for tests. You end up not testing the real DB, and you just might (like in your case) ed up with a different feature set just so it can live up bth to the "fake" and to the "real" database, a feature that was not needed.

Ayende Rahien
05/30/2007 04:06 AM by
Ayende Rahien

Roy,

This is literally the first time that I had run into this issue, and the solution has turned out to be moving the location of a join, so I wouldn't consider this a major concession to living on several databases,

I think that querying is a business concern, and as such, requires testing:

http://www.ayende.com/Blog/archive/2007/03/09/Querying-is-a-business-concern.aspx

Ayende Rahien
05/30/2007 04:40 AM by
Ayende Rahien

Costly,

Writing those unit tests (~30) involved many changes to the code, involving many changes to the schema, and evolved very quickly.

Using in memory I just return the tests, using SQL Server from the start, I would have to re-gen the schema from scratch, which takes time.

Certainly doable (that is how the NH tests are working, after all) just much more costly.

Ayende Rahien
05/30/2007 10:04 AM by
Ayende Rahien

@Jericho,

I have integration tests to catch those type of differences, but I have a lot of tests that does various sorts of queries, and trying to do it against remote server would be really expensive.

Mr_Database
05/30/2007 12:16 PM by
Mr_Database

Just write TSQL if that is what the customer is running. Your stuff will always run and those left behind after you are long gone won't have to learn yet ANOTHER tool just to maintain your code.

Tobin Harris
05/30/2007 12:29 PM by
Tobin Harris

I'm currently testing against SQL2005 db rather than an in-memory db.

I'd agree with Ayende that this approach is too costly - the whole tear-down/set up of dropping/creating the db is becoming annoying to the point where I don't want to run all my tests, which is a BAD THING.

I reckon migrating my tests to SQLite will be very easy, just need to put 10 mins aside to work out how to install the thing and tweak the config for my testing assembly.

Ayende Rahien
05/30/2007 01:13 PM by
Ayende Rahien

@Mr_Database,

TSQL is a highly inefficent language to work with for a lot of scenarios. Building queries happens to be one of those.

Being able to construct queries on the fly, based on business conditions and according to my domain is extremely powerful an enable me to do a lot of stuff that is really not trivial to do otherwise.

Eber Irigoyen
05/31/2007 07:47 PM by
Eber Irigoyen

isn't the first generated query incorrect though?

Ayende Rahien
05/31/2007 08:25 PM by
Ayende Rahien

Can you define what you mean?

It is correct in that it is valid SQL. It is not correct in that SQL Server fails to run it.

Eber Irigoyen
05/31/2007 09:22 PM by
Eber Irigoyen

you may have mistyped stuff, but

your query:

from Address

generated query:

from Addresses

that would be unknown...

Ayende Rahien
05/31/2007 09:28 PM by
Ayende Rahien

No, that is translation from HQL (object based) to SQL (table based)

Ayende Rahien
06/02/2007 01:31 PM by
Ayende Rahien

@Bob,

Just use "data source=:memory:" in the connection string.

Then you need to generate the schema, for that you will need to pass the IDbConnection to the SchemaExport, I added an Execute() method that does that.

Comments have been closed on this topic.