Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:


+972 52-548-6969

, @ Q c

Posts: 6,128 | Comments: 45,549

filter by tags archive

Your queries are going to fail in production

time to read 2 min | 301 words

Reading Davy Brion’s post There Is No Excuse For Failing Queries In Production had me twitching slightly. The premise is that since you have tests for your queries, they are not going to fail unless some idiot goes to the database and mess things up,and when that is going to happen, you are going to trip a circuit breaker and fail fast. Davy’s post is a response to a comment thread on another post:

Bertrand Le Roy: Would it make sense for the query to implement IDisposable so that you can put it in a using block? I would be a little concerned about the query failing and never getting disposed of.
Davy Brion: well, if a query fails the underlying SqlCommand object will eventually be garbage collected so i wouldn’t really worry about that. Failed queries shouldn’t happen frequently anyway IMO because they typically are the result of incorrect code.

The problem with Davy’s reasoning is that I can think of several very easy ways to get queries to fail in production even if you go through testing and put your DB under lock & key. Probably the easiest problems to envision are timeouts and transaction deadlocks. Imagine the following scenario, your application is running in production, and the load is big enough (and the request pattern is such that) transaction deadlocks begin to happen. Deadlocks in the DB are not errors. Well, not in the usual sense of the word. They are transient and should be dealt separately.

Using Davy’s approach, we would either have the circuit breaker tripping because of a perfectly normal occurrence or we may have a connection leak because we are assuming that all failures are either catastrophic or avoidable.



It might be irrelevant to your post but since you have mentioned deadlocks , I really like to know what's your advise (as a NHibernate professional) to prevent them ? As a matter of fact I've been working on a projects for 4 years and we are using NHibernate. In rare cases we are getting deadlocks .we have tried to trace them in SQL server logs but we haven't had any success to get rid of them.

Ayende Rahien


There aren't really ways of getting rid of deadlocks.

They are what happens when you have two transactions competing for the same resources.

In theory, you can try to restructure your queries so you will always access rows using the same path all the time, thus eliminating deadlocks.

In practice, it is not really possible to do something like that.

You can reduce deadlocks by specifying a lower transaction isolation level, but preventing them isn't likely in real world scenarios.

Luckily, recovering from them is very simple

Steve Degosserie

All true. It also makes me think about the common assumption most developers have about Transaction Commits that generally don't fail. There are also a number of cases where this can happen (e.g. deferrable constraints in Oracle) but is usually not well handled.

Basically, to develop robust software, you must be paranoid :) (re-re-read M. Nygard's excellent Release It !).


Ayende, can you elaborate on recovering aspects? Timeouts ???

Alex Yakunin

Fully agree. Failing query in production is normal. Deadlocks, version conflicts & timeouts are typical cases. Their ignorance = lack of actual understanding of processes @ RDBMS.

Alex Yakunin

Or simply a mistake, of course. I hardly believe Davy isn't aware about this.


Would be nice to just be able to replay the transaction. But SQL server has so many bugs related to its transaction handling, that you never quite know when your transaction is broken.

And in usual style, they pretty much ignore whatever bug reports they get.

Ayende Rahien


Do you have a repro that shows that transaction handling is broken on SQL Server?

Ayende Rahien

Interesting, but while this is a bug, it doesn't relate to what I consider the most important thing about transactions: ACID

You don't have data corruption here, you have a transaction that mysteriously dies (and rollback).

As long as you can trust ACID, you can replay transactions


Yes and No... Since the transaction is stopped without your knowledge and no error handling is done, then all subsequent writes are still written.

Meaning that you cannot just assume that none of your work was completed, and can thus not repeat the transaction.

btw. The "Remember me?" keeps forgetting to remember itself, thus forgetting me the second time a post is made.

Ayende Rahien

The remember me shouldn't be on a second time, you are already remember. IOW, it shouldn't forget you.

And I fully agree about the problematic nature of silent aborts.

Have you tried contacting PSS?


No. They are a waste of time also...

Last time they spent weeks finding out why our failover setup was failing. Just to have us find out that they dont really support failover unless it is in a domain setup.

So in usual style, I implemented a workaround, which in this particular case actually ended up much cleaner.


I did remember me... Not quite sure why I am forgotten every second time I have something useful to say. Perhaps the timeout is too low?

Just checked, yes indeed just 1 month timeout on the cookie.

Paul Robson

Ayende, when you mention "replaying the transaction" above, do you mean throw away the session and restart the whole unit of work from the beginning?

Ayende Rahien


Yes, you have to. You can't just execute the same statements, there is logic that may be affected


The most common case of failing queries in production is IMHO a database update with feature change. This happens often especially with MySQL (even for minor version changes), but we've seen it with other DBMS too, e.g. PostgreSQL.

Mike Brown

Something was striking me as odd about his assertions there, and even if you have the most stringent of testing, some environments (e.g. FDA regulated) still require you to place checks in your code to handle scenarios that "are never supposed to happen"

Comment preview

Comments have been closed on this topic.


  1. The worker pattern - 2 days from now

There are posts all the way to May 30, 2016


  1. The design of RavenDB 4.0 (14):
    26 May 2016 - The client side
  2. RavenDB 3.5 whirl wind tour (14):
    25 May 2016 - Got anything to declare, ya smuggler?
  3. Tasks for the new comer (2):
    15 Apr 2016 - Quartz.NET with RavenDB
  4. Code through the looking glass (5):
    18 Mar 2016 - And a linear search to rule them
  5. Find the bug (8):
    29 Feb 2016 - When you can't rely on your own identity
View all series


Main feed Feed Stats
Comments feed   Comments Feed Stats