Who stole my transaction?
I just ran into an extremely strange bug with the System.Transactions API. It appears that under certain circumstances, you can exit the transaction scope before it has finished committing. Here is the code to reproduce this:
public class EnlistmentTracking : IEnlistmentNotification
{
    public static int EnlistmentCounts;

    public EnlistmentTracking()
    {
        Interlocked.Increment(ref EnlistmentCounts);
    }

    public void Prepare(PreparingEnlistment preparingEnlistment)
    {
        preparingEnlistment.Prepared();
    }

    public void Commit(Enlistment enlistment)
    {
        Interlocked.Decrement(ref EnlistmentCounts);
        enlistment.Done();
    }

    public void Rollback(Enlistment enlistment)
    {
        Interlocked.Decrement(ref EnlistmentCounts);
        enlistment.Done();
    }

    public void InDoubt(Enlistment enlistment)
    {
        Interlocked.Decrement(ref EnlistmentCounts);
        enlistment.Done();
    }
}
This class simply tracks the number of instances that it has. It does no blocking and operates entirely in memory.
Here is the code to show the problem:
var newGuid = Guid.NewGuid();
for (int i = 0; i < 100; i++)
{
    using (var tx = new TransactionScope())
    {
        Transaction.Current.EnlistDurable(newGuid, new EnlistmentTracking(), EnlistmentOptions.None);
        Transaction.Current.EnlistDurable(newGuid, new EnlistmentTracking(), EnlistmentOptions.None);
        tx.Complete();
    }
    Console.WriteLine(Thread.VolatileRead(ref EnlistmentTracking.EnlistmentCounts));
}
This just runs in a loop, creating two instances of the enlistment (forcing it to be a distributed transaction), and commits the transaction. After the transaction is completed, we read how many enlistments are still alive. Surprisingly, I keep getting non-zero values here.
The really freaky part is that if I put a small wait there, I get a zero value back, which is what I would expect. This is on .NET 4.0, by the way.
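Instead of a fixed wait, one option is to have each enlistment signal when its outcome callback has run, and to block on that signal after the scope has been disposed. The following is only a minimal sketch of that idea, not something the System.Transactions API provides: SignalingEnlistment, CommitWaitDemo and RunOnce are hypothetical names made up for illustration, and it assumes every enlistment in the transaction is under your control.

```csharp
using System;
using System.Threading;
using System.Transactions;

// Hypothetical enlistment that signals once its outcome callback has run.
public class SignalingEnlistment : IEnlistmentNotification
{
    private readonly CountdownEvent pending;

    public SignalingEnlistment(CountdownEvent pending)
    {
        this.pending = pending;
    }

    public void Prepare(PreparingEnlistment preparingEnlistment)
    {
        preparingEnlistment.Prepared();
    }

    // Exactly one of these three is expected per enlistment.
    public void Commit(Enlistment enlistment) { Finish(enlistment); }
    public void Rollback(Enlistment enlistment) { Finish(enlistment); }
    public void InDoubt(Enlistment enlistment) { Finish(enlistment); }

    private void Finish(Enlistment enlistment)
    {
        enlistment.Done();
        pending.Signal();
    }
}

public static class CommitWaitDemo
{
    // Returns true if both outcome callbacks ran within the timeout.
    public static bool RunOnce(Guid resourceManagerId, TimeSpan timeout)
    {
        // Deliberately not disposed: a late callback may still call Signal().
        var pending = new CountdownEvent(2);
        using (var tx = new TransactionScope())
        {
            Transaction.Current.EnlistDurable(resourceManagerId, new SignalingEnlistment(pending), EnlistmentOptions.None);
            Transaction.Current.EnlistDurable(resourceManagerId, new SignalingEnlistment(pending), EnlistmentOptions.None);
            tx.Complete();
        }
        // Dispose has returned, but the callbacks may still be in flight.
        return pending.Wait(timeout);
    }
}
```

This replaces a guess about how long to wait with an explicit wait for the callbacks. It does not change the fact that Dispose itself returns early.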
Let us look at the documentation for Dispose:
This method is synchronous and blocks until the transaction has been committed or aborted.
Hmm… that is not what I am seeing here.
Any idea what is going on?
From what I see here, I would say that it is only waiting until Prepare is called, not until Commit / Rollback is called. The way I implemented things, Prepare does all the actual work, but it is the Commit that switches things around so those changes are visible. The result of this behavior is that until Commit has been called, the transaction has not really been committed.
It appears that what I am supposed to do is:
- On prepare, commit the transaction, but keep around the data required to roll it back.
- On commit, clean up the data that was kept for the rollback.
- On rollback, use that data to roll back the transaction.
- On doubt, dance a merry jig and then throw yourself off the bridge.
But that is based on the behavior of the code, not on what I am seeing in the docs, and it seems wrong.
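The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not how any real resource manager does it: TinyStore and WriteEnlistment are hypothetical names, the store is a plain in-memory dictionary, and there is no durability or recovery logic, so InDoubt simply leaves the data alone.

```csharp
using System;
using System.Collections.Generic;
using System.Transactions;

// Hypothetical in-memory store, used only for illustration.
public class TinyStore
{
    public readonly Dictionary<string, string> Data = new Dictionary<string, string>();
}

// Sketch: make the write visible in Prepare and keep undo data until the outcome is known.
public class WriteEnlistment : IEnlistmentNotification
{
    private readonly TinyStore store;
    private readonly string key;
    private readonly string newValue;
    private string oldValue;   // undo data
    private bool hadOldValue;  // undo data
    private bool applied;      // Rollback can arrive without Prepare having run

    public WriteEnlistment(TinyStore store, string key, string newValue)
    {
        this.store = store;
        this.key = key;
        this.newValue = newValue;
    }

    public void Prepare(PreparingEnlistment preparingEnlistment)
    {
        lock (store.Data)
        {
            // Remember what was there, then apply the change right away.
            hadOldValue = store.Data.TryGetValue(key, out oldValue);
            store.Data[key] = newValue;
            applied = true;
        }
        preparingEnlistment.Prepared();
    }

    public void Commit(Enlistment enlistment)
    {
        // The change is already visible; only the undo data needs dropping.
        oldValue = null;
        enlistment.Done();
    }

    public void Rollback(Enlistment enlistment)
    {
        lock (store.Data)
        {
            if (applied)
            {
                if (hadOldValue) store.Data[key] = oldValue;
                else store.Data.Remove(key);
            }
        }
        enlistment.Done();
    }

    public void InDoubt(Enlistment enlistment)
    {
        // Outcome unknown; this sketch leaves the data as it is.
        enlistment.Done();
    }
}
```

The cost of this design is that other readers can see the new value between Prepare and a later Rollback, which is exactly why it feels wrong.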
Comments
Debug or Release build? Running under debugger? Perhaps disable inlining?
Release & Debug
With & without debugger.
Inlining doesn't matter
Is it a distributed transaction (therefore managed by MSDTC) ?
Steve,
Yes
Do you still get the error if you call the TransactionScope constructor with TransactionScopeOption.RequiresNew?
Weird. It appears that you're right.
I also failed to get Rollback being called after forcing rollback from the prepare method by calling preparingEnlistment.ForceRollback().
I hope someone from the framework team will answer this.
Have you filed a Connect bug on this? (I know you don't believe in Connect, but still).
Sorry - my failure to get Rollback called was a bug on my side.
But your observation still stands.
Both Prepare() and Commit() execute on a thread different from the main thread, that executes TransactionScope.Dispose(). The documentation is accurate in that you will not observe the Dispose() method returning before the Prepare() method has been synchronously called and the DTC decided to commit the transaction.
However, my understanding of the documentation (and the actual implementation) is that the Commit() method is called after the transaction outcome has been decided, and all the parties involved have no way of effecting any changes to the transaction.
If everyone properly uses transactions or at least locks everything properly, there shouldn't be any visible problem as a result of this. For example, say that after the end of the tx scope you start another transaction that relies on the changes committed by the first transaction. The first transaction would not have released its locks before the Commit() method returned, so the second transaction will wait for these locks if it relies on the same data.
In other words, how is this such a big problem?
If you enable tracing on System.Transactions, you'll see that the transactions are committing and disposing when the using block goes out of scope, but the notification callback sometimes happens after the TransactionScope constructors are called on the next iteration.
This leads me to believe that it's probably using an AsyncCallback
Sasha,
That is a big problem. What happens if you don't rely on locks for transactions?
Case in point, and how we got this error, is a case where we work in a lock free transactional system (MVCC).
If you read immediately after the transaction, you don't get locked, you get the _committed version_, which isn't what you just finished committing.
Greg,
The problem isn't with the async, the problem is that Dispose returns before the transaction is actually committed.
The documentation for IDbTransaction says that it is for use with relational databases. It should probably read that it is for use with a database that implements locking strategies.
Given that RavenDB does not use locks, but rather MVCC, would it be possible to create your own non-IDbTransaction implementation so that MSDTC would not be utilized?
John,
There is no IDbTransaction used anywhere here.
EnlistVolatile works as you expect but the stack trace looks very different - don't know enough about all this business to see where the exact differences are. All in all it is an amazing find indeed. Looks like the EnlistmentNotification implementations out there don't really commit in the Commit implementation of the EnlistmentNotification instance, as this apparently would be a pretty bad place to do so.
Frank,
With Volatile, you don't actually get DTC
This is very worrying. What's the point of transactions where you can't be sure if they've been committed or not?
I'm having the same problem over here. The problem is indeed that the Dispose causes the Commits to be executed on a background thread, and returns immediately.
For me this is a big problem, because after the commit my application (which is a console application) exits, and some of the commits are never executed.
To make things worse: there is no way of knowing when the commits are done. Even the Transaction.TransactionCompleted event is fired before all commits have executed.
The only solution I have found so far is to either do a Thread.Sleep for a certain time, or to build in the loop below to check if some worker threads are still busy.
If you have found a better solution, I would be glad to know...
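int workerThreads;
int completionPortThreads;
int maxWorkerThreads;
int maxCompletionPortThreads;
ThreadPool.GetMaxThreads(out maxWorkerThreads, out maxCompletionPortThreads);
do
{
    // Re-read the available thread counts on every pass; otherwise the loop never ends.
    ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads);
    Thread.Sleep(10); // avoid spinning at full CPU while waiting
} while (workerThreads != maxWorkerThreads || completionPortThreads != maxCompletionPortThreads);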