Challenge: Find the bug

This piece of code caused a crashing bug on a QA server (and we are lucky it didn’t go to production).

[Image: screenshot of the code in question]

Can you spot the bug?

I will give you a hint: it is the code that isn’t there.

Posted on Tuesday, May 26, 2009 11:42 PM

Feedback


# re: Challenge: Find the bug 5/27/2009 3:25 AM configurator

Is a transaction or some other synchronization method missing here? That is, it seeks the values, then deletes them. The data could change during the deletions.


# re: Challenge: Find the bug 5/27/2009 8:17 AM Gloubidou

It's in the loop... ?


# re: Challenge: Find the bug 5/27/2009 9:16 AM Vadi

I think you're missing a check before deleting it...


# re: Challenge: Find the bug 5/27/2009 10:25 AM Dave

Where does 'data' come from? (The ApplyToKeyAndActiveVersions line.)


# re: Challenge: Find the bug 5/27/2009 11:04 AM Rafal

Ayende, where's your sweet spot in the Esent database? Do you miss good ol' days of WINAPI programming, or is Esent exceptionally performant/stable/powerful/whatever? Maybe there was a reason Microsoft has hidden it?


# re: Challenge: Find the bug 5/27/2009 12:43 PM Ayende Rahien

No, that is handled elsewhere


# re: Challenge: Find the bug 5/27/2009 12:48 PM Ayende Rahien

Dave,
Another table that we join against.


# re: Challenge: Find the bug 5/27/2009 12:50 PM Ayende Rahien

Gloubidou,
You are close...

Vadi,
No, that is not it.

Rafal,
I use it when I need a local DB. It is really easy to use once you understand how it works.
It helps that it is the only local DB with true threading support.


# re: Challenge: Find the bug 5/27/2009 12:53 PM Alexander

Maybe the issue is related to the closure over the session and data variables here: v => Api.JetDelete(session, data)


# re: Challenge: Find the bug 5/27/2009 12:55 PM Ayende Rahien

Alexander,
No, that is not it


# re: Challenge: Find the bug 5/27/2009 1:13 PM meo

Maybe "v => Api.JetDelete(session, data)" can't be run through a delegate for some reason?


# re: Challenge: Find the bug 5/27/2009 1:52 PM Ayende Rahien

meo,
No, that is not it. It works just fine.


# re: Challenge: Find the bug 5/27/2009 2:01 PM Stephen

I take it the crash is a stack overflow from an infinite loop, and that you are either missing a breaking condition or not doing something that should affect the 'position'.

Basically no idea, is this something that will probably only make sense when illustrated and explained?


# re: Challenge: Find the bug 5/27/2009 2:17 PM Gerke

What happens to the table cursor after Api.JetDelete()?
And does Api.JetDelete() immediately remove the table row or only mark it as deleted?

Otherwise the scenario could be:
- Api.JetDelete moves cursor forward
- Api.TryMovePrevious moves cursor backwards (to same row)
=> infinite loop


# re: Challenge: Find the bug 5/27/2009 2:48 PM configurator

Stephen, That would hang, not crash...


# re: Challenge: Find the bug 5/27/2009 3:25 PM Stephen

Ah, true ;)

(btw Ayende, I'm having problems submitting comments from IE7/8. Not sure if it's just me, but hitting submit doesn't do anything, it just sits there.)


# re: Challenge: Find the bug 5/27/2009 3:41 PM Ayende Rahien

Stephen,
No stack overflow, no, and not an infinite loop, but you should concentrate on the loop


# re: Challenge: Find the bug 5/27/2009 3:42 PM Ayende Rahien

Gerke,
It marks the row for deletion at the end of the current transaction. TryMovePrevious is not the issue.
The loop will terminate at the expected time.


# re: Challenge: Find the bug 5/27/2009 3:58 PM gloubidou

I guess that you are trying to clean up expired values (stuff we don't care about anymore), so having a big result set would pose a problem... am I right?


# re: Challenge: Find the bug 5/27/2009 4:23 PM aaron

do we need to know the contract of the jet api to find the problem?


# re: Challenge: Find the bug 5/27/2009 5:23 PM DuvallBuck

If the Api.RetrieveColumnAsString gets the first one then the Api.TryMovePrevious would never go to the rest but stop when the first one is deleted.


# re: Challenge: Find the bug 5/27/2009 6:22 PM Ayende Rahien

gloubidou,
Tada, you are _very_ close.

aaron,
No, you don't.


# re: Challenge: Find the bug 5/27/2009 6:24 PM Ayende Rahien

DuvallBuck,
No, that is not the behavior we have


# re: Challenge: Find the bug 5/27/2009 7:38 PM Itamar

How often do values expire?


# re: Challenge: Find the bug 5/27/2009 7:53 PM Michael Morton

You're calling the function on a fixed timer and it's being executed again, before the previous execution has finished. Considering it results in a crash, it's probably stacking a good number of times.


# re: Challenge: Find the bug 5/27/2009 8:04 PM Ayende Rahien

That is a user-defined value. In the system that we are talking about, 24 hours.


# re: Challenge: Find the bug 5/27/2009 8:05 PM Ayende Rahien

Michael,
Huh?
What timer? There isn't any external code involved that affects the bug in this function.


# re: Challenge: Find the bug 5/27/2009 9:33 PM Sander Rijken

The call to Api.JetDelete should probably be called with session and key, not session and keyS (deleting everything)?


# re: Challenge: Find the bug 5/27/2009 9:59 PM Ayende Rahien

Sander,
keys is the variable holding the table name, not a collection. It is an instruction to delete the current row.


# re: Challenge: Find the bug 5/27/2009 11:14 PM Paulo Köch

Two Api.JetDelete calls?


# re: Challenge: Find the bug 5/27/2009 11:18 PM Ayende Rahien

Everyone,
Notice what I said, what is the code that IS NOT THERE?!


# re: Challenge: Find the bug 5/27/2009 11:47 PM Neal Blomfield

You get the version but do not check it has expired?

TBH I am guessing based on what I would expect to see rather than understanding the code / api that this is using but it seems like the most logical thing.


# re: Challenge: Find the bug 5/28/2009 12:07 AM Ayende Rahien

Neal,
No, that is not it.
And you don't need to understand anything about the API to see the problem.
The problem is that until you see the problem, you don't know what it is.
I am actually encouraged that no one managed to find it.
It means that I am not that stupid


# re: Challenge: Find the bug 5/28/2009 12:11 AM gloubidou

Commit size will become huge in this case... Since you say that it is about code that isn't there, all I can come up with is setting a maximum number of expired values to be processed in the loop and processing the rest later. That's similar to ORA-01555 snapshot too old!


# re: Challenge: Find the bug 5/28/2009 12:13 AM Ayende Rahien

gloubidou,
TADA! You got it!
When you hit it exactly right, you may get a LOT of expired values.
Those expired values can be big enough to hit the commit limit, and cause an error!
So I _was_ stupid, I thought so! :-)
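
The fix gloubidou describes, capping how many expired values one pass deletes and leaving the rest for later, can be sketched roughly as follows. This is a toy in-memory simulation in Python, not the Esent API and not the code from the post: `ToyStore`, `CommitLimitExceeded`, `commit_limit`, and `max_rows` are all hypothetical names, and the commit limit is modeled simply as a cap on pending deletes per transaction.

```python
class CommitLimitExceeded(Exception):
    """Raised when a single transaction holds more pending deletes than allowed."""


class ToyStore:
    """Hypothetical in-memory stand-in for a transactional store.

    Deletes are only marked inside a transaction and applied at commit,
    and one transaction may hold at most `commit_limit` pending deletes.
    """

    def __init__(self, rows, commit_limit):
        self.rows = dict(rows)  # key -> expiry timestamp
        self.commit_limit = commit_limit
        self._pending = []

    def expired_keys(self, now):
        return [key for key, expires_at in self.rows.items() if expires_at <= now]

    def begin(self):
        self._pending = []

    def delete(self, key):
        if len(self._pending) >= self.commit_limit:
            raise CommitLimitExceeded(f"more than {self.commit_limit} pending deletes")
        self._pending.append(key)

    def commit(self):
        for key in self._pending:
            del self.rows[key]
        self._pending = []

    def rollback(self):
        self._pending = []


def delete_expired_unbounded(store, now):
    """The shape of the bug: every expired row goes into one transaction."""
    store.begin()
    try:
        for key in store.expired_keys(now):
            store.delete(key)
        store.commit()
    except CommitLimitExceeded:
        store.rollback()
        raise


def delete_expired_batch(store, now, max_rows):
    """Delete at most `max_rows` expired rows in one transaction.

    Returns the number of rows deleted; anything left over waits for the next call.
    """
    batch = store.expired_keys(now)[:max_rows]
    store.begin()
    for key in batch:
        store.delete(key)
    store.commit()
    return len(batch)


if __name__ == "__main__":
    store = ToyStore({f"key-{i}": 0 for i in range(250)}, commit_limit=100)
    # Keep calling until a pass finds nothing left to delete.
    while delete_expired_batch(store, now=1, max_rows=100):
        pass
    print(len(store.rows))
```

In this sketch `max_rows` has to stay at or below the store's limit; with 24 hours' worth of values expiring at once, the unbounded version is the one that fails, while the capped version spreads the same work over several small transactions.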


# re: Challenge: Find the bug 5/28/2009 1:51 AM firefly

Wow... gloubidou, I am impressed.

Oren, how did you uncover the bug? If the application throws an exception somewhere, then it's probably not that hard to track down the culprit. Did you catch it through stress testing?


# re: Challenge: Find the bug 5/28/2009 3:07 AM configurator

Silly question: why is there a commit limit?


# re: Challenge: Find the bug 5/28/2009 3:28 AM Ayende Rahien

As I said, we ran into this error in QA.
Once I had the stack trace and the error, it was really obvious what was wrong.

Comments have been closed on this topic.