Ayende @ Rahien

Refunds available at head office

You see that database? OFF WITH HIS HEAD!

image A while ago I was chatting with a friend that complained about a migration script timing out on his local machine. When I looked at the script, it was fairly obvious what was wrong:

DECLARE @VideoID UNIQUEIDENTIFIER
DECLARE @NewID UNIQUEIDENTIFIER

DECLARE VideoCursor CURSOR READ_ONLY
FOR
SELECT ID FROM Video

OPEN VideoCursor

FETCH NEXT FROM VideoCursor
INTO @VideoID

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @NewID = NEWID()
   
    INSERT INTO Content (ID, Body, FormatBody, Plain)
        SELECT @NewID, ContentItem.Body, Video.FormatBody, Video.Plain
        FROM ContentItem
        INNER JOIN Video
        ON Video.Id=ContentItem.ID
        WHERE Video.Id=@VideoID
   
    UPDATE Video SET ContentId=@NewID WHERE Video.Id=@VideoID
       
    UPDATE ThumbImage SET ContentId=@NewID WHERE Video_id=@VideoID
   
    FETCH NEXT FROM VideoCursor
    INTO @VideoID
END

The script was using a cursor!

Every time you want a use a cursor, you must fast for three days while reading the memoires of Edgar F. Codd.

Cursors are evil!

Let us see how we can make this work using set based logic, shall we?

INSERT INTO #TempContent 
SELECT newid() as NewId, Video.Id as OldId, ContentItem.Body, Video.FormatBody, Video.Plain
FROM ContentItem
INNER JOIN Video
ON Video.Id=ContentItem.ID
WHERE Video.Id=@VideoID

INSERT INTO Content(ID, Body, FormatBody, Plain)
SELECT NewId, ContentItem.Body, Video.FormatBody, Video.Plain

UPDATE Video 
SET ContentId=NewId 
FROM #TempContent
WHERE Video.Id=OldId

UPDATE ThumbImage 
SET ContentId=NewId 
FROM #TempContent
WHERE Video.Id=OldId

DROP TABLE #TempContent

I can assure you that this will work faster, read better, get parallelize by the database and in generally be better behaved than the previous version.

Comments

Marc Gravell
10/30/2009 10:38 AM by
Marc Gravell

Just to highlight other evils of cursors...

I was asked a question recently about a bug with a cursor that was updating the source table, causing the cursor to run forever. I could vaguely remember that there are some flags (when you declare the cursor) that control this behaviour - but in all honesty I have no interest in learning the details of those flags. I agree as with the above; the best answer is to ditch the cursor completely. Either in-place, or using a temp-table/table-variable (depending on size).

In some ways, things like ROW_NUMBER() provide answers for some of the other excuses that people use for cursors (of course, you can use an auto-incrementing identity in a temp table for the same purpose).

Andrey Shchekin
10/30/2009 10:51 AM by
Andrey Shchekin

I haven't seen any cursor-only solutions, except for schema-modifying operations.

When I worked on projects where there was no O/R at all, just a lot of stored procedures, even then there was no excuse to using a cursor and project guidelines outlawed them.

deyanp
10/30/2009 10:53 AM by
deyanp

I fully agree with your solution and with the statement that cursors are evil, BUT my experience says also that the usual causes for bad performance are missing/wrong indexes or transaction/locking issues (including lack of read uncommitted on selections) ...

chrissie1
10/30/2009 11:29 AM by
chrissie1

Of course you are right.

But apparently this is not true for Oracle, but I'm no expert.

Frans Bouma
10/30/2009 12:02 PM by
Frans Bouma

he also could have enlarged the tempdb, into which sqlserver inserts tempresults for cursors.

Cursors are in general a sign of imperative thinking, so one should always strive for set-based statements. However cursors do have their place, though the # of cases is limited.

eledu81
10/30/2009 12:06 PM by
eledu81

I think the original code lacks of a commit fase(e.g. each 1k rows)... with large data the REDO log (or whatever is called in MSSQL) may be causing the timeout..

Your example is very simple, sometimes you need some calculations for each row so you don't have any other option than using cursors..

Ayende Rahien
10/30/2009 12:08 PM by
Ayende Rahien

eledu81,

Making calcs per each row is easy with SET based logic

e
10/30/2009 12:11 PM by
e

WHERE Video.Id=@VideoID shouldn't be there in the non-cursor example.

eledu81
10/30/2009 01:09 PM by
eledu81

Yes, you can do SET, but if you have a more complex logic there like selecting from more than one table or need conditionals for calculating values, transforming data, etc you will end with a gigantic SELECT/INSERT statement or multiple statements in the same table that I think will not perform better than using cursors.

Also you will lose control, if some row have bad data the whole operation will rollback and you can't know what row was the problem, adding exception handling and logging is easy in the original code.

configurator
10/30/2009 01:33 PM by
configurator

Are you saying there is absolutely no reason whatsoever to ever even think about using cursors?

Ayende Rahien
10/30/2009 01:35 PM by
Ayende Rahien

I am saying that there are, but they are FAR rarer than the actual usage of cursors.

Set
10/30/2009 02:07 PM by
Set

I've a SP with not one nor two cursors but 5...all 5 are useless...fun...

Dennis
10/30/2009 02:39 PM by
Dennis

I lost count of how many times I've had to say "Dont do that with a cursor" :(

But it is HARD sometimes to figure out how to do something in one query, but on the other hand it made whomever I told a lot better at the next queries they made.

JeffG
10/30/2009 03:50 PM by
JeffG

deyanp mentions locking issues being a big source of performance pains. I know this is slightly off the original topic here, but I was wondering what some strategies are when performing batch updates on live databases, that may have locks.

Al
10/30/2009 05:03 PM by
Al

I think it depends on your database platform. I basically never use cursors in MSSQL. I use cursors occasionally in Oracle.

John Farrell
10/30/2009 05:12 PM by
John Farrell

@configurator

Cursors are one of those "no-bullet" things. They are never a good solution and its always possible to do something without a cursor.

Michael Morton
10/30/2009 06:16 PM by
Michael Morton

@John Farrell

I don't agree. They are simply another tool in the toolbox and there are some situations where they will be the right tool to use. The key is knowing when to use them. Though I will say that if you don't know if "now" is the right time to use them, it's not the right time to use them.

Tobin Harris
10/30/2009 11:11 PM by
Tobin Harris

I'm a fashion junky, cursors are soooo 1990s ;)

Janne Majaranta
10/31/2009 12:21 AM by
Janne Majaranta

Temp-tables are so 2k0.

When you don't need indexes, you can use table variables.

declare @mytable table (uniqueidentifier id, stuff nvarchar(4000))

insert into @mytable .......

;)

RichB
10/31/2009 09:02 AM by
RichB

@configurator>Are you saying there is absolutely no reason whatsoever to ever even think about using cursors?

@Ayende>I am saying that there are, but they are FAR rarer than the actual usage of cursors.

Which is ironic since NH's Pit of Success is cursor based operations.

Frank
10/31/2009 04:27 PM by
Frank

RichB, can you clarify that comment? I don't see how I am supposed to be doing cursor based operations when using NHibernate.

RichB
11/01/2009 08:50 PM by
RichB

@frank:

foreach(Video in session.CreateQuery("from Video")) {

session.Save(new Content {....});

}

That's not set-based. It's imperative.

Dan Howard
11/02/2009 05:53 PM by
Dan Howard

If cursors are so bad how am I supposed to know where my mouse pointer is?

Andrew
11/02/2009 10:17 PM by
Andrew

How about switching to a FAST_FOWARD cursor in these cases rather than remake one as a temp table. And if you are going to use a temp table, use a table variable. Just replace the # with @, mostly.

Everyone always says how terrible cursors are, but when I write one correctly no one can find a faster way to do the same work. Not yet, at least, and I don't write them when they aren't useful or substantially clearer code. It doesn't look like the one here is necessarily the best way to do it, but it sure looks like the cleanest.

Also, try this pattern for cursor acesss, since it removes the redundant/confusing initial access and consolidates the variable setting (but replaces with a bit odd 1=1 loop)

WHILE (1=1) BEGIN

FETCH NEXT FROM cursorName INTO @variable

IF @@FETCH_STATUS <> 0 BREAK


.. do stuff

END

END

alwin
11/03/2009 11:33 AM by
alwin

@RichB:

Maybe that is imperative, but then again, when doing that you are in imperative C# land. Not in the trenches of set-based SQL.

Eldergriffon
11/05/2009 01:03 AM by
Eldergriffon

Some observations

  • Astounded to hear that someone using cursors mainly for logical clarity does not find they are typically slower than set-based operations. I totally agree with Ayende. The rule in our shop is "if you use a cursor, you need to be able to show that nothing else will do." The reason is simple. Overwhelming experience is that cursors (even "FAST_FORWARD" ones) are much slower than set-based equivalents in SQL Server.

  • Table variables can be preferable to temp tables, but remember one thing: the optimizer does not have any statistics on the contents of a table variable. This means that if the table variable has a lot of rows, some very demented query plans can arise, particularly if it's being used in joins against large tables.

Janus Knudsen
11/13/2009 11:46 PM by
Janus Knudsen

I must disagree with many - sorry to say...

I'm however totally in agreement with the statement "Any set-based logic should be executed in a cursor", in fact.. If you cannot write set-based queries you should look for another job asap or ask somebody willing to educate you.

But but... Cursors are indeed very nice to iterate over a resultset and call stored procedures, start/ stop agent jobs etc. of course such operations rarely iterates over 100K rows.

Did you know that all replication technologies depends 100% at cursors? Don't be shocked :)

Saying cursors are evil, should be said on the right context!

Janus Knudsen
11/15/2009 02:16 PM by
Janus Knudsen

Errata: "Any set-based logic should be executed in a cursor" should of course be "Any set-based logic should NOT be executed in a cursor"

Comments have been closed on this topic.