Ayende @ Rahien

It's a girl

SQL Gotchas

I think you can imagine the amount of paint involved in having a query behave in an unexpected manner. I have run into both of those recently. This one had me doubting my sanity (imagine this on a table with several hundred thousands records, with a fairly complex query around it:

select 1

where 1 not in (2,3,null)

And then there is this interesting little query:

select 1 where 'e' = 'e   '

I refused to believe the result until I saw it myself.

Comments

Petar Repac
10/28/2007 09:53 PM by
Petar Repac

I assume that you were surprised that first select did not return any rows. It makes perfect sense if your database works with NULLS by the ANSI SQL rules. Under those rules comparing X to NULL is nor true nor false, it's unknown. So if I rewrite your first statement like this:

select 1 from MY_TABLE where ((1 2) AND (1 3) AND (1 null))

then the last comparision if unknown (and interpreted as false) . You can also try:

select 1 from MY_TABLE where (1 = null ) or (1 null)

and you will still get 0 rows. I tested on Oracle. Think that SQL Server by default works with nulls a little bit differently.

Regards, Petar

Ayende Rahien
10/28/2007 09:56 PM by
Ayende Rahien

Petar,

I would expect it if I was thinking about it using hard coded, I would have expected it to work differently for in and not in.

It makes sense that it will need to maintain the same behavior, it just that it sucks, that is all.

And that behavior is on SQL Server as well

J. Philip
10/28/2007 10:40 PM by
J. Philip

This is maybe on case whee LINQ comes in handy where these constant expressions are evaluated on the client:

First query in LINQ to SQL:

Clients.Select(i => 1).Where(i => "a" == "a ")

no result, translates to:

SELECT [t1].[value]

FROM (

SELECT @p0 AS [value]

FROM [Clients] AS [t0]

) AS [t1]

WHERE @p1 = 1

-- @p0: Input Int32 (Size = 0; Prec = 0; Scale = 0) [1]

-- @p1: Input Int32 (Size = 0; Prec = 0; Scale = 0) [0]

Query 2:

Clients.Select(i => 1).Where(i => !(new List{2, 3, null}.Contains(1)))

return all rows, translates to

SELECT @p0 AS [value]

FROM [Clients] AS [t0]

-- @p0: Input Int32 (Size = 0; Prec = 0; Scale = 0) [1]

Mats Helander
10/28/2007 11:37 PM by
Mats Helander

Interesting...I wonder if that is really the correct way for linq2sql to handle it?

/Mats

Chad Myers
10/29/2007 02:35 AM by
Chad Myers

Dear SQL (and T-SQL and PL-SQL):

DIAF.

KKTHX

Bill
10/29/2007 02:42 AM by
Bill

You can use the SQL LIKE operator (without any percentage signs) to perform an exact match.

Petar Repac
10/29/2007 08:00 AM by
Petar Repac

I would expect it if I was thinking about it using hard coded, I would have expected it to work differently for in and not in.

I'm afraid I did not understand exactly what you mean by that. You are not generating that ad-hoc SQL in runtime and than send it to db, I hope ?

Kyle Baley
10/29/2007 11:34 AM by
Kyle Baley

The first one I can understand because of the funky NULL definition. Hard to justify the second one. Especially when SELECT 1 WHERE 'e' = ' e' works "as expected".

Tobin Harris
10/29/2007 12:19 PM by
Tobin Harris

Hehe. I had exactly the same problem, and blogged about it a few months back...

http://www.tobinharris.com/2007/7/11/select-aaaargh-where-pain-in-null

Like an arrogant fool, I almost filed a bug :)

The problem is this:

"What's happening here is that a null is neither equal or unequal to anything, therefore when we ask if 1 is NOT in (null,2,3), the database can't say that's strictly true"

Petar Repac
10/29/2007 01:47 PM by
Petar Repac

The second select is not hard to understand if we know that (on Oracle):

1) Text literals are treated as CHAR datatypes and

2) when comparing CHAR datatype trailing spaces are ignored.

So 'e' = 'e ' actually evaluates to true.

There is nothing strange about this. It's just the way database works.

Petar

Tim B
10/29/2007 02:43 PM by
Tim B

Here's one I ran into at my last gig:

SELECT 1

FROM TBL_Empty

WHERE 1 = 2

And yes, TBL_Empty really was empty... not that it would matter.

Ayende Rahien
10/29/2007 02:54 PM by
Ayende Rahien

Petar ,

Of course I am generating it in runtime :-)

More seriously, I meant that the real query was:

select * from source

where id not id (select parent from child)

Avish
10/29/2007 07:41 PM by
Avish

Ayende: on MSSQL, try SET ANSI NULLS OFF. This makes all the comparisons involving nulls to behave as you'd expect them to, so you can do "Something = null" and "Something <> null" to mean "Something IS null" and "Something IS NOT null".

Comes in extremely handy with parameters:

SET ANSI NULLS OFF

SELECT * FROM Table WHERE ParentID = @parentId

-- @parentId can be null, which would yield all "orphaned" rows

Petar: You say "There is nothing strange about this. It's just the way database works". Of course there's something strange about this. The fact that this is the way the DB works is irrelevant - it is contrary to intuition and stands in contrast to every other programming language on the planet. That, my friend, is exactly what I call "strange".

The fact that you use "=" for a kinda-like-it comparison, but using the "LIKE" keyword gives you an exact match - well, that's just plain stupid.

Chris Wuestefeld
10/29/2007 09:42 PM by
Chris Wuestefeld

I knew all about the first one already, but the 2nd one really made me say "hmmmm". The weird thing is that when I copy-and-paste the SQL code from my newsreader, it returns a 1, suggesting that a string is equal to the same string with appended spaces. What?

Upon closer examination, stranger things appeared:

  1. A string that I append spaces to manually does, in fact, evaluate as being equality. I assume that this has something to do with converting between fixed-length CHARs and varchars.

  2. The string that I copy from Firefox behaves exactly as I describe above.

  3. The string that I copy from my newsreader evaluates them as NOT equal, returning no rows. A bit more investigation shows that the padding character copied out of my newsreader isn't an ascii 32, but is rather an ascii 160, a non-breaking space. Go figure.

Avish
10/29/2007 09:54 PM by
Avish

ASCII has non-breaking space? That's news.

Petar Repac
10/29/2007 11:04 PM by
Petar Repac

Avish: Try this:

static void Main(string[] args) {

  Console.WriteLine(1 == 0.99999999999999999);

}

Is this strange ? Is it contrary to intuition ?

Equals operator is dependent on its operators type. The fact that symbol "=" is used for many different equals operators doesn't matter. Comparing numbers is different than comparing CHARs or VARCHARs or matrices or complex objects. Or strings in C#.

Vitaly
10/30/2007 03:24 AM by
Vitaly

I would imagine that 'e' = 'e ' returns true to make char and varchar comparisons "seamless" since char fields typically have whitespace padding if the value doesn't occupy the fixed length. If the above condition were to evaluate to false, then comparing CHARs and VARCHARs would be syntactically brutal (would need casts all over the place).

Chris May
10/30/2007 11:26 PM by
Chris May

See this will teach me to not stay current with your blog posts.

I ran into this EXACT problem the day after you posted this, and I ended up writing my own post on the topic.

http://www.chrismay.org/2007/10/29/SQL+NOT+IN+Will+Fail+If+The+List+Contains+A+Null.aspx

Mats Helander
11/03/2007 07:11 PM by
Mats Helander

Avish,

"it is contrary to intuition and stands in contrast to every other programming language on the planet. "

Be as it may with programming languages, I don't agree about your argument about intuition.

Null means that you don't know the answer to the question.

Say that I have compiled three different lists, each list containing the names of, say, all the congressmen and each list representing something compromising. By the name of each congressman, on each list, I can mark "yes (they have done it)", "no (they haven't done it)" or "null (unknown if they have done it)" - that is, we use a nullable boolean by each name.

Now, say my republican sponsor asked of me: Give me those lists where it is known that no republicans have done the compromising thing, so I can use it in my campaign.

What if I then gave him the lists where the repuclican names had null values? Later in the campaign, it would turn out - once the value for the bit did become known - that they had indeed done whatever it was and it would be embarrassing for the campaign.

If the sponsor had asked me: Give me those lists where no republican names are known to have done the compromising thing, then I could have given him the lists with the nulls, but not to satisfy his original request, which is the request that translates to what you ask of the database with the SQL query in question.

/Mats

Comments have been closed on this topic.