Ayende @ Rahien

Refunds available at head office

Linq for NHibernate Adventures

So for the last few hours I have been getting back into the Linq for NHibernate project, after having left it for far too long. I am beginning to think that building a Linq provider might not have been the best way to learn C# 3.0, but never mind that now.

It is with great pride that I can tell you that the following query now works:

from e in db.Employees
select new { Name = e.FirstName + " " + e.LastName, Phone = e.HomePhone }

Why is this trivial query important? Well, it is important because about a year ago I said that there is only a single feature in Linq to SQL that NHibernate doesn't have. I also promised to fix it the moment someone told me what it is. No one did, so it waited until today.

The main issue had to do with the Criteria API and handling parameters; no one ever needed to do that, it seems. When they did, they generally used HQL, which did have this feature. Since I have based Linq for NHibernate on the Criteria API*, that was a problem.

Now that ReSharper works on C# 3.0, I can actually get things done there, so I sat down and implemented it. It is a surprisingly difficult issue, but I threw enough code at it until it gave up (I wonder if there is a name for nested visitors... ).

At any rate, I strongly recommend that you take a look at the project. Bug (fixes) and other patches are welcome**.

* This decision was important for two reasons: the first was that it is generally easier to use the Criteria API programmatically, and the second was that I wanted to ensure that the Criteria API (which I favor) was as full featured as the HQL route.

** Yes, I know that I'll regret it.

Comments

Ray
02/16/2008 09:32 PM by Ray

Good job mate!

Pete W
02/17/2008 03:03 AM by Pete W

"nested visitors"...

I haven't seen the code yet, but "nested visitor" gives me a vision of some sort of intermediary expression structure between Linq and NHibernate expressions.

Ayende, I am very pleased to hear about the revival of this project. My last version of the trunk gave me failing tests. I presumed that everyone lost interest in this project, but I never figured out why.

Maybe now that ReSharper finally works with Linq, we will see a "spike" of new interest in Linq and the other C# 3.0 features.

Onur Gumus
02/17/2008 06:51 AM by Onur Gumus

Is there support for detached criteria so that we can use it with Castle Active Record as well? Or is there any other way to integrate it with ActiveRecord rather than using ActiveRecordMediator?

Ayende Rahien
02/17/2008 08:15 AM by Ayende Rahien

Nested visitors refers to a visitor that creates another visitor (of the same type) to handle additional information.

But yes, it is definitely a mapping from Linq concepts to Criteria concepts.
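The nested visitor idea can be sketched in a few lines. This is only an illustration, not the project's code: it assumes the `System.Linq.Expressions.ExpressionVisitor` base class that later shipped with .NET 4.0, and `MemberNameVisitor` and `Employee` are made-up names. The visitor collects member names, and for each side of a binary node it creates a fresh visitor of its own type and merges the child's findings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical visitor: collects the member names an expression touches.
// For a binary node it hands each operand to a NEW visitor of the same
// type and merges the child results, which is the "nested visitor" idea.
public sealed class MemberNameVisitor : ExpressionVisitor
{
    public List<string> Names { get; } = new List<string>();

    protected override Expression VisitMember(MemberExpression node)
    {
        Names.Add(node.Member.Name);
        return node; // no need to descend below a member access here
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        foreach (var operand in new[] { node.Left, node.Right })
        {
            var child = new MemberNameVisitor(); // nested visitor
            child.Visit(operand);
            Names.AddRange(child.Names);
        }
        return node;
    }

    public static List<string> Collect(LambdaExpression lambda)
    {
        var visitor = new MemberNameVisitor();
        visitor.Visit(lambda.Body);
        return visitor.Names;
    }
}

// Made-up entity, mirroring the query in the post.
public sealed class Employee
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class NestedVisitorDemo
{
    public static void Main()
    {
        Expression<Func<Employee, string>> projection =
            e => e.FirstName + " " + e.LastName;

        // Prints: FirstName, LastName
        Console.WriteLine(string.Join(", ", MemberNameVisitor.Collect(projection)));
    }
}
```

Each child visitor keeps its own state, so the parent decides how to merge it; that is the main reason to nest rather than reuse one instance.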

Ayende Rahien
02/17/2008 08:17 AM by Ayende Rahien

Onur,

I'll ensure that it will work with Active Record, have no fear in this regard :-)

Anonymous
02/17/2008 09:01 AM by Anonymous

Will there be LINQ for ActiveRecord?

Ayende Rahien
02/17/2008 09:04 AM by Ayende Rahien

Not likely.

There might be a bridge between Active Record and Linq for NHibernate, but I'll try to ensure that even this is not necessary.

Zoltan Hubai
02/17/2008 09:06 AM by Zoltan Hubai

Thanks for your great work. I'm just learning Linq, NHibernate and ActiveRecord, and doing some small tests.

With Ken Egozi's blog post http://www.kenegozi.com/Blog/2007/11/18/activerecord-dot-linq-naive-but-working.aspx

I was able to use nhibernate.linq with activerecord.

I have a small problem, not sure if this should work:

    using (new SessionScope())
    {
        ActiverecordContext ac = ActiverecordContext.Instance();
        var q = from d in ac.Session.Linq<Country>() where d.Name.ToLower().StartsWith("a") select d;
    }

I'm always getting the following error:

Index was out of range. Must be non-negative and less than the size of the collection.

The error occurs in QueryUtil.cs at line 174:

                names = GetMemberNames(((MethodCallExpression)expr).Arguments[0]);

expr is d.Name.ToLower().

Am I doing something wrong, or is this type of query not supposed to work?

Also, is there a way to use a query more than once? Like:

    var q = from d in ac.Session.Linq<Country>() select d;
    int count = q.Count();
    List<Country> countries = q.ToList();

I tried this but I got an error.

Zoltan

Ayende Rahien
02/17/2008 09:08 AM by Ayende Rahien

Zoltan,

I don't think that we support "d.Name.ToLower().StartsWith("a")" at the moment.

This is a bug.

What is the error you got there?

Zoltan Hubai
02/17/2008 09:15 AM by Zoltan Hubai

The error that I'm getting when using d.Name.ToLower().StartsWith("a")?

[ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection.
Parameter name: index]
   System.ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument argument, ExceptionResource resource) +62
   System.ThrowHelper.ThrowArgumentOutOfRangeException() +12
   System.SZArrayHelper.get_Item(Int32 index) +2653952
   System.Collections.ObjectModel.ReadOnlyCollection`1.get_Item(Int32 index) +50
   NHibernate.Linq.QueryUtil.GetMemberNames(Expression expr) in H:\temp\svn\NHibernate.Linq\NHibernate.Linq\QueryUtil.cs:174
   NHibernate.Linq.QueryUtil.GetMemberName(Expression expr) in H:\temp\svn\NHibernate.Linq\NHibernate.Linq\QueryUtil.cs:16
   NHibernate.Linq.WhereArgumentsVisitor.GetLikeCriteria(MethodCallExpression expr, MatchMode matchMode) in H:\temp\svn\NHibernate.Linq\NHibernate.Linq\WhereArgumentsVisitor.cs:226
   NHibernate.Linq.WhereArgumentsVisitor.VisitCallExpression(MethodCallExpression expr) in H:\temp\svn\NHibernate.Linq\NHibernate.Linq\WhereArgumentsVisitor.cs:61

There is more, but I'm not sure if I should post it here.

Ayende Rahien
02/17/2008 09:19 AM by Ayende Rahien

Let us take this discussion to the rhino tools dev mailing list, ok?

Frans Bouma
02/17/2008 11:14 AM by Frans Bouma

Mapping String.Concat calls to a db function isn't that hard. The fun begins when people start mixing in-memory calls with db calls in the projection. :) Yes, this is doable; try:

    var q = from o in nw.Order
            select DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month);

This is nasty for a couple of reasons:

1) you have multiple fields in the query which are used to produce a single end value

2) you have to create in-memory delegates which are called to produce the end result.

This is only solvable with an all-purpose approach, otherwise you'll run into problems when someone instantiates an in-memory object, passes values from the projection to the ctor and calls a method on it to obtain the Real value etc.

Why would you need a nested visitor? The only reason I needed a new visitor inside a pass (I use 6 passes over the tree, rewriting it in every pass) is to look things up at that level by traversing a subtree, and only on some occasions. (I think 'handler' or 'crawler' is more appropriate. 'Visitor' suggests the visitor pattern is implemented by MS, which isn't the case: Expression doesn't have a Visit virtual method. :( )

Ayende Rahien
02/17/2008 11:39 AM by Ayende Rahien

Frans,

The complexity was that NH didn't have the concept of doing this through the criteria API, not the Linq stuff itself.

Hibernate is really biased toward using HQL, which means that the Criteria API side has been functional, but not as full featured.

Broadly, Linq for NHibernate is using two major parts of NH, ICriterion and IProjection.

ICriterion is used for booleans, IProjection for selects.

The problem is that Linq mixes them fairly freely, and that was never in the plan for Hibernate.

I added support for IProjection to consume ICriterion and I am working on making ICriterion consume IProjection now.

Fun stuff, and it significantly enhances NH's criteria query abilities.

Of course, I still think that who ever designed Linq was mad.

The example that you gave is a good example.

There is no way to know where it is going to run, and that is a bad mojo.

What happens if it was:

    var q = from o in nw.Order
            where DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month) == 30
            select o;

This means that you have to load the ENTIRE TABLE to memory to do this.

Crazy, crazy, crazy.

I don't want something as trivial as that causing that much trouble.

And I built my own primitive visitor for that.

I think that the fact that they didn't provide an expression visitor is a flat out shame.

It is not like anyone else needs that, right?

Mats Helander
02/17/2008 11:48 AM by Mats Helander

Frans,

You mean an abstract (not virtual) Visit method, right ? ;-)

But I'm not sure I understand this:

"otherwise you'll run into problems when someone instantiates an in-memory object, passes values from the projection to the ctor and calls a method on it to obtain the Real value etc. "

Could you elaborate?

/Mats

Frans Bouma
02/17/2008 02:18 PM by Frans Bouma

"Broadly, Linq for NHibernate is using two major parts of NH, ICriterion and IProjection. ICriterion is used for booleans, IProjection for selects.

The problem is that Linq mixes them fairly freely, and that was never in the plan for Hibernate."

Yes, you need projections which represent 'derived tables' (SQL term) a LOT. I had to add it to LLBLGen Pro as well to make things work.

"Of course, I still think that who ever designed Linq was mad."

haha :D I agree. Some stuff is OK, but other things are flat-out stupid. I mean: who came up with the lame idea of deferred execution of linq queries, where ONLY some of the queries are actually deferred!

This one is:

var q = from c in nw.Customer select c;

but this one isn't:

var q = (from c in nw.Customer select c.Country).Contains("USA");

That last one is executed immediately...
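The difference can be reproduced without a database. The sketch below uses LINQ to Objects purely as a stand-in (a real provider would send SQL at the points where the counter changes); `Countries` and `Reads` are made-up names. Operators that return a sequence only build a query, while operators that return a single value, such as Contains, have to run the query to produce it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Counts how many times the source was actually enumerated.
    public static int Reads;

    // Iterator method: the body only runs when someone enumerates it.
    public static IEnumerable<string> Countries()
    {
        Reads++;
        yield return "USA";
        yield return "Germany";
    }

    // Returns the value of Reads after each of the three steps.
    public static int[] Run()
    {
        Reads = 0;

        // Sequence-returning query: only built, nothing is read yet.
        var query = from c in Countries() select c;
        int afterBuild = Reads;

        // Scalar-returning operator: has to run immediately.
        bool hasUsa = (from c in Countries() select c).Contains("USA");
        int afterContains = Reads;

        // The first query finally runs when it is enumerated.
        List<string> list = query.ToList();
        int afterToList = Reads;

        return new[] { afterBuild, afterContains, afterToList };
    }

    public static void Main()
    {
        // Prints: 0, 1, 2
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```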

"The example that you gave is a good example.

There is no way to know where it is going to run, and that is a bad mojo.

What happens if it was:

    var q = from o in nw.Order
            where DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month) == 30
            select o;

This means that you have to load the ENTIRE TABLE to memory to do this.

Crazy, crazy, crazy."

True, that's the problem. However IGNORING this isn't helping as people will want to execute it in the projection.

The trick is that you should have a generic way to map a call onto a database function. If such a mapping isn't found, you keep the call around. When handling the projection, you have support for Call and MemberAccess, all other places you don't. It then ends up in tears in an exception, what you want in this case. This still isn't fail proof though, a nested select in a join branch for example with a method call in the projection will cause problems.
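A rough sketch of that lookup, under stated assumptions: `FunctionMappings`, `Clause` and the SQL templates are invented for illustration and are not taken from LLBLGen Pro or NHibernate. A mapped call is turned into SQL text, an unmapped call is kept for in-memory execution when it sits in the projection, and an unmapped call anywhere else throws.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical: where in the query the method call was found.
public enum Clause { Where, Projection }

public static class FunctionMappings
{
    // .NET method -> SQL template; {0}, {1}, ... are the already
    // translated arguments. The entries are examples only.
    private static readonly Dictionary<MethodInfo, string> Map =
        new Dictionary<MethodInfo, string>
        {
            { typeof(string).GetMethod("ToLower", Type.EmptyTypes), "LOWER({0})" },
            { typeof(string).GetMethod("ToUpper", Type.EmptyTypes), "UPPER({0})" },
        };

    // Returns SQL for a mapped call. Returns null when the call has no
    // mapping but may stay in memory (projection only). Throws otherwise.
    public static string Translate(MethodCallExpression call, Clause clause,
                                   params string[] sqlArgs)
    {
        string template;
        if (Map.TryGetValue(call.Method, out template))
            return string.Format(template, (object[])sqlArgs);

        if (clause == Clause.Projection)
            return null; // keep the call around for in-memory execution

        throw new NotSupportedException(
            call.Method.Name + " has no database mapping and cannot be used in a "
            + clause + " clause.");
    }
}

public static class FunctionMappingDemo
{
    public static void Main()
    {
        Expression<Func<string, string>> lower = s => s.ToLower();
        // Prints: LOWER(t0.Name)
        Console.WriteLine(FunctionMappings.Translate(
            (MethodCallExpression)lower.Body, Clause.Where, "t0.Name"));

        Expression<Func<DateTime, int>> days = d => DateTime.DaysInMonth(d.Year, d.Month);
        // Prints: True (no mapping, but allowed in the projection)
        Console.WriteLine(FunctionMappings.Translate(
            (MethodCallExpression)days.Body, Clause.Projection) == null);
    }
}
```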

"And I built my own primitive visitor for that.

I think that the fact that they didn't provide an expression visitor is a flat out shame. It is not like anyone else needs that, right?"

Matt Warren made one available on his blog: http://blogs.msdn.com/mattwar

It's the same as the one inside the .NET framework (which is internal. Joy..)

It has some flaky routines though, so you better write your own (it's fairly straightforward). I peeked into your code this morning and I saw you use a different approach than I do: you try to handle everything at once instead of re-writing the tree element by element. This is cumbersome, as with joins for example (groupjoin etc.) you need to refer to parts of the tree already processed, so calling out into different handlers isn't going to cut it: you need one big handler to merge everything together (which handles a tree which is pre-processed a couple of times by rewriting elements.)

@Mats: I meant a method which the visitor calls by passing itself to it :). Yes, abstract is fine; virtual doesn't make sense indeed, as you have to override it in all cases.

"Could you elaborate"

    var q = from c in nw.Customers
            select new Foo(c.Country, c.City).GetSomeValue();

I'm not completely done with this scenario: my 'new' handler sees this as a projection to new Foo instances, which isn't the case; it's a list of result values from GetSomeValue(). I'm not sure if this is doable though, as it's pretty tough to distinguish whether it's a list of Foo's or a list of result values from GetSomeValue().

I don't think it's a common scenario, but it illustrates the point ;). (I haven't tried whether Linq to SQL can handle this though ;))

Mats Helander
02/18/2008 04:52 AM by Mats Helander

Frans,

I just tried the following example using LINQ to SQL:

var q = from c in db.Customers select new Foo(c.Country, c.City).GetSomeValues();

with the following implementation of Foo:

    public class Foo
    {
        public Foo(string country, string city)
        {
            this.country = country;
            this.city = city;
        }

        private string country = "";
        private string city = "";

        public string GetSomeValues()
        {
            return country + "_" + city;
        }
    }

It resulted in the following SQL:

    SELECT [t0].[Country] AS [country], [t0].[City] AS [city]
    FROM [dbo].[Customers] AS [t0]

And the following output (I had one customer):

Mr_Doki

I'm still not sure what the problem is, exactly?

Yes, in this case the operations inside GetSomeValues() could in theory have been transformable to SQL and executable by the database, so that all records in the table wouldn't have to be loaded into memory... is that what you are referring to? Because while it happens to be true in this particular case that the operation could be turned into SQL, it wouldn't be true in the general case.

/Mats

Mats Helander
02/18/2008 05:13 AM by Mats Helander

@Ayende

"Of course, I still think that who ever designed Linq was mad."

LOL - you say that like it was a BAD thing! :-P

@Frans (more),

"This one is:

var q = from c in nw.Customer select c;

but this one isn't:

var q = (from c in nw.Customer select c.Country).Contains("USA");

That last one is executed immediately..."

Well, your query is a l2o (linq to objects) query using the results of a l2s query as its source of objects. Since l2o queries are executed directly, the observed behavior seems to make sense? The inner l2s query will be deferred until someone executes it, but since that someone is the outer l2o query, it will be executed directly.

"It then ends up in tears in an exception, what you want in this case."

I don't agree. If I write:

    var q = from o in nw.Order
            where DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month) == 30
            select o;

Then that (what I just wrote, in code) is what I want to happen. It will come as no surprise to me if I write that statement that the whole table will be loaded, as that is what I have fairly explicitly asked for. Throwing an exception to inform me that I'm doing what I know I am doing and then refusing to do it doesn't seem helpful?

/Mats

Ayende Rahien
02/18/2008 05:32 AM by Ayende Rahien

Mats,

"Then that (what I just wrote, in code) is what I want to happen. It will come as no surprise to me if I write that statement that the whole table will be loaded, as that is what I have fairly explicitly asked for. Throwing an exception to inform me that I'm doing what I know I am doing and then refusing to do it doesn't seem helpful?"

There are two things going on there. The first is the technical feasibility of this. The second is the gross violation of the principle of least surprise.

You really can't look at this statement and tell me what it does:

    var q = from o in nw.Order
            where DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month) == 30
            select o;

For Linq to NHibernate, we have decided that all DB methods are extension methods on the IDbMethods interface, which makes it easier to distinguish between in-memory and in-DB methods.

This is important because you really want this kind of thing to be clearly visible for you.

Frans Bouma
02/18/2008 09:11 AM by Frans Bouma

@Mats:

"I'm still not sure what the problem is, exactly?

Yes, in this case the operations inside GetSomeValues() could in theory have been transformable to SQL and executable by the database, so that all records in the table wouldn't have to be loaded into memory... is that what you are referring to? Because while it happens to be true in this particular case that the operation could be turned into SQL, it wouldn't be true in the general case."

The problem is that the query feeds data to a delegate which is executed on the raw resultset coming from the db and the RESULT of that delegate is the result value for each row.

EVERY linq provider has to implement this scenario, otherwise the query you tested won't work at all: you'll get a crash somewhere, as the method call to GetSomeValues is inside the expression tree. You can't ignore it, you have to implement code to execute it.

So it's:

  • generate SQL to produce the input values for the in-memory delegate you're going to execute in the projection engine

  • projection engine applies delegate onto input to produce the projection results.
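The two steps above can be sketched as follows. This is an illustration only, not how Linq to SQL or LLBLGen Pro do it: `ColumnExtractor` is a made-up name, it assumes the `ExpressionVisitor` base class from .NET 4.0, and it assumes the projection only touches the entity through direct property accesses. Every `c.Column` is replaced by a lookup into a raw row, which yields both the column list for the SQL and a compiled delegate for the projection engine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public sealed class Customer
{
    public string Country { get; set; }
    public string City { get; set; }
}

public sealed class Foo
{
    private readonly string country;
    private readonly string city;
    public Foo(string country, string city) { this.country = country; this.city = city; }
    public string GetSomeValues() { return country + "_" + city; }
}

// Hypothetical helper: rewrites entity.Column into (T)row[i] and
// remembers which columns the SQL has to select.
public sealed class ColumnExtractor : ExpressionVisitor
{
    private readonly ParameterExpression entity;
    private readonly ParameterExpression row =
        Expression.Parameter(typeof(object[]), "row");

    public List<string> Columns { get; } = new List<string>();

    private ColumnExtractor(ParameterExpression entity) { this.entity = entity; }

    protected override Expression VisitMember(MemberExpression node)
    {
        if (node.Expression != entity) return base.VisitMember(node);

        int index = Columns.IndexOf(node.Member.Name);
        if (index < 0) { index = Columns.Count; Columns.Add(node.Member.Name); }

        // (T)row[index]
        var cell = Expression.ArrayIndex(row, Expression.Constant(index));
        return Expression.Convert(cell, node.Type);
    }

    public static (List<string> Columns, Func<object[], TResult> Project)
        Split<TEntity, TResult>(Expression<Func<TEntity, TResult>> projection)
    {
        var extractor = new ColumnExtractor(projection.Parameters[0]);
        Expression body = extractor.Visit(projection.Body);
        var project = Expression
            .Lambda<Func<object[], TResult>>(body, extractor.row)
            .Compile();
        return (extractor.Columns, project);
    }
}

public static class ProjectionDemo
{
    public static void Main()
    {
        var split = ColumnExtractor.Split<Customer, string>(
            c => new Foo(c.Country, c.City).GetSomeValues());

        // Step 1: these columns go into the SELECT list. Prints: Country, City
        Console.WriteLine(string.Join(", ", split.Columns));

        // Step 2: the delegate runs on each raw row. Prints: Mr_Doki
        Console.WriteLine(split.Project(new object[] { "Mr", "Doki" }));
    }
}
```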

"Well, your query is a l2o (linq to objects) query using the results of a l2s query as its source of objects. Since l2o queries are executed directly, the observed behavior seems to make sense? The inner l2s query will be deferred until someone executes it, but since that someone is the outer l2o query, it will be executed directly."

No it's not! It's a DB query! :) It results in something like:

SELECT CASE WHEN NOT EXISTS (.... ) THEN 1 ELSE 0 END FROM <...>

NONE of the queries I posted executes ANY linq to objects code. None. That's the hard part of writing a linq provider: you get an expression tree, you have to convert EVERY bit to sql, otherwise the WHOLE query will fail.

An exception is the stuff which can be converted to in-memory code, like:

    var q = from c in nw.Customer
            where new string[] { "USA", "Germany" }.Where(x => x.StartsWith("U")).Contains(c.Country)
            select c;

Here, the new string[] { "USA", "Germany" }.Where(x => x.StartsWith("U")) part is an in-memory construct. You can find these with a funcletizer (do a Google search, you'll find the 3 entries about it and the code) and compile it into a delegate.
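A much simplified funcletizer might look like the sketch below. It is not the code from the blog entries mentioned above: `Funcletizer` and `ParameterFinder` are made-up names, it assumes the .NET 4.0 `ExpressionVisitor`, and it deliberately leaves nested lambdas untouched. Any subtree that does not reference the query parameter is compiled, executed once, and replaced by a constant holding its result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public sealed class Customer
{
    public string Country { get; set; }
}

// Hypothetical, simplified funcletizer: pre-evaluates every subtree that
// does not depend on the query parameter.
public sealed class Funcletizer : ExpressionVisitor
{
    // Reports whether a subtree mentions the given parameter.
    private sealed class ParameterFinder : ExpressionVisitor
    {
        private readonly ParameterExpression target;
        public bool Found;
        public ParameterFinder(ParameterExpression target) { this.target = target; }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            if (node == target) Found = true;
            return node;
        }
    }

    private readonly ParameterExpression queryParameter;

    private Funcletizer(ParameterExpression queryParameter)
    {
        this.queryParameter = queryParameter;
    }

    public static Expression<Func<T, bool>> Rewrite<T>(Expression<Func<T, bool>> predicate)
    {
        var visitor = new Funcletizer(predicate.Parameters[0]);
        Expression body = visitor.Visit(predicate.Body);
        return Expression.Lambda<Func<T, bool>>(body, predicate.Parameters);
    }

    public override Expression Visit(Expression node)
    {
        if (node == null) return null;

        switch (node.NodeType)
        {
            // Nothing to gain, or not safe to evaluate in isolation.
            case ExpressionType.Constant:
            case ExpressionType.Parameter:
            case ExpressionType.Lambda:
            case ExpressionType.Quote:
                return node;
        }

        var finder = new ParameterFinder(queryParameter);
        finder.Visit(node);
        if (finder.Found) return base.Visit(node); // depends on the row: descend

        // In-memory part: run it now and keep only the result.
        object value = Expression.Lambda(node).Compile().DynamicInvoke();
        return Expression.Constant(value, node.Type);
    }
}

public static class FuncletizerDemo
{
    public static void Main()
    {
        Expression<Func<Customer, bool>> filter =
            c => new[] { "USA", "Germany" }.Where(x => x.StartsWith("U")).Contains(c.Country);

        var rewritten = Funcletizer.Rewrite(filter);
        var call = (MethodCallExpression)rewritten.Body;

        // The array-and-Where part is now a constant. Prints: Constant
        Console.WriteLine(call.Arguments[0].NodeType);
    }
}
```

What is left after the rewrite is a Contains over a constant list and `c.Country`, which a provider can translate to an IN clause.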

"I don't agree. If I write:

    var q = from o in nw.Order
            where DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month) == 30
            select o;

Then that (what I just wrote, in code) is what I want to happen. It will come as no surprise to me if I write that statement that the whole table will be loaded, as that is what I have fairly explicitly asked for. Throwing an exception to inform me that I'm doing what I know I am doing and then refusing to do it doesn't seem helpful?"

Good luck with that. It can't be done. The problem is: you need results of the in-memory query INSIDE the db! Check:

    var q = from o in nw.Order
            where DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month) == 30
            join c in nw.Customer on o.CustomerID equals c.CustomerID
            select c;

You need the in-memory query result back in the db. You can imagine that it's possible to create a query where you need to pass resultsets back and forth multiple times to be able to produce the results (if applicable at all). This is not doable.

This is the weak side of Linq: the developer can tie things together which actually can't be tied together. In Linq to objects I can group on boolean expression results, in the DB I can't. So the same linq query can't run on the DB. For a developer it's not obvious why this is.

Carlos
02/18/2008 01:00 PM by Carlos

I'm new to .NET, but all this reminds me (déjà vu?) of MS Access and Access SQL with linked tables (and pass-through SQL).

Frans Bouma
02/19/2008 05:19 PM by Frans Bouma

I solved this btw:

    var q = from o in nw.Order
            where DateTime.DaysInMonth(o.OrderDate.Value.Year, o.OrderDate.Value.Month) == 30
            select o;

with a custom mapping. See my latest blog post. The SQL expression is horrible to look at, but who cares :)

Jimmy Bogard
03/17/2008 06:08 PM by Jimmy Bogard

Just took a look at the source - looks like the solution file needs to be changed for VS 2008 instead of the Orcas beta.
