Syntax: Multi Something
As I have already explained, I am doing a lot of work with NHibernate's MultiCriteria and MultiQuery. They are very powerful, but they also mean that I am working at a level that has a lot of power and a fairly complex syntax. I want to improve that, but I am not sure what the best way to do it is. Anything here is blog-code, meaning that I haven't even verified that it has valid syntax. It is just some ideas about how this can go; I am looking for feedback.
The idea here is to have a better way to use NHQG expressions, and to remove the need to manually correlate between the index of the added query and the index in the result set. It should also give you better syntax for queries that return a unique result.
new CriteriaQueryBatch()
    .Add(Where.Post.User.Name == "Ayende", OrderBy.Post.PublishedDate.Desc)
        .Paging(0, 10)
        .OnRead(delegate(ICollection<Post> posts) { PropertyBag["posts"] = posts; })
    .Add(Where.Post.User.Name == "Ayende")
        .Count()
        .OnRead(delegate(int count) { PropertyBag["countOfPosts"] = count; })
    .Execute();
Waiting for your thoughts...
Comments
I really like the fact that you get rid of the explicit indexing.
Well, I have to say...the syntax you have proposed is actually quite to my liking :)
I have just written a way too long blog post (as usual) that has some relevance to multi querying.
http://www.matshelander.com/wordpress/?p=50
If anyone with experience in this type of thing could tell me why my thinking is wrong it would be hugely appreciated, since this is something I have been struggling with for years...
/Mats
pretty cool syntax, but it would be even cooler if you could do it like this:
ICollection<Post> posts;
int count;
new CriteriaQueryBatch()
.Add(Where.Post.User.Name == "Ayende", OrderBy.Post.PublishedDate.Desc)
.Paging(0, 10)
.StoreIn(ref posts)
.Add(Where.Post.User.Name == "Ayende")
.Count()
.StoreIn(ref count)
.Execute();
not sure if it's actually possible though
@ Ayende,
Pulling the meat from my blog post...do you know if NHibernate suffers from the same problem, that if I do:
Select * From Customer Where Id = 42;
Select * From Order Where Customer.Id = 42;
Select * From OrderLine Where Order.Customer.Id = 42;
(I'm using NPath syntax since I'm not familiar enough with HQL, but I'm hoping you'll see what the statements mean)
...then this will not prevent lazy loading from occurring when I access the customer.Orders property and the order.OrderLines properties?
In NPersist, the order.Customer and orderLine.Order properties will be loaded after running these three queries, but the customer.Orders property and the order.OrderLines properties will not be.
Is that the same in NHibernate?
/Mats
+1 anything that makes it easier to use is always awesome. -d
This looks cool! I think the paging and count queries are common enough however that they deserve their own methods:
new CriteriaQueryBatch()
This way you only have to specify the criteria once!
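For example, something along these lines (just a sketch; the FindPage name and signature are guesses, not an existing API):

new CriteriaQueryBatch()
    // FindPage would build both the paged query and the matching count
    // query from a single criteria, then hand both results to the callback.
    .FindPage(0, 10, Where.Post.User.Name == "Ayende", OrderBy.Post.PublishedDate.Desc)
    .OnRead(delegate(ICollection<Post> posts, int count)
    {
        PropertyBag["posts"] = posts;
        PropertyBag["countOfPosts"] = count;
    })
    .Execute();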
Bobby
Great idea. I would really like the combination of StoreInto and FindPage, so:
.SelectPaged(0, 10, Where....).Into( ref ICollection<Post>, ref int)
.Select(Where...).Into(ref ICollection<Post>);
IMO the idea of a query is to get a value, not to execute some function with the results.
Imagine if I need to write a try/catch in that anonymous method, or if I need to use the results in order to make another such query :). I would have to use local variables that are set from within the anonymous method call.
For me, a similar approach hurt more when I needed to do something like this:
class Foo {
bool TryToGet(ref ICollection results)
{
}
}
This will not work because ref or out parameters cannot be used inside an anonymous method. So here goes the alternative:
class Foo {
bool TryToGet(ref ICollection results)
{
}
}
... and this is not very nice
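The usual workaround is to capture a local variable inside the anonymous method and copy it into the ref parameter afterwards; a rough sketch, reusing the batch API from the post:

class Foo
{
    bool TryToGet(ref ICollection<Post> results)
    {
        // ref/out parameters cannot be captured by an anonymous method,
        // so use a local, capture that, and copy it back afterwards.
        ICollection<Post> local = null;
        new CriteriaQueryBatch()
            .Add(Where.Post.User.Name == "Ayende")
            .OnRead(delegate(ICollection<Post> posts) { local = posts; })
            .Execute();
        results = local;
        return results != null && results.Count > 0;
    }
}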
Davy,
Technically, this is possible, but it is something that would require pinning the memory and using unsafe code, not really a good idea.
Bobby,
Good idea, I assume that FindPage will automatically create the second query, right?
Yes, the FindPage would create two queries under the hood. What do you think of this syntax, for C# 3.0:
new CriteriaQueryBatch()
and for C# 2.0 it would be:
new CriteriaQueryBatch()
so the developer doesn't forget to include the OnRead callback?
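Roughly, something like this (a sketch; the exact overloads are only illustrative):

// C# 3.0, with a lambda:
new CriteriaQueryBatch()
    .FindPage<Post>(0, 10, Where.Post.User.Name == "Ayende",
        (posts, count) => { PropertyBag["posts"] = posts; PropertyBag["countOfPosts"] = count; })
    .Execute();

// C# 2.0, with an anonymous delegate:
new CriteriaQueryBatch()
    .FindPage(0, 10, Where.Post.User.Name == "Ayende",
        delegate(ICollection<Post> posts, int count)
        {
            PropertyBag["posts"] = posts;
            PropertyBag["countOfPosts"] = count;
        })
    .Execute();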
One drawback, however, is that you would need to have several FindPage overloads instead of the OnRead overloads to accept the different delegates.
Thanks,
Bobby
Bobby,
The idea is that I want to be able to skip the reading part, I am using this for lazy loading stuff as well.
Keeping it flexible with a few shorter signature methods seems like a better idea to me.
new CriteriaQueryBatch()
.Add( Where.Post.User.Name == "Ayende", OrderBy.Post.PublishedDate.Desc )
.Paged(0, 10)
.OnRead(delegate(ICollection<Post> posts, int count) { PropertyBag["posts"] = posts; PropertyBag["countOfPosts"] = count; })
.Execute();
Where the call to Paged(int, int) would know to create the count query under the hood. Like people have said above.
I also think there might be something interesting that could be done with changing the type of the CriteriaQueryBatch as you build it.
ICollection<Post> posts;
int postCount;
new CriteriaQueryBatch()
.Add( Where.Post.User.Name == "Ayende", OrderBy.Post.PublishedDate.Desc ) // returns a CriteriaQueryBatch<Post>
.Paged(0, 10) // returns a CriteriaQueryBatch<Post, int>
.ExecuteInto(out posts, out postCount);
This would kinda be a bit of a pain to maintain and I am not sure it is worth the slightly cleaner syntax.
I also think that there should be some way of explicitly storing the results in an IDictionary<string, object>.
something like :
new CriteriaQueryBatch()
// ... build it
.ExecuteInto(PropertyBag, "posts", "postCount");
Joe,
For this, I can just use a class like this:
CountPagedQuery<Post> query = new CountPagedQuery<Post>()
.Add(Where.Post.User == "Ayende")
.Order(OrderBy.Post.Date.Desc)
.Execute();
query.Results
query.Count
The syntax in the post is for different queries.
The StoreInto() syntax won't work but there's no reason you couldn't attach a storage slot to a criteria to form a query. I like this better anyways as I wouldn't really want to see 20 queries all being attached in a single statement. It would make documentation a real pain (where to put comments or intermediate variables?).
IQuery<Post> postQuery = new Query(Where.Post.User.Name == "Ayende", OrderBy.Post.PublishedDate.Desc)
.Paging(0, 10)
.OnRead(delegate(ICollection<Post> posts) { PropertyBag["posts"] = posts; });
IQuery<int> countQuery = new Query(Where.Post.User.Name == "Ayende")
.Count()
.OnRead(delegate(int count) { PropertyBag["countOfPosts"] = count; });
MultiQuery.Execute(postQuery, countQuery);
PropertyBag["posts"] = postQuery.Result;
PropertyBag["postsCount"] = countQuery.Result;
Argh... my example should have omitted the OnRead callback. Posted too soon. Sorry!
Jeff, that is a lot to write, no?
Why not use the delegates?
And obviously you can break it into multiple statements.
@Ayende
It'll depend a lot on what you intend to do with the results. I don't mind assigning each query to its own variable. It'll be the same number of statements as if I individually Add() them to a CriteriaQueryBatch. It'll be one less anonymous delegate if I just want to assign the results to a variable anyways.
In your case the delegate makes a lot of sense because you're just shoving the results into your Controller's PropertyBag without any further interpretation going on. It'll get messy if you need to take the results of two queries and combine them in some way.
I do like how the delegate allows you to decouple consumption of the query results from the place where the query is eventually executed. That can be quite useful... :-)
I'm curious whether LINQ syntax can be embedded nicely into a MultiCriteria-type query. Seems tricky... How would you gather all of the queries in a batch so that they are evaluated all at once instead of lazily, as each resulting IEnumerable<T> eventually gets traversed?
The broad idea here is to have a two-stage process: the first is building the query, the second is processing the results.
The examples I am showing are mostly about shoving things to the UI, but I am using this for business processing as well.
You hit the jackpot with regard to consumption vs. execution, but the idea here is to have a three-stage process: query building, execution, and consumption. Multiple parties can take part in building the query, which is what I am driving at here.
I'll do a separate post about MultiLinq.
You know, this reminds me of non-blocking I/O patterns involving the Unix select() function call. Or, at the extreme, some of the things that are done using continuation passing style to centralize an event loop or some other dispatcher-type mechanism.
Here we've got services that are returning query objects to be batch-evaluated. It presumes that the services might carry on and do other things with the results when they become available. Bit of a pain to work with, but it can cut down on latency.
An equally good approach would be to parallelize the queries. Perform processing in multiple worker threads where possible. Given relatively low contention in the thread pool and Db connection pool, the total latency will be reduced even more than is possible with a batch query.
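A minimal sketch of the parallel version, assuming an ISessionFactory is at hand; each worker gets its own ISession, since sessions are not thread safe:

IList<Post> posts = null;
int count = 0;
ManualResetEvent postsDone = new ManualResetEvent(false);
ManualResetEvent countDone = new ManualResetEvent(false);

// First worker: the paged query.
ThreadPool.QueueUserWorkItem(delegate
{
    using (ISession session = sessionFactory.OpenSession())
    {
        posts = session.CreateCriteria(typeof(Post))
            .CreateAlias("User", "u")
            .Add(Expression.Eq("u.Name", "Ayende"))
            .SetFirstResult(0).SetMaxResults(10)
            .List<Post>();
    }
    postsDone.Set();
});

// Second worker: the count query.
ThreadPool.QueueUserWorkItem(delegate
{
    using (ISession session = sessionFactory.OpenSession())
    {
        count = (int)session.CreateCriteria(typeof(Post))
            .CreateAlias("User", "u")
            .Add(Expression.Eq("u.Name", "Ayende"))
            .SetProjection(Projections.RowCount())
            .UniqueResult();
    }
    countDone.Set();
});

// Wait for both workers before using the results.
WaitHandle.WaitAll(new WaitHandle[] { postsDone, countDone });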
The goal here is to reduce latency per request on a web application, so yes, they would do other work when the results become available.
I wonder if this is something that can be cemented into an interface, something like:
public interface IServiceThatWantsData
{
}
public interface IPartialQuery
{
DetachedCriteria GetQuery();
void ProcessResults(IList results);
}
Hm, reading the syntax, that is fairly awkward, but something along these lines could work.
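A sketch of how a dispatcher could drive that interface, using MultiCriteria to run all of the partial queries in one round trip (only the GetQuery/ProcessResults names come from the interface above; the rest is illustrative):

public void ExecuteBatch(ISession session, IList<IPartialQuery> partialQueries)
{
    IMultiCriteria multiCriteria = session.CreateMultiCriteria();
    foreach (IPartialQuery partialQuery in partialQueries)
        multiCriteria.Add(partialQuery.GetQuery());

    // One result list per query, in the order the queries were added.
    IList allResults = multiCriteria.List();
    for (int i = 0; i < partialQueries.Count; i++)
        partialQueries[i].ProcessResults((IList)allResults[i]);
}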
An interesting approach there would be to execute the query in an async manner, that should give even more scalability.
Parallelizing the queries would mean more requests to the database, no? The latency may be reduced, but the system load per request would be increased.
Not quite good enough. What if your service wants to return some final result? A better approach would be to follow the async token pattern more closely.
public interface IAuthorizationManager
{
}
public interface IQueryContinuation
{
}
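To make the token shuffling concrete, usage might look something like this (every member name here is invented, since the interfaces above are only placeholders):

// The service hands back a continuation instead of a final answer; the caller
// batches the queries, executes them, and only then asks for the results.
IQueryContinuation authCheck = authorizationManager.IsAllowed(currentUser, "Post.Delete");
IQueryContinuation recentPosts = postService.RecentPostsFor(currentUser);

batch.Add(authCheck);
batch.Add(recentPosts);
batch.Execute();

bool allowed = (bool)authCheck.Result;
IList posts = (IList)recentPosts.Result;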
But this is rather clunky since the caller needs to manage all of these continuation tokens, etc... This is the same level of awkwardness seen when working with asynchronous methods a lot. Not good. Moreover, batching up queries for execution is just one of the many possible things you might want to batch up.
This approach is also violating the encapsulation of the service because the caller probably shouldn't need to know that there's a bunch of querying going on. There's a difference between a service that returns a prototypical DetachedCriteria for you to mess around with and one whose guts are hanging out.
I'm not sure how badly parallelizing the queries would impact the database. Clearly more connections would be active at a time and more requests would be under way concurrently. Quite likely each request would individually take longer to run so lock contention would become more of a problem. The whole batch is likely to finish sooner though... It'll certainly squeeze the database for bandwidth...
This is an interesting example that you give, because it demonstrates where batching queries breaks down.
What if we fail authorization? We shouldn't do any more queries for the invalid user.
Furthermore, what if we need the user id for further processing (to show the associated customers)? We can't really do it this way; we would have to perform two queries.
@Mats,
Yes, the customer.Orders would not be loaded, for the same reasons that you have mentioned in the post.
@ Ayende,
Thanks a million!
/Mats