NHibernate Futures
One of the nicest new features in NHibernate 2.1 is the pair of Future<T>() and FutureValue<T>() methods. They defer query execution until the results are actually needed, at which point NHibernate has more information about what the application is trying to do and can optimize accordingly. This builds on an existing NHibernate feature, Multi Queries, but does so in a way that is easy to use and almost seamless.
Let us take a look at the following piece of code:
using (var s = sf.OpenSession())
using (var tx = s.BeginTransaction())
{
    var blogs = s.CreateCriteria<Blog>()
        .SetMaxResults(30)
        .List<Blog>();
    var countOfBlogs = s.CreateCriteria<Blog>()
        .SetProjection(Projections.Count(Projections.Id()))
        .UniqueResult<int>();

    Console.WriteLine("Number of blogs: {0}", countOfBlogs);
    foreach (var blog in blogs)
    {
        Console.WriteLine(blog.Title);
    }

    tx.Commit();
}
This code would generate two queries to the database:
Two separate queries to the database are expensive: we can see that it took 114 ms to get the data. We can do better than that. Let us tell NHibernate that it is free to optimize in any way that it likes. The only changes are that List<Blog>() becomes Future<Blog>(), UniqueResult<int>() becomes FutureValue<int>(), and the count is read through countOfBlogs.Value:
using (var s = sf.OpenSession())
using (var tx = s.BeginTransaction())
{
    var blogs = s.CreateCriteria<Blog>()
        .SetMaxResults(30)
        .Future<Blog>();
    var countOfBlogs = s.CreateCriteria<Blog>()
        .SetProjection(Projections.Count(Projections.Id()))
        .FutureValue<int>();

    Console.WriteLine("Number of blogs: {0}", countOfBlogs.Value);
    foreach (var blog in blogs)
    {
        Console.WriteLine(blog.Title);
    }

    tx.Commit();
}
Now we see a different result:
Instead of going to the database twice, we go only once, sending both queries together. The speed difference is quite dramatic: 80 ms instead of 114 ms, so we saved 34 ms, about 30% of the total data access time.
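To make the idea of deferring and batching concrete, here is a toy model of it. This is only a sketch and not NHibernate's implementation: ToySession, ToyFuture<T> and the RoundTrips counter are hypothetical names invented for this illustration, and the "queries" are plain delegates rather than SQL.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: NOT NHibernate's implementation. All type names are hypothetical.
public sealed class ToySession
{
    private readonly List<Action> pending = new List<Action>();

    // Counts simulated database round trips.
    public int RoundTrips { get; private set; }

    // Registers a query without running it and returns a handle to its eventual result.
    public ToyFuture<T> Future<T>(Func<T> query)
    {
        var future = new ToyFuture<T>(this);
        pending.Add(() => future.SetResult(query()));
        return future;
    }

    // Runs a query immediately: one round trip per call.
    public T Execute<T>(Func<T> query)
    {
        RoundTrips++;
        return query();
    }

    // Runs every pending query in a single simulated round trip.
    internal void Flush()
    {
        if (pending.Count == 0) return;
        var batch = pending.ToArray();
        pending.Clear();
        RoundTrips++;
        foreach (var run in batch) run();
    }
}

public sealed class ToyFuture<T>
{
    private readonly ToySession session;
    private bool hasResult;
    private T result;

    internal ToyFuture(ToySession session) { this.session = session; }

    internal void SetResult(T value) { result = value; hasResult = true; }

    // The first access to any future flushes the whole batch.
    public T Value
    {
        get
        {
            if (!hasResult) session.Flush();
            return result;
        }
    }
}

public static class FutureBatchDemo
{
    public static void Main()
    {
        var eager = new ToySession();
        eager.Execute(() => new[] { "blog a", "blog b" });
        eager.Execute(() => 2);
        Console.WriteLine("Eager round trips: {0}", eager.RoundTrips);

        var deferred = new ToySession();
        var blogs = deferred.Future(() => new[] { "blog a", "blog b" });
        var count = deferred.Future(() => 2);
        Console.WriteLine("Round trips before first access: {0}", deferred.RoundTrips);
        Console.WriteLine("Count: {0}", count.Value);
        Console.WriteLine("Blogs: {0}", blogs.Value.Length);
        Console.WriteLine("Deferred round trips: {0}", deferred.RoundTrips);
    }
}
```

In this model the eager session makes two round trips, while the deferred session makes none until count.Value is read and then exactly one for both queries.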
To make things even more interesting, the benefit grows with the number of queries you use. Let us take the following scenario. We want to show the front page of a blogging site, which should have:
- A grid that allows us to page through the blogs
- Most recent posts
- All categories
- All tags
- Total number of comments
- Total number of posts
For now, we will ignore caching and just look at the queries that we need to handle. I think you can agree that this is not an unreasonable number of data items to show on a main page. For that matter, just look at this page, and you can probably see as many data items or more.
Here is the code using the Future options:
using (var s = sf.OpenSession())
using (var tx = s.BeginTransaction())
{
    var blogs = s.CreateCriteria<Blog>()
        .SetMaxResults(30)
        .Future<Blog>();
    var posts = s.CreateCriteria<Post>()
        .AddOrder(Order.Desc("PostedAt"))
        .SetMaxResults(10)
        .Future<Post>();
    var tags = s.CreateCriteria<Tag>()
        .AddOrder(Order.Asc("Name"))
        .Future<Tag>();
    var countOfPosts = s.CreateCriteria<Post>()
        .SetProjection(Projections.Count(Projections.Id()))
        .FutureValue<int>();
    var countOfBlogs = s.CreateCriteria<Blog>()
        .SetProjection(Projections.Count(Projections.Id()))
        .FutureValue<int>();
    var countOfComments = s.CreateCriteria<Comment>()
        .SetProjection(Projections.Count(Projections.Id()))
        .FutureValue<int>();

    Console.WriteLine("Number of blogs: {0}", countOfBlogs.Value);
    Console.WriteLine("Listing of blogs");
    foreach (var blog in blogs)
    {
        Console.WriteLine(blog.Title);
    }

    Console.WriteLine("Number of posts: {0}", countOfPosts.Value);
    Console.WriteLine("Number of comments: {0}", countOfComments.Value);

    Console.WriteLine("Recent posts");
    foreach (var post in posts)
    {
        Console.WriteLine(post.Title);
    }

    Console.WriteLine("All tags");
    foreach (var tag in tags)
    {
        Console.WriteLine(tag.Name);
    }

    tx.Commit();
}
This generates the following:
And the actual SQL that is sent to the database is:
SELECT top 30 this_.Id as Id5_0_,
       this_.Title as Title5_0_,
       this_.Subtitle as Subtitle5_0_,
       this_.AllowsComments as AllowsCo4_5_0_,
       this_.CreatedAt as CreatedAt5_0_
FROM Blogs this_

SELECT top 10 this_.Id as Id7_0_,
       this_.Title as Title7_0_,
       this_.Text as Text7_0_,
       this_.PostedAt as PostedAt7_0_,
       this_.BlogId as BlogId7_0_,
       this_.UserId as UserId7_0_
FROM Posts this_
ORDER BY this_.PostedAt desc

SELECT this_.Id as Id4_0_,
       this_.Name as Name4_0_,
       this_.ItemId as ItemId4_0_,
       this_.ItemType as ItemType4_0_
FROM Tags this_
ORDER BY this_.Name asc

SELECT count(this_.Id) as y0_
FROM Posts this_

SELECT count(this_.Id) as y0_
FROM Blogs this_

SELECT count(this_.Id) as y0_
FROM Comments this_
That is great, but what would happen if we used List and UniqueResult instead of Future and FutureValue?
I’ll not show the code, since I think it is pretty obvious what it will look like, but this is the result:
Now it takes 348 ms to execute, versus 259 ms using the Future pattern.
The relative improvement is still in the 25% to 30% range, but note the difference in absolute time. Before, we saved 34 ms. Now, we saved 89 ms.
Those are pretty significant numbers, and they were measured against a very small database that I am running locally. Against a database on another machine, the results would have been even more dramatic.
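The arithmetic behind these figures, and the reason a remote database should widen the gap, can be sketched with a back-of-the-envelope model. The measured timings (114/80 ms and 348/259 ms) come from the runs above. The cost model itself, and the 1 ms, 20 ms and 5 ms figures fed into it, are assumptions made up for illustration: each query is taken to cost one network round trip plus a fixed execution time, and batching is taken to share a single round trip.

```csharp
using System;

// Back-of-the-envelope model. The latency figures used in Main are hypothetical.
public static class RoundTripModel
{
    // Fraction of time saved when going from `before` to `after` milliseconds.
    public static double SavedFraction(double before, double after)
    {
        return (before - after) / before;
    }

    // Each query pays its own network round trip.
    public static double SequentialMs(int queries, double roundTripMs, double executeMs)
    {
        return queries * (roundTripMs + executeMs);
    }

    // All queries share one round trip; execution cost is unchanged.
    public static double BatchedMs(int queries, double roundTripMs, double executeMs)
    {
        return roundTripMs + queries * executeMs;
    }

    public static void Main()
    {
        // Measured figures quoted in the text.
        Console.WriteLine("Two queries: {0:P1} saved", SavedFraction(114, 80));
        Console.WriteLine("Six queries: {0:P1} saved", SavedFraction(348, 259));

        // Hypothetical round-trip times: a local database vs. one on another machine.
        foreach (var rtt in new[] { 1.0, 20.0 })
        {
            double sequential = SequentialMs(6, rtt, 5.0);
            double batched = BatchedMs(6, rtt, 5.0);
            Console.WriteLine("Round trip {0} ms: {1} ms sequential, {2} ms batched",
                rtt, sequential, batched);
        }
    }
}
```

Under these made-up inputs, six queries cost 36 ms sequentially versus 31 ms batched at a 1 ms round trip, and 150 ms versus 50 ms at a 20 ms round trip, which is the effect described above: the further away the database, the more each avoided round trip is worth.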
Comments
So other than the change of interface, is there any difference in using this than what we've been doing up to now with MultiCriteria?
Jason,
No, not really.
But Future will automatically fall back to standard queries if the feature is not supported on the database you are using.
Excellent Ayende!
I will be trying this out at lunch!
Ninja
What databases support it?
Currently?
MS Sql Server
MySQL
PostgreSQL
SQLite
Microsoft SQL CE
That's cool. I guess the Future call essentially registers itself in a set associated with that session, and when you first try to get the value of any of the futures, it triggers NHibernate to multi-query the lot?
I always wondered if this was possible more passively: if you had a declarative binding to the data (like XAML, for example), you could infer the same deferred execution. My interest was purely to see if you could, though; I think there are some inherent flaws with the concept.
Stephen,
Yes, that is how it basically works.
About declarative: you can do it for the very simple stuff, but it doesn't really work in the real world.
Ayende, was it implemented using C# expressions?
Is there some design problem that prevents simply always using MultiCriteria, and having it internally use the "future" implementation if the database in use allows it?
Junior,
Not following, but probably the answer is no.
Kork,
Future is simpler, much simpler.
What I mean is that, for me, it would be easier and cleaner from an API point of view if client code always used MultiCriteria. Then, if there is later a change from or to a database that supports the "future" feature, there is no need to change the code, because the switch between future and MultiCriteria is made internally by NHibernate.
My question was whether that was not done because it was too hard or impossible to introduce into the current NHibernate code, or only because you and the other NH developers prefer this API (a.k.a. it's only a matter of NH developers' taste :) ).
MultiCriteria / MultiQuery will throw if the database does not support the feature.
Future is a way to utilize this feature if it is needed.
I was going to ask the same as korkl. Is there any reason why nhibernate doesn't do it transparently (when supported)? Or, put in other words, is there any scenario you would not want to use this feature?
A clear explanation of a great feature. Thanks Ayende.
Alberto & Kork,
NHibernate tries hard not to make too much magic.
Is it even possible to do this transparently? For example, returning an int wouldn't be possible to defer. The Future instance is important, as it lets you declare 'interest' in a value before actually wanting it.
Ayende, is it possible to use a future in another query? like I could get a future of int, and use that as a value in another query?
(I'm guessing not).
Stephen,
NHibernate also supplies a FutureValue() method, which you can use to get an int sometime in the future.
And no, you cannot use a future inside another query (but you can use a detached criteria inside another criteria).
versus setting a default batch size?
Cali,
Batch size is for updates; futures or multi query is for reads.
Very nice feature! I see lots of scenarios in a code base where I didn't make use of multi criteria (as I was more experimenting with NHibernate back then), and this will make it very easy to refactor that code for optimization.