Ayende @ Rahien

Refunds available at head office

Vienna Beers night

A bunch of us will have beers in Vienna on Tuesday (May 5th) at about 18:30 and would welcome you to join us!

I also don't know if telavivbeach has opened, but I know that "Hermes Strandbar" has opened. If the weather is fine you can find us there:

http://www.strandbarherrmann.at/ or, if it's cold/rainy, at Bar Vulcania located here: http://tinyurl.com/dcq4t7

Anybody is welcome to join us.

Thanks to Christoph for setting this up.

NHibernate – The difference between Get, Load and querying by id

One of the more common mistakes that I see people make with NHibernate relates to how they load entities by the primary key, because there are important differences between the three options.

The most common mistake that I see is using a query to load by id, in particular when using Linq for NHibernate.

var customer = (
	from customer in s.Linq<Customer>()
	where customer.Id == customerId
	select customer
	).FirstOrDefault();

Every time that I see something like that, I wince a little inside. The reason for that is quite simple. This is doing a query by primary key. The key word here is a query.

This means that we have to hit the database in order to get a result for this query. Unless you are using the query cache (which by default you won't), this forces a query against the database, bypassing both the first level identity map and the second level cache.

Get and Load exist for a reason: they provide a way to get an entity by primary key. That is important for several reasons; most importantly, it means that NHibernate can apply quite a few optimizations to this process.

But there is another side to that: there is a significant (and subtle) difference between Get and Load.

Load will never return null. It will always return an entity or throw an exception. Because that is the contract that we have with it, it is permissible for Load not to hit the database when you call it; it is free to return a proxy instead.

Why is this useful? Well, if you know that the value exists in the database, and you don't want to pay for an extra select, but you need the reference so you can set it on another object, you can use Load to do so:

s.Save(
	new Order
	{
		Amount = amount,
		Customer = s.Load<Customer>(1)
	}
);

The code above will not result in a select against the database, but when we commit the transaction, we will set the CustomerID column to 1. This is how NHibernate maintains the OO facade while giving you the same optimization benefits as working directly with the low level API.

Get, however, is different. Get will return null if the object does not exist. Since its contract says that it must return either the entity or null, it cannot give you a proxy if the entity is not known to exist. Get will usually result in a select against the database, but it will check the session cache and the 2nd level cache first.
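
To make the contrast concrete, here is a minimal sketch (customerId and the error handling are just for illustration):

// Get: checks the session cache and the 2nd level cache, hits the
// database if needed, and returns null when the row does not exist.
var customer = session.Get<Customer>(customerId);
if (customer == null)
	throw new InvalidOperationException("No customer with id " + customerId);

// Load: returns immediately, usually with an uninitialized proxy, and
// no select is issued. If the row does not exist, NHibernate will throw
// later, on the first access to the proxy.
var customerReference = session.Load<Customer>(customerId);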

So, next time that you need to get some entity by its primary key, just remember the differences…

NHibernate IPreUpdateEventListener & IPreInsertEventListener

NHibernate's listener architecture brings a lot of power to the game, but using some of the listeners properly may require some additional knowledge. In this post, I want to talk specifically about the pre update hooks that NHibernate provides.

Those allow us to execute our custom logic before the update / insert is sent to the database. On the face of it, it seems like a trivial task, but there are some subtleties that we need to consider when we use them.

Those hooks run very late in the processing pipeline; that is part of what makes them so useful, but because they run so late, we have to be aware of what we are doing with them and how it impacts the rest of the application.

Those two interfaces define only one method each:

bool OnPreUpdate(PreUpdateEvent @event) and bool OnPreInsert(PreInsertEvent @event), respectively.

Each of those accept an event parameter, which looks like this:

image

Notice that we have two representations of the entity in the event parameter. One is the entity instance, located in the Entity property; the second is the dehydrated entity state, which is located in the State property.

In NHibernate, when we talk about the state of an entity we usually mean the values that we loaded from or will save to the database, not the entity instance itself. Indeed, the State property is an array that contains the parameters that we will push into the ADO.Net command that will be executed as soon as the event listener finishes running.
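
For reference, these are roughly the relevant members of the event argument (a sketch from memory, not the full class):

// Roughly the members of PreUpdateEvent (PreInsertEvent is similar,
// minus OldState):
//
//   object           Entity    - the entity instance
//   object[]         State     - the dehydrated values about to be written
//   object[]         OldState  - the values as originally loaded
//   object           Id        - the primary key value
//   IEntityPersister Persister - metadata about the mapped entity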

Updating the state array is a little bit annoying, since we have to go through the persister to find the appropriate index in the state array, but that is easy enough.

Here comes the subtlety, however. We cannot just update the entity state. The reason is quite simple: the state was extracted from the entity and placed in the state array, so any change that we make to the state array would not be reflected in the entity itself. That may cause the database row and the entity instance to go out of sync, and may cause a whole bunch of really nasty problems that you wouldn't know where to begin debugging.

You have to update both the entity and the entity state in these two event listeners (this is not necessarily the case in other listeners, by the way). Here is a simple example of using these event listeners:

public class AuditEventListener : IPreUpdateEventListener, IPreInsertEventListener
{
	public bool OnPreUpdate(PreUpdateEvent @event)
	{
		var audit = @event.Entity as IHaveAuditInformation;
		if (audit == null)
			return false;

		var time = DateTime.Now;
		var name = WindowsIdentity.GetCurrent().Name;

		Set(@event.Persister, @event.State, "UpdatedAt", time);
		Set(@event.Persister, @event.State, "UpdatedBy", name);

		audit.UpdatedAt = time;
		audit.UpdatedBy = name;

		return false;
	}

	public bool OnPreInsert(PreInsertEvent @event)
	{
		var audit = @event.Entity as IHaveAuditInformation;
		if (audit == null)
			return false;


		var time = DateTime.Now;
		var name = WindowsIdentity.GetCurrent().Name;

		Set(@event.Persister, @event.State, "CreatedAt", time);
		Set(@event.Persister, @event.State, "UpdatedAt", time);
		Set(@event.Persister, @event.State, "CreatedBy", name);
		Set(@event.Persister, @event.State, "UpdatedBy", name);

		audit.CreatedAt = time;
		audit.CreatedBy = name;
		audit.UpdatedAt = time;
		audit.UpdatedBy = name;

		return false;
	}

	private void Set(IEntityPersister persister, object[] state, string propertyName, object value)
	{
		// Find the property's position in the dehydrated state array
		var index = Array.IndexOf(persister.PropertyNames, propertyName);
		if (index == -1)
			return;
		state[index] = value;
	}
}
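
One thing that is easy to miss is that the listener also has to be registered; a minimal sketch, assuming configuration is your NHibernate.Cfg.Configuration instance:

var listener = new AuditEventListener();
configuration.EventListeners.PreUpdateEventListeners =
	new IPreUpdateEventListener[] { listener };
configuration.EventListeners.PreInsertEventListeners =
	new IPreInsertEventListener[] { listener };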

And the result is pretty neat, I must say.

Let us burn all those pesky Util & Common libraries

image

This is a post that is railing against things like Rhino Commons, MyCompany.Util, YourCompany.Shared.

The reason for that, and the reason that I am no longer making direct use of Rhino Commons in my new projects, is quite simple.

Cohesion:

In computer programming, cohesion is a measure of how strongly-related and focused the various responsibilities of a software module are. Cohesion is an ordinal type of measurement and is usually expressed as "high cohesion" or "low cohesion" when being discussed. Modules with high cohesion tend to be preferable because high cohesion is associated with several desirable traits of software including robustness, reliability, reusability, and understandability whereas low cohesion is associated with undesirable traits such as being difficult to maintain, difficult to test, difficult to reuse, and even difficult to understand.

I am going to rip into Rhino Commons for a while, let us look at how many things it can do:

  1. Create SQL CE databases dynamically
  2. Keep track of the performance of ASP.Net requests
  3. Log to a database in an async manner using bulk insert
  4. Log to a collection of strings
  5. Log to an embedded database – with strict size limits
    1. Same thing for SQLite
    2. Same thing for SqlCE
  6. Keep track of desirable properties of the http request for the log
  7. Add more configuration options to Windsor
  8. Provide a static gateway to the Windsor Container
    1. Plus some util methods
  9. Allow to execute Boo code as part of an msbuild script
  10. Allow the execution of a set of SQL scripts as part of an msbuild script
  11. Provide cancelable thread pool
  12. Provide an implementation of a thread safe queue
  13. Provide an implementation of a countdown latch (threading primitive)
  14. Expose a SqlCommandSet that is internal in the BCL so we can use it
  15. Allow to easily record log messages executed in a given piece of code
  16. Allow to get the time a piece of code took to execute with a high degree of accuracy
  17. Allow to bulk delete a lot of rows from the database efficiently
  18. Provide an easy way to read XML based on an XPath
  19. Provide a way to update XML based on an XPath
  20. Provide a configuration DSL for Windsor
  21. Provide local data store that works both in standard code and in ASP.Net scenarios
  22. Provide collection util methods
  23. Provide date time util methods
  24. Provide disposable actions semantics
  25. Provide generic event args class
  26. Allow 32 bit process to access 64 bit registry
  27. Give nice syntax for indexed properties
  28. Give a nice syntax for static reflection
  29. Provide guard methods for validating arguments
  30. Provide guard methods for validating arguments – No, that is not a mistake, there are actually two different and similar implementations of that there
  31. Provide a very simple data access layer based on IDbConnection.
  32. Provide a way to query NHibernate for a many to one using its id, with a nicer syntax
  33. Provide a named expression query for NHibernate (I am not sure what we are doing that for)
  34. Provide unit of work semantics for NHibernate
  35. Provide transaction semantics for auto transaction management using the previously mentioned unit of work
  36. Provide a way to map an interface to an entity and tie it to the Repository implementation
  37. Provide a fairly complex in memory test base class for testing Container and database code
  38. Provide a way to warn you when SELECT N+1 occurs in your code, via an http module
  39. Provide nicer semantics for using MultiCriteria with NHibernate
  40. Provide Future queries support externally to NHibernate
  41. Provide an IsA expression for NHibernate
  42. Provide a way to execute an In statement using XML selection (efficient for _large_ number of queries)
  43. Provide a pretty comprehensive generic Repository implementation, including a static gateway
  44. Provide an easy way to correctly implement caching
  45. Provide a way to easily implement auto transaction management without proxies
  46. Do much the same for Active Record as well as NHibernate

Well, I think that you get the drift by now. Rhino Commons has been the garbage bin for anything that I came up with for a long time.

It is easy to get to that point by just not paying attention. In fact, we are pretty much indoctrinated into doing just that, with "reuse, reuse, reuse" banged into our heads so often.

The problem with that?

Well, most of this code is only applicable for just one problem, in one context, in one project. Bulk Deleter is a good example, I needed it for one project, 3 years ago, and never since. The repository & unit of work stuff has been used across many projects, but what the hell do they have to do with a configuration dsl? Or with static reflection?

As a matter of fact, the very fact that Rhino Commons has two (different) ways to do parameter validation is a pretty good indication of a problem. The mere fact that we tend to have things like Util, Shared or Common is an indication that we are basically throwing unrelated concerns together. It actually gets worse if we have something in the common project that is used by multiple projects. A good example of that would be the in memory database tests that Rhino Commons provides. I have used it in several projects, but you know what? It is freaking complex.

The post about rolling your own in memory test base class with NHibernate shows you how simple it can be. The problem is that as time went by, we wanted more & more functionality out of the Rhino Commons implementation: container integration, support for multiple databases, etc. And as we did, we piled more complexity on top of it. To the point where it is easier to roll your own than to use the ready made one. Too many requirements for one piece of code == complexity. And complexity is usually a bad thing.

The flip side is that we are going to end up with a lot of much smaller projects, each focused on doing just one thing. For example, a project for extending NHibernate's querying capabilities, or a project to extend log4net.

Hm, low coupling & high cohesion, I heard that somewhere before…

Post scheduling

This is a general announcement about a change in the way that I am posting to this blog.

One of the more frequent feedback items about the blog was that people find it hard to catch up with my rate of posting. This is especially true since I tend to spend some days posting a large number of posts, and I feel that the sheer quantity reduces the amount of time people dedicate to each post.

I have started making use of future posting to a high degree (almost all of the NHibernate mapping posts were written in a day or two, for example, but spaced out over about a month). I don't really try to keep any sort of organization, except that I am going to try to keep the maximum number of posts per day to no more than two. Each new post just goes to the back of the queue, and will be published when its turn comes.

Currently I have scheduled posts all the way to mid May, but I expect that to stretch even further. This is good news in the sense that you are almost always going to get at least one post per day from me, but it does mean that sometimes posts that are written together are stretched over a period of time. Or I may refer (usually in comments) to posts that will only be published in the future.

There is no real meaning behind the timing of the posts, unless there is something special happening on that date, so you may put the conspiracy theories to rest :-).

NH Prof feedback

Every now and then I run a search to see what people are saying about NH Prof. And I decided that this time it might be a good idea to share the results found on Twitter. I am still looking for more testimonials, by the way.

image

You are welcome :-)

image

I am not sure if that is that much of a good thing, though…

image

I certainly agree :-)

image

Should I worry about it or use it as a marketing channel? It reminds me of something from a Batman movie.

image

Go for it :-D

image

Thanks.

image

Yeah!

image

Wait until you see some of the bug reports that I am seeing…

image

That is the point.

image

Hm, I am not sure that I am happy to be the center of a life changing event, but as long as it is positive…

image

That was a damn hard feature to write, too.

image

What can I say, I agree.

And assuming that you got to this point in the post: I have been doing a lot of work on NH Prof recently, getting it ready for v1.0. As a reminder, when I release v1.0, the reduced beta pricing is going away…

NHibernate Unit Testing

When using NHibernate we generally want to test only three things: that properties are persisted, that cascades work as expected, and that queries return the correct result. In order to do all of those, we generally have to talk to a real database; trying to fake any of those at this level is futile and going to be very complicated.

We can either use a standard RDBMS or use an in memory database such as SQLite in order to get very speedy tests.

I have a pretty big implementation of a base class for unit testing NHibernate in Rhino Commons, but that has so many features that I forget how to use it sometimes. Most of those features, by the way, are now null & void because we have NH Prof, and can easily see what is going on without resorting to the SQL Profiler.

At any rate, here is a very simple implementation of that base class, which gives us the ability to execute NHibernate tests in memory.

public class InMemoryDatabaseTest : IDisposable
{
	private static Configuration Configuration;
	private static ISessionFactory SessionFactory;
	protected ISession session;

	public InMemoryDatabaseTest(Assembly assemblyContainingMapping)
	{
		if (Configuration == null)
		{
			Configuration = new Configuration()
				.SetProperty(Environment.ReleaseConnections,"on_close")
				.SetProperty(Environment.Dialect, typeof (SQLiteDialect).AssemblyQualifiedName)
				.SetProperty(Environment.ConnectionDriver, typeof(SQLite20Driver).AssemblyQualifiedName)
				.SetProperty(Environment.ConnectionString, "data source=:memory:")
				.SetProperty(Environment.ProxyFactoryFactoryClass, typeof (ProxyFactoryFactory).AssemblyQualifiedName)
				.AddAssembly(assemblyContainingMapping);

			SessionFactory = Configuration.BuildSessionFactory();
		}

		session = SessionFactory.OpenSession();

		// Recreate the schema on this session's connection; the in-memory
		// database only exists for the lifetime of that connection
		new SchemaExport(Configuration).Execute(true, true, false, true, session.Connection, Console.Out);
	}

	public void Dispose()
	{
		session.Dispose();
	}
}

This just sets up the in memory database and the mappings, and creates a session which we can now use. Here is how we use this base class:

public class BlogTestFixture : InMemoryDatabaseTest
{
	public BlogTestFixture() : base(typeof(Blog).Assembly)
	{
	}

	[Fact]
	public void CanSaveAndLoadBlog()
	{
		object id;

		using (var tx = session.BeginTransaction())
		{
			id = session.Save(new Blog
			{
				AllowsComments = true,
				CreatedAt = new DateTime(2000,1,1),
				Subtitle = "Hello",
				Title = "World",
			});

			tx.Commit();
		}

		session.Clear();


		using (var tx = session.BeginTransaction())
		{
			var blog = session.Get<Blog>(id);

			Assert.Equal(new DateTime(2000, 1, 1), blog.CreatedAt);
			Assert.Equal("Hello", blog.Subtitle);
			Assert.Equal("World", blog.Title);
			Assert.True(blog.AllowsComments);

			tx.Commit();
		}
	}
}

Pretty simple, eh?

Relying on hash code implementation is BAD – part II

To be truthful, I never thought that I would write a follow-up to this post 4 years later, but I ran into just such a case today.

The following is a part of an integration test for NH Prof:

Assert.AreEqual(47, alerts[new StatementAlert(new NHProfDispatcher())
{
	Title = "SELECT N+1"
}]);

I am reviewing all our tests now, and I nearly choked on that one. I mean, who was stupid enough to write code like this? Yes, I can understand what it is doing, sort of, but only because I have a dawning sense of horror when looking at it.

I immediately decided that the miscreant that wrote that piece of code should be publicly humiliated and chewed on by a large dog.

SVN Blame is a wonderful thing, isn’t it?

image

Hm… there is a problem here.

Actually, there are a couple of problems here. One is that we have a pretty clear indication that we have a historic artifact here. Just look at the number of versions that are shown in just this small blame window. That is good enough reason to start doing full fledged ancestry inspection. The test started life as:

[TestFixture]
public class AggregatedAlerts:IntegrationTestBase
{
	[Test]
	public void Can_get_aggregated_alerts_from_model()
	{
		ExecuteScenarioInDifferentAppDomain<Scenarios.ExecutingTooManyQueries>();

		var alerts = observer.Model.Sessions[1].AggregatedAlerts;
		Assert.AreEqual(47, alerts["SELECT N+1"]);
		Assert.AreEqual(21, alerts["Too many database calls per session"]);
	}
}

Which I think is reasonable enough. Unfortunately, it looks like somewhere along the way, someone took the big hammer approach to this. The code now looks like this:

Assert.AreEqual(47, alerts.First(x => x.Key.Title == "SELECT N+1").Value);

Now this is readable.

Oh, for the nitpickers: using hash code evaluation as the basis of any sort of logic is wrong. That is the point of this post. It is a non obvious side effect that will byte* you in the ass.

* intentional misspelling
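
For the curious, here is a minimal sketch of the failure mode, assuming a simplified StatementAlert whose Equals and GetHashCode are derived from Title:

var alerts = new Dictionary<StatementAlert, int>();
var alert = new StatementAlert { Title = "SELECT N+1" };
alerts[alert] = 47;

// The entry was filed under the hash of the original title; once the
// key mutates, the dictionary can no longer find it.
alert.Title = "Too many database calls per session";
Console.WriteLine(alerts.ContainsKey(alert)); // False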

It is either a traveling schedule or a train wreck…

So, after a delicious month at home, mostly spent resting, I am back to traveling again. This time my traveling schedule is so complex that I need to write it down.

Next week (04 May – 08 May), I am going to be in Vienna, Austria, doing some work for a client. If someone wants to organize a beer night, or something similar, I am open to that :-)

The week after that (11 May – 13 May), I am going to take part in the Progressive.NET tutorials in London, UK. There are quite a few good speakers there, doing 12 half day workshops. I am going to be talking about NHibernate, doing an Intro to NHibernate and an Advanced NHibernate workshop. This is basically trying to squeeze my 3 day NHibernate course into one half day workshop (Intro) and then jumping directly into the advanced NHibernate material (which goes beyond the stuff that I cover in both the workshop and the course).

The week after that (18 May – 20 May), I am giving the first NHibernate course (London, UK). This is a three day course that is going to take you from no NHibernate knowledge to a pretty good knowledge of NHibernate, including understanding all the knobs and how NHibernate thinks. This course is already full, but have no fear, because the week after that… I am giving the same course again (again in London, UK).

This time the dates are 26 May – 28 May, and there are still bookings available.

You might have noticed that this means that I am going to spend three weeks in London. There is already a beer night planned, thanks to Sebastian. A side effect of this schedule is that I am available in the holes between the scheduled workshops and courses. This means that if you are in London and would like me to have a short engagement on 14 May – 15 May or 21 May – 25 May, please drop me a line.

Next, 29 May – 7 June I am spending in New Jersey, working. I might catch the ALT.Net meeting there again; I seem to make a habit of that.

Following that, 8 June – 12 June, it is time for DevTeach, my favorite conference on that side of the Atlantic.

image

I think that early bird registration is still open, so hurry up. I am going to talk about:

  • Advanced IoC
  • OR/M += 2 – my advanced NHibernate workshop, squeezed into a single session
  • Producing production quality software
  • Writing Domain Specific Languages in Boo

And as usual for DevTeach, I am going to need a lot of coffee just to keep up with the content in the sessions and the fun happening in the hallways and in the bars. And as a pleasant bonus, ALT.Net Canada is going to follow up directly after DevTeach, so on 13 June – 14 June I am going to be there.

Wait, it is not over yet!

17 June – 19 June, Norwegian Developers Conference! Just from my interactions with the conference organizer I can feel that it is going to be a good one. I mean, just look at the speakers lineup.

image

For that matter, you might want to do better than that and check the unofficial NDC 2009 video:

image

If a conference has that much energy before it is even started, you can pretty much conclude that it is going to be good.

Here I am talking about building Multi Tenant Applications, Producing Production Quality Software and Writing Domain Specific Languages in Boo.

And as long as I am in Norway, why not give another NHibernate course (Oslo, Norway)? This one is on 22 June – 24 June.

After all that, assuming that I survive it, I am heading home and giving myself a course in hibernating, because I am pretty sure that is what I will have to do to get my strength back.

NHibernate Futures

One of the nicest new features in NHibernate 2.1 is the Future<T>() and FutureValue<T>() functions. They essentially function as a way to defer query execution to a later point, at which time NHibernate will have more information about what the application is supposed to do, and can optimize for it accordingly. This builds on an existing feature of NHibernate, Multi Queries, but does so in a way that is easy to use and almost seamless.

Let us take a look at the following piece of code:

using (var s = sf.OpenSession())
using (var tx = s.BeginTransaction())
{
	var blogs = s.CreateCriteria<Blog>()
		.SetMaxResults(30)
		.List<Blog>();
	var countOfBlogs = s.CreateCriteria<Blog>()
		.SetProjection(Projections.Count(Projections.Id()))
		.UniqueResult<int>();

	Console.WriteLine("Number of blogs: {0}", countOfBlogs);
	foreach (var blog in blogs)
	{
		Console.WriteLine(blog.Title);
	}

	tx.Commit();
}

This code would generate two queries to the database:

image

image

image

Two queries to the database are expensive; we can see that it took us 114ms to get the data from the database. We can do better than that. Let us tell NHibernate that it is free to do the optimization in any way that it likes (the changes are the calls to Future and FutureValue):

using (var s = sf.OpenSession())
using (var tx = s.BeginTransaction())
{
	var blogs = s.CreateCriteria<Blog>()
		.SetMaxResults(30)
		.Future<Blog>();
	var countOfBlogs = s.CreateCriteria<Blog>()
		.SetProjection(Projections.Count(Projections.Id()))
		.FutureValue<int>();

	Console.WriteLine("Number of blogs: {0}", countOfBlogs.Value);
	foreach (var blog in blogs)
	{
		Console.WriteLine(blog.Title);
	}

	tx.Commit();
}

Now, we see a different result:

image

image

Instead of going to the database twice, we only go once, with both queries at once. The speed difference is quite dramatic, 80 ms instead of 114 ms, so we saved about 30% of the total data access time and a total of 34 ms.

To make things even more interesting, it gets better the more queries that you use. Let us take the following scenario. We want to show the front page of a blogging site, which should have:

  • A grid that allows us to page through the blogs.
  • Most recent posts.
  • All categories
  • All tags
  • Total number of comments
  • Total number of posts

For right now, we will ignore caching, and just look at the queries that we need to handle. I think that you can agree that this is not an unreasonable amount of data items to want to show on the main page. For that matter, just look at this page, and you can probably see as many data items or more.

Here is the code using the Future options:

using (var s = sf.OpenSession())
using (var tx = s.BeginTransaction())
{
	var blogs = s.CreateCriteria<Blog>()
		.SetMaxResults(30)
		.Future<Blog>();

	var posts = s.CreateCriteria<Post>()
		.AddOrder(Order.Desc("PostedAt"))
		.SetMaxResults(10)
		.Future<Post>();

	var tags = s.CreateCriteria<Tag>()
		.AddOrder(Order.Asc("Name"))
		.Future<Tag>();

	var countOfPosts = s.CreateCriteria<Post>()
		.SetProjection(Projections.Count(Projections.Id()))
		.FutureValue<int>();

	var countOfBlogs = s.CreateCriteria<Blog>()
		.SetProjection(Projections.Count(Projections.Id()))
		.FutureValue<int>();

	var countOfComments = s.CreateCriteria<Comment>()
		.SetProjection(Projections.Count(Projections.Id()))
		.FutureValue<int>();

	Console.WriteLine("Number of blogs: {0}", countOfBlogs.Value);

	Console.WriteLine("Listing of blogs");
	foreach (var blog in blogs)
	{
		Console.WriteLine(blog.Title);
	}

	Console.WriteLine("Number of posts: {0}", countOfPosts.Value);
	Console.WriteLine("Number of comments: {0}", countOfComments.Value);
	Console.WriteLine("Recent posts");
	foreach (var post in posts)
	{
		Console.WriteLine(post.Title);
	}

	Console.WriteLine("All tags");
	foreach (var tag in tags)
	{
		Console.WriteLine(tag.Name);
	}

	tx.Commit();
}

This generates the following:

image

And the actual SQL that is sent to the database is:

SELECT top 30 this_.Id             as Id5_0_,
              this_.Title          as Title5_0_,
              this_.Subtitle       as Subtitle5_0_,
              this_.AllowsComments as AllowsCo4_5_0_,
              this_.CreatedAt      as CreatedAt5_0_
FROM   Blogs this_
SELECT   top 10 this_.Id       as Id7_0_,
                this_.Title    as Title7_0_,
                this_.Text     as Text7_0_,
                this_.PostedAt as PostedAt7_0_,
                this_.BlogId   as BlogId7_0_,
                this_.UserId   as UserId7_0_
FROM     Posts this_
ORDER BY this_.PostedAt desc
SELECT   this_.Id       as Id4_0_,
         this_.Name     as Name4_0_,
         this_.ItemId   as ItemId4_0_,
         this_.ItemType as ItemType4_0_
FROM     Tags this_
ORDER BY this_.Name asc
SELECT count(this_.Id) as y0_
FROM   Posts this_
SELECT count(this_.Id) as y0_
FROM   Blogs this_
SELECT count(this_.Id) as y0_
FROM   Comments this_

That is great, but what would happen if we used List and UniqueResult instead of Future and FutureValue?

I'll not show the code, since I think it is pretty obvious how it would look, but this is the result:

image

Now it takes 348ms to execute vs. 259ms using the Future pattern.

It is still a 25% – 30% speed increase, but take note of the difference in time. Before, we saved 34 ms. Now, we saved 89 ms.

Those are pretty significant numbers, and that is against a very small database that I am running locally; against a database on another machine, the results would have been even more dramatic.

The Repository’s Daughter

Keeping up with the undead theme, this post is a response to Greg’s. I’ll just jump into the parts that I disagree with:

The boundary is not arbitrary or artificial. The boundary comes back to the reasons we were actually creating a domain model in the first place. it seems what Oren is actually arguing against is not whether “advances in ORMs” have changed things but that he questions the isolation at all. The whole point of the separation is to remove such details from our thinking when we deal with the domain and to make explicit the boundaries around the domain and the contracts of those boundaries.

As I understand Greg's interpretation of my points, I agree. For quite a few needs, there is no need to create an explicit boundary between the persistence medium and the code. Transparent lazy loading and persistence by reachability allow us to hand the entire problem to the infrastructure. The two things that we do have to worry about are controlling the fetch paths and making sure that we aren't doing stupid things like calling the database in a loop.

Those things are the responsibilities of the controllers layer (not necessarily an MVC controller, by the way, I am talking about the highest level in the app that isn’t actually about presentation concerns).

If we take Oren’s advice, we can store our data anywhere … so long as it looks and acts like a database. If that is not the case then oops we have to either

  • Make it look and act like a full database
  • Scrap our code that treated it as such and go back to the explicit route.

Just to be clear on this point … He has baked in fetch paths, full criteria, and ordering into his Query objects so any new implementation would also have to support all of those things. Tell me how do you do this when you are getting your data now from an explicit service?

Well, duh! Of course they would need that. We need to be able to do that to be able to execute business logic. Going back to the example that I gave in the previous post, "Add Product" and "Charge Order" have drastically different data requirements; how are you going to handle that without support for fetch paths?

The last statement there is quite telling, I think. I thought that my previous post made it clear that I am advocating doing this inside a service boundary. The problem that Greg is trying to state doesn’t exist since you don’t do that across a service boundary.

Its not just about YAGNI its about risk management. We make the decision early (Port/Adapter) that will allow us to delay other decisions. It also costs us next to nothing up front to allow for our change later. YAGNI should never be used without thought of risk management, we should always be able to quantify our initial cost, the probability of change, and the cost of the change in the future.

I call bull on that. Saying that using an adapter costs "next to nothing" is wrong and misleading. Leaving aside the problems of trying to expose the advanced functionality that you need, it also doesn't work. A good example would be a repository using NHibernate, which takes part in a Unit of Work and uses persistence by reachability and auto flush on dirty.

Chances are, the developers aren't even aware that they are taking advantage of all of that. Trying to replace that with a web service based repository introduces quite a lot of friction to the mix. I know, I was there, and I saw what it did.

That is leaving aside things like how you expose concurrency violations or transaction deadlocks across different persistence options. You need to control that, and an adapter is generally either a very leaky abstraction or a huge effort to write, and even then it is still leaky. Worse, using an adapter, you are forced to go with the lowest common denominator for features. Of course you would want to isolate that; you are working at a level so low you might as well be writing things to the disk without the benefit of even a file system. That doesn't mean that this is the smart choice.

Trying to abstract things away unless you have a firm requirement is just about the definition of YAGNI. And handwaving away the effort required to build this sort of infrastructure doesn't really make it go away.

Yes, the approach that I am advocating makes a lot of assumptions. If you remove any of them, the approach that I am advocating is invalid. But when the assumptions are valid (inside a service boundary, using a database, using a smart persistence framework), not making use of that is… stealing from your client.

Arguments against my approach should be made in the context that I am establishing it.

Let me point out a large failure in logic here. You assume an impedance mismatch with a relational database that results in a much higher cost of getting the children with the parents. If I am using other mechanisms, like say storing the ShoppingCart as a document in CouchDb, then the cost will be nearly identical whether I bring back only the Cart or the Items.

Again, I am talking about something in context. Taking it out of context makes the argument invalid. I am going to stop here, because I don't think that there is any value in parallel monologues. Arguments against the approach that I am suggesting should be made in the context in which I am outlining my suggestion, not outside it.

The difference between Infrastructure & Application

Recently I find myself writing more and more infrastructure level code. Now, there are several reasons for that, mostly because the architectural approaches that I advocate don't have a good enough infrastructure in the environment that I usually work with.

Writing infrastructure is both fun & annoying. It is fun because usually you don't have business rules to deal with; it is annoying because it takes time to get it to do something that will give the business some real value.

That said, there are some significant differences between writing application level code and infrastructure level code. For that matter, I usually think about this as:

image

Infrastructure code is usually the base, it provides basic services such as communication, storage, thread management, etc. It should also provide strong guarantees regarding what it is doing, it should be simple, understandable and provide the hooks to understand what happens when things go wrong.

Framework code sits on top of the infrastructure, and provides easy to use semantics on top of that. Frameworks usually take away some of the options that the infrastructure gives you in order to present a more focused solution for a particular scenario.

App code is even more specific than that, making use of the underlying framework to deal with much of the complexity we have to handle.

Writing application code is easy; it is single purpose code. Writing framework and infrastructure code is harder; it has much wider applicability.

So far, I don’t believe that I said anything new.

What is important to understand is that practices that work for application level code do not necessarily work for infrastructure code. A good example would be this nasty bit of work. It doesn't read very well, it has some really long methods, and… it handles a lot of important infrastructure concerns that you have to deal with. For example, it is completely async, has good error handling and reporting, and it has absolutely no knowledge about what exactly it is doing. That is left for higher level pieces of the code. Trying to apply application level practices to that will not really work: different constraints and different requirements.

By the same token, testing such code follows a different pattern than testing application level code. Tests are often more complex, requiring more behavior in the test to reproduce real world scenarios. And the tests can rarely be isolated bits; they usually have to include significant pieces of the infrastructure. And what they test can be complex enough as well.

Different constraints and different requirements.

NHibernate 2nd Level Cache

NHibernate has built-in support for caching. It sounds like a trivial feature at first, until you realize how significant it is that the underlying data access infrastructure already implements it. It means that you don't have to worry about thread safety, propagating changes in a farm, building smart cache invalidation strategies, or dealing with all of the messy details that are usually along for the ride when you need to implement a non trivial infrastructure piece.

And no, it isn’t as simple as just shoving a value to the cache.

I spent quite a bit of time talking about this here, so I won't go over all the cache internals and how they work, but I'll mention the highlights. NHibernate internally has the following sub caches:

  • Entity Cache
  • Collection Cache
  • Query Cache
  • Timestamp Cache

NHibernate makes use of all of them in a fairly complex way to ensure that even though we are using the cache, we are still presenting a consistent view of the cache as a mirror of the database. The actual details of how we do it can be found here.
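
As a quick reminder of what opting in looks like, here is a minimal sketch, assuming configuration is your NHibernate.Cfg.Configuration instance (HashtableCacheProvider is only suitable for a single machine, and the Blog mapping is just an example):

// Turn the caches on (the C# flavor of the hibernate.cfg.xml settings):
configuration
	.SetProperty(Environment.UseSecondLevelCache, "true")
	.SetProperty(Environment.UseQueryCache, "true")
	.SetProperty(Environment.CacheProvider,
		typeof(HashtableCacheProvider).AssemblyQualifiedName);

// And opt each entity in, with a concurrency strategy, in its mapping:
// <class name="Blog" table="Blogs">
//   <cache usage="read-write"/>
//   ...
// </class>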

Another thing that NHibernate does for us when we update the cache is try to maintain a consistent view of the world even when using replicated caches in farm scenarios. This requires some support from the caching infrastructure, such as the ability to perform a hard lock on the values. Of the free caching providers for NHibernate, only Velocity supports this, which means that when we evaluate a cache provider for NHibernate, we need to take this into account.

In general, we can pretty much ignore this, but it does have some interesting implications with regard to what isolation guarantees we can make, based on the cache implementation that we use, the number of machines we use, and the cache concurrency strategy that we use.

You can read about this here and here.

One thing that you should be aware of is that NHibernate currently doesn't have a transactional cache concurrency story, mostly because there is no cache provider that can give us that. As such, be aware that if you require serializable isolation level to work with your entities, you cannot use the 2nd level cache. The 2nd level cache currently guarantees only read committed (and almost guarantees repeatable read, if that is the isolation level that you use in the database). Note that this guarantee is made for the read-write cache concurrency mode only.

Find the bug

Can you find the bug in here?

public void Receiver(object ignored)
{
	while (keepRunning)
	{
		using (var tx = new TransactionScope())
		{
			Message msg;
			try
			{
				msg = receiver.Receive("uno", null, new TimeSpan(0, 0, 10));
			}
			catch (TimeoutException)
			{
				continue;
			}
			catch(ObjectDisposedException)
			{
				continue;
			}
			lock (msgs)
			{
				msgs.Add(Encoding.ASCII.GetString(msg.Data));
				Console.WriteLine(msgs.Count);
			}
			tx.Complete();
		}
	}
}

And:

[Fact]
public void ShouldOnlyGetTwoItems()
{
	ThreadPool.QueueUserWorkItem(Receiver);

	Sender(4);

	Sender(5);

	while(true)
	{
		lock (msgs)
		{
			if (msgs.Count>1)
				break;
		}
		Thread.Sleep(100);
	}
	Thread.Sleep(2000);//let it try to do something in addition to that
	receiver.Dispose();
	keepRunning = false;

	Assert.Equal(2, msgs.Count);
	Assert.Equal("Message 4", msgs[0]);
	Assert.Equal("Message 5", msgs[1]);
}

I will hint that you cannot use any part of the receiver after it has been disposed.

Esent, identity and the case of the duplicate key

Following up on a bug report that I got from a user of Rhino Queues, I figured out something very annoying about the way Esent handles auto increment columns.

Let us take the following bit of code:

using (var instance = new Instance("test.esent"))
{
	instance.Init();

	using (var session = new Session(instance))
	{
		JET_DBID dbid;
		Api.JetCreateDatabase(session, "test.esent", "", out dbid, CreateDatabaseGrbit.OverwriteExisting);

		JET_TABLEID tableid;
		Api.JetCreateTable(session, dbid, "outgoing", 16, 100, out tableid);
		JET_COLUMNID columnid;

		Api.JetAddColumn(session, tableid, "msg_id", new JET_COLUMNDEF
		{
			coltyp = JET_coltyp.Long,
			grbit = ColumndefGrbit.ColumnNotNULL |
					ColumndefGrbit.ColumnAutoincrement |
					ColumndefGrbit.ColumnFixed
		}, null, 0, out columnid);

		Api.JetCloseDatabase(session, dbid, CloseDatabaseGrbit.None);
	}
}

for (int i = 0; i < 3; i++)
{
	using (var instance = new Instance("test.esent"))
	{
		instance.Init();

		using (var session = new Session(instance))
		{
			JET_DBID dbid;
			Api.JetAttachDatabase(session, "test.esent", AttachDatabaseGrbit.None);
			Api.JetOpenDatabase(session, "test.esent", "", out dbid, OpenDatabaseGrbit.None);

			using (var table = new Table(session, dbid, "outgoing", OpenTableGrbit.None))
			{
				var cols = Api.GetColumnDictionary(session, table);
				var bytes = new byte[Api.BookmarkMost];
				int size;
				using (var update = new Update(session, table, JET_prep.Insert))
				{
					update.Save(bytes, bytes.Length, out size);
				}
				Api.JetGotoBookmark(session, table, bytes, size);
				var i = Api.RetrieveColumnAsInt32(session, table, cols["msg_id"]);
				Console.WriteLine(i);

				Api.JetDelete(session, table);
			}

			Api.JetCloseDatabase(session, dbid, CloseDatabaseGrbit.None);
		}
	}
}

What do you think is going to be the output of this code?

If you guessed:

1
1
1

I have a cookie for you.

One of the problems of working with low level libraries is that they are… well, low level. As such, they don't provide all the features that you think they would. Most databases keep track of the auto incrementing columns outside of the actual table. But Esent keeps it in memory, and reads max(id) from the table on init.

It is an… interesting bug* to track down, I have to say.

* Bug in my code, not in Esent, just to be clear.

Night of the living Repositories

This is a response to Greg’s post in reply to mine. I can’t recall the last time I had a blog debate, they are usually fun to have. I am going to comment on Greg’s post as I read it. So this part is written before I did anything but skim the first paragraph or so.

A few things that I want to clarify:

    • His post was originally intended to be an “alternative to the repository pattern” which he believes “is dead”.
    • Of course this is far from a new idea

First, careful re-reading of the actual post doesn't show me where I said that the repository pattern is dead. What I said was that the pattern doesn't take into account advances in the persistence frameworks, and that in many cases, applying it on top of an existing persistence framework doesn't give us much.

The notion of query objects is also far from my invention, just to clear that up; it is a well established pattern that I am particularly fond of.

Now, let us move to the part that I really object to:

What is particularly annoying is the sensationalism associated with this post. It is extremely odd to argue against a pattern by suggesting to use the pattern eh? The suggested way to avoid the Repository pattern is to use the Repository pattern which shortening the definition he provided

Provides the domain collection semantics to a set of aggregate root objects.

So now that we have determined that he has not actually come up with anything new and is actually still using repositories let’s reframe the original argument into what it really is.

I fail to see how I am suggesting the use of the repository pattern in my post. This seems to be a fairly important point in Greg’s argument. And no, I don’t follow how I did that. Using the approach that I outlined in the post, there is no such thing as repository. Persistence concerns are handled by the persistence framework directly (using ISession in NHibernate, for example), queries are handled using Query Objects.

The notion of providing in memory collection semantics is not needed anymore, because that responsibility is no longer in the user code, it is the responsibility of the underlying persistence framework. Also note that I explicitly targeted the general use of the repository, not just the DDD use of it.
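
To make the query objects part concrete, here is a minimal sketch of one (the entity, the association, and the status enum are all hypothetical):

// A query object encapsulates a single query and is executed straight
// against the session; there is no repository in between.
public class CustomersWithPendingOrdersQuery
{
	public IList<Customer> Execute(ISession session)
	{
		return session.CreateCriteria<Customer>()
			.CreateCriteria("Orders") // join to the Orders association
			.Add(Restrictions.Eq("Status", OrderStatus.Pending))
			.List<Customer>();
	}
}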

The problem here is that the Repository interface is not necessarily Object Oriented. The Repository represents an architectural boundary, it is intended to be a LAYER/TIER boundary. Generally speaking when we define such interfaces we define them in a procedural manner (and with good cause).

Hm, I can see Greg's point, but I am not sure that I agree with him here. I would specify it differently. Service boundaries are procedural (be it RPC or message based, doesn't matter). But a service spans both layers and tiers, and I am not going to try to create artificial boundaries inside my service. And yes, they are artificial. Remember: "A Repository mediates between the domain and data mapping layers…"

A repository is a gateway to the actual persistence store. The persistence store itself may be another service, it is usually on a remote machine, and the interface to that is by necessity pretty procedural. Trying to model a repository on top of that would by necessity lead us to procedural code. And that is a bad thing.

The problem is, again, that we aren't attempting to take advantage of the capabilities of the persistence frameworks that we have. We can have OO semantics on top of the persistence store, because the responsibility to handle that is in the hands of the persistence framework.

Analyzing the situation given of a CustomerRepository what would happen if we were to want to put the data access behind a remote facade?

I am going to call YAGNI on that. Until and unless I have that requirement, I am not going to think about that. There is a reason we have YAGNI. And there is a reason why I don’t try to give architectural advice without having a lot more context. In this case, we have a future requirement that doesn’t really make sense at all.

What would happen though if we used the “other” Repository interface that is being suggested? Well our remote facade would need to support the passing of any criteria dynamically across its contract, this is generally considered bad contract design as we then will have great trouble figuring out and optimizing what our service actually does.

If I need to do the data access remotely, then I am still within my own service (remember, services can span several machines, and multiple applications can take part in the same service). As such, I know exactly what requirements and queries my remote data access is going to have. More than that, within a service, I want as much flexibility as I can get.

It is at a service boundary that I start pouring concrete and posting the armed guards.

I also think that there is some sort of miscommunication, or perhaps, as usual, I split a thought across several posts, because a few posts after the post Greg is talking about, I suggested just what he is talking about.

If you don’t want a LAYER/TIER boundary don’t have one just use nhibernate directly …

That is what I am advocating. Or Linq to Sql, or whatever (as long as it has enough smarts to support what you need without doing nasty things to your code).

My whole argument is that the persistence framework is smart enough today that we don’t need to roll this stuff by hand anymore!

At this point you probably shouldn’t have a domain either though …

And I call bull on that. Repositories != domain, and how you do data access has nothing to do with how you structure your application.

Something that Greg, Udi, and I have discussed quite often in the past is the notion of Command Query Separation. I'll let Greg talk about that, and then add my comments:

I have had this smell in the past as well but instead of destroying the layering I am building into my domain (with good reason, see DDD by Evans for why) I went a completely different route. I noticed very quickly that it was by some random chance that my fetch plans were being different. I had a very distinct place where things were different, I needed very different fetching plans between when I was getting domain objects to perform a writing behaviour on them as opposed to when I was reading objects to say build a DTO.

Well, yes & no. There are quite a few scenarios in which I want to have a different fetch plan for writing behavior even when using CQS. But before I show the example, I want to point out that it is just an example. Don't try to nitpick the example; talk about the general principle.

A simple example would be a shopping cart, and the following commands:

  • AddProduct { ProductId = 12, Quantity = 2}
    • This requires us to check if the product already exists in the cart, so we need to load the Items collection
  • Purchase
    • We can execute this with properties local to the shopping cart, so no need to load the items collection; this just charges the customer and changes the cart status to Ordered

As I said, this is a simple example, and you could probably poke holes in it; that is not the point. The point is that this is a real example of real world issues. There is a reason why IFetchingStrategy is such an important concept.
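
In NHibernate terms, the two commands simply ask for different fetch plans; a minimal sketch (ShoppingCart, its Items collection, and cartId are hypothetical):

// AddProduct needs the Items collection, so fetch it eagerly, in one round trip:
var cart = session.CreateCriteria<ShoppingCart>()
	.Add(Restrictions.IdEq(cartId))
	.SetFetchMode("Items", FetchMode.Eager)
	.UniqueResult<ShoppingCart>();

// Purchase only touches the cart's own columns, so a plain Get is enough;
// the Items collection stays an unloaded lazy proxy:
var cartToPurchase = session.Get<ShoppingCart>(cartId);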

That is leaving aside that, like all architectural patterns, CQS is something that you shouldn't just apply blindly. You should make a decision based on additional complexity vs. required scope before using it. And in many applications, CQS, or separate OLTP vs. Reporting models, are often not necessary.

And yes, they are not big applications, with a lot of users and a lot of data, but they are often important applications, with a lot of complexity and behavior.

Help needed: Writing Domain Specific Languages in Boo – Java Edition

Just came out of a discussion with Manning about my book (which is approaching the last few stages before actually printing!); apparently they hit upon the notion that Boo works on both the CLR and the JVM, and are interested in having a Java edition of the book.

Disclaimer: This is early, and anything is subject to change, blah blah blah.

I find this hilarious, since this is Hibernate in Action vs. NHibernate in Action, in reverse. At least, I hope it is. The problem? I am not familiar enough with Java to be able to write a book targeting it, hence this post.

If you are familiar with Java and BooJay (filter 90%), read my book (filter 90%) and think you could help (filter 0%), I would love to talk to you.

NHibernate Tidbit – using without referencing Iesi.Collections

Some people don't like having to reference Iesi.Collections in order to use NHibernate <set/> mapping. With NHibernate 2.1, avoiding that is possible, since we finally have a set type in the actual BCL. We still don't have an ISet<T> interface, unfortunately, but that is all right, we can get by with ICollection<T>.

In other words, any ISet<T> association that you have can be replaced with an ICollection<T> and instead of initializing it with Iesi.Collections.Generic.HashedSet<T>, you can initialize it with System.Collections.Generic.HashSet<T>.
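
A minimal before/after sketch, assuming a Blog entity with a Posts set:

public class Blog
{
	// Before: Iesi.Collections.Generic.ISet<Post>,
	// initialized with new HashedSet<Post>()
	private ICollection<Post> posts = new HashSet<Post>();

	public virtual ICollection<Post> Posts
	{
		get { return posts; }
		set { posts = value; }
	}
}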

Note that you still need to deploy Iesi.Collections with your NHibernate application, but that is all; you can remove the reference to Iesi.Collections from your domain model and use only BCL types, with no external references.

NH Prof: An important milestone

Yeah, this now works:

image

And if you wonder why I am happy about something that looks very similar to what already worked months ago, you are right to wonder.

What you don't see is that this version of NH Prof uses the revised backend, which I have talked about before. This backend uses a pull rather than a push mechanism, and it is supposed to allow us much higher performance than before.

And no, this branch is not merged into the public builds yet. Give us a few weeks.

Ode to ReSharper

I am using R# 4.5, and I have to say, for a release that is supposedly all about perf, they managed to squeeze in some really nice new smarts:

image

Figuring out all the implementations that use or do not use parameters.

image

Killing off BDD :-)

Using Active Record to write less code

The presentations from Oredev are now available, and among them is my talk about Active Record.

You can watch it here, the blurb is:

What would you say if I told you that you can stop writing data access code in .Net? Aren't you tired of writing the same thing over and over again, opening connections, querying the database, figuring out what to return, getting back untyped data that you need to start putting on the form? Do you really see some value in writing yet another UPDATE statement? The Active Record framework allows you to fully utilize the power of the database, but without the back breaking work that it used to take. Active Record uses .Net objects to relieve you from the repetitive task of persistence. Those objects are schema aware and can persist and load themselves without you needing to write a single line of SQL. Building business applications using Active Record is a pleasure; the database stuff just happens, and you are free to implement the business functionality.
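
If you have never seen Castle Active Record, here is a small taste of the programming model (a minimal sketch, not taken from the talk; the Blog entity is just an example):

[ActiveRecord("Blogs")]
public class Blog : ActiveRecordBase<Blog>
{
	[PrimaryKey]
	public virtual int Id { get; set; }

	[Property]
	public virtual string Title { get; set; }
}

// No hand-written SQL anywhere:
var blog = new Blog { Title = "Ayende @ Rahien" };
blog.Save();
Blog[] all = Blog.FindAll();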

To be frank, I consider this to be one of the best presentations that I gave. Everything just ticked.

NHibernate Mapping – <many-to-any/>

<many-to-any/> is the logical extension of the <any/> feature that NHibernate has. At the time of this writing, if you do a Google search on <many-to-any/>, the first result is this post. It was written by me, in 2005, and contains absolutely zero useful information. Time to fix that.

Following up on the <any/> post, let us say that we need to map not a single heterogeneous association, but multiple heterogeneous ones, such as this:

image

In the database, it would appear as:

image

How can we map such a thing?

Well, that turns out to be pretty easy to do:

<set name="Payments" table="OrderPayments" cascade="all">
	<key column="OrderId"/>
	<many-to-any id-type="System.Int64"
			meta-type="System.String">
		<meta-value value="CreditCard"
			class="CreditCardPayment"/>
		<meta-value value="Wire"
			class="WirePayment"/>
		<column name="PaymentType" 
			not-null="true"/>
		<column name="PaymentId"
			not-null="true"/>
	</many-to-any>
</set>

Now, let us look at how we use this when we insert values:

using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
	var order = new Order
	{
		Payments = new HashSet<IPayment>
        {
        	new CreditCardPayment
        	{
        		Amount = 6,
                CardNumber = "35434",
                IsSuccessful = true
        	},
            new WirePayment
            {
            	Amount = 3,
                BankAccountNumber = "25325",
                IsSuccessful = false
            }
        }
	};
	session.Save(order);
	tx.Commit();
}

This will produce some very interesting SQL:

image

image

image

image

image

I think that the SQL makes it pretty clear what is going on here, so let us move to a more fascinating topic: what does NHibernate do when we read them?

Here is the code:

using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
	var order = session.Get<Order>(1L);
	foreach (var payment in order.Payments)
	{
		Console.WriteLine(payment.Amount);
	}
	tx.Commit();
}

And the SQL:

image

image

image

image

As you can see, this is about as efficient as you can get. We load the order, we check what tables we need to check, and then we select from each of the tables that we found to get the actual values in the association.

A true heterogeneous association: not used very often, but when you need it, you really love it.

NHibernate Mapping – <any/>

Sometimes, well known associations just don't cut it. We sometimes need to be able to go not to a single table, but to a collection of tables. For example, let us say that an order can be paid using a credit card or a wire transfer. The data about those are stored in different tables, and even in the object model, there is no inheritance association between them.

From the database perspective, it looks like this:

image

As you can see, based on the payment type, we need to get the data from a different table. That is somewhat of a problem for the standard NHibernate mapping, which is why we have <any/> around.

Just to close the circle before we get down into the mapping, from the object model perspective, it looks like this:

image

In other words, this is a non polymorphic association, because there is no mapped base class for the association. In fact, we could have used System.Object instead, but even for a sample, I don’t like it.

The mappings that we use are:

<class name="Order"
			 table="Orders">

	<id name="Id">
		<generator class="native"/>
	</id>

	<any name="Payment" id-type="System.Int64" meta-type="System.String" cascade="all">
		<meta-value value="CreditCard" class="CreditCardPayment"/>
		<meta-value value="Wire" class="WirePayment"/>
		<column name="PaymentType"/>
		<column name="PaymentId"/>
	</any>

</class>

<class name="CreditCardPayment"
			 table="CreditCardPayments">
	<id name="Id">
		<generator class="native"/>
	</id>
	<property name="IsSuccessful"/>
	<property name="Amount"/>
	<property name="CardNumber"/>
</class>

<class name="WirePayment"
			 table="WirePayments">
	<id name="Id">
		<generator class="native"/>
	</id>
	<property name="IsSuccessful"/>
	<property name="Amount"/>
	<property name="BankAccountNumber"/>
</class>

Pay special attention to the <any/> element. Each <meta-value/> declaration sets up the association between the value stored in the PaymentType column and the actual class that it maps to. The only limitation is that all the mapped classes must have the same data type for the primary key column.

Let us look at what this will give us:

using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
	var order = new Order
	{
		Payment = new CreditCardPayment
		{
			Amount = 5,
			CardNumber = "1234",
			IsSuccessful = true
		}
	};
	session.Save(order);
	tx.Commit();
}

Which produces:

image

image

And for selecting, it works just the way we would expect it to:

using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
	var payment = session.Get<Order>(1L).Payment;
	Console.WriteLine(payment.Amount);
	tx.Commit();
}

The generated SQL is:

image

image

An interesting limitation is that you cannot do an eager load on <any/>; considering the flexibility of the feature, I am most certainly willing to accept that limitation.

Test refactoring

I just posted about a horribly complicated test; I thought I might as well share the results of its refactoring:

[TestFixture]
public class IndexedEmbeddedAndCollections : SearchTestCase
{
	private Author a;
	private Author a2;
	private Author a3;
	private Author a4;
	private Order o;
	private Order o2;
	private Product p1;
	private Product p2;
	private ISession s;
	private ITransaction tx;

	protected override IList Mappings
	{
		get
		{
			return new string[]
			{
				"Embedded.Tower.hbm.xml",
				"Embedded.Address.hbm.xml",
				"Embedded.Product.hbm.xml",
				"Embedded.Order.hbm.xml",
				"Embedded.Author.hbm.xml",
				"Embedded.Country.hbm.xml"
			};
		}
	}

	protected override void OnSetUp()
	{
		base.OnSetUp();

		a = new Author();
		a.Name = "Voltaire";
		a2 = new Author();
		a2.Name = "Victor Hugo";
		a3 = new Author();
		a3.Name = "Moliere";
		a4 = new Author();
		a4.Name = "Proust";

		o = new Order();
		o.OrderNumber = "ACVBNM";

		o2 = new Order();
		o2.OrderNumber = "ZERTYD";

		p1 = new Product();
		p1.Name = "Candide";
		p1.Authors.Add(a);
		p1.Authors.Add(a2); //be creative

		p2 = new Product();
		p2.Name = "Le malade imaginaire";
		p2.Authors.Add(a3);
		p2.Orders.Add("Emmanuel", o);
		p2.Orders.Add("Gavin", o2);


		s = OpenSession();
		tx = s.BeginTransaction();
		s.Persist(a);
		s.Persist(a2);
		s.Persist(a3);
		s.Persist(a4);
		s.Persist(o);
		s.Persist(o2);
		s.Persist(p1);
		s.Persist(p2);
		tx.Commit();

		tx = s.BeginTransaction();

		s.Clear();
	}

	protected override void OnTearDown()
	{
		// Tidy up
		s.Delete("from System.Object");

		tx.Commit();

		s.Close();

		base.OnTearDown();
	}

	[Test]
	public void CanLookupEntityByValueOfEmbeddedSetValues()
	{
		IFullTextSession session = Search.CreateFullTextSession(s);

		QueryParser parser = new MultiFieldQueryParser(new string[] { "name", "authors.name" }, new StandardAnalyzer());

		Lucene.Net.Search.Query query = parser.Parse("Hugo");
		IList result = session.CreateFullTextQuery(query).List();
		Assert.AreEqual(1, result.Count, "collection of embedded (set) ignored");
	}

	[Test]
	public void CanLookupEntityByValueOfEmbeddedDictionaryValue()
	{
		IFullTextSession session = Search.CreateFullTextSession(s);
		
		//PhraseQuery
		TermQuery  query = new TermQuery(new Term("orders.orderNumber", "ZERTYD"));
		IList result = session.CreateFullTextQuery(query).List();
		Assert.AreEqual(1, result.Count, "collection of untokenized ignored");
		query = new TermQuery(new Term("orders.orderNumber", "ACVBNM"));
		result = session.CreateFullTextQuery(query).List();
		Assert.AreEqual(1, result.Count, "collection of untokenized ignored");
	}

	[Test]
	[Ignore]
	public void CanLookupEntityByUpdatedValueInSet()
	{
		Product p = s.Get<Product>(p1.Id);
		p.Authors.Add(s.Get<Author>(a4.Id));
		tx.Commit();

		QueryParser parser = new MultiFieldQueryParser(new string[] { "name", "authors.name" }, new StandardAnalyzer());
		IFullTextSession session = Search.CreateFullTextSession(s);
		Query query = parser.Parse("Proust");
		IList result = session.CreateFullTextQuery(query).List();
		//HSEARCH-56
		Assert.AreEqual(1, result.Count, "update of collection of embedded ignored");

	}
}

It is almost the same as before, and the changes are mainly structural, but it is so much easier to read, understand and debug.