Ayende @ Rahien

Refunds available at head office

Not so Impressive

I had a hard time deciding what examples I should give about what is impressive to me and what is not. I didn't want to use real world scenarios, but I did want to give concrete examples. Fortunately, I did manage to think of a couple of good examples.

Distributed Grid and Distributed Caching. Both of those are considered hard, in the sense that it takes a lot of expertise to build them. They are also among those purely technical issues that developers love: no need to deal with pesky tax laws or understand why delinquent customers are given more credit, just pure programming bliss.

The problem is that those just aren't impressive on their own. Building a distributed cache is easy and fun. There is nothing complicated going on there. Distributed Grid sounds complex, at first, but it is a very simple technical challenge. It is slightly more complex if you want to implement it with automatic binary distribution (that is, you don't need to manually deploy dlls to the machines in the grid, it happens for you), but even then, it is firmly in the realm of the easy to do.

What is impressive in such a scenario is how you solve the management problem. How do you gracefully recover from a failed worker node on the grid? How do you handle another node adding itself to the cache? What happens if a server crashes?

Handling those problems is interesting, challenging and impressive, because it requires a bit of thinking beyond just technical expertise.

NHibernate 2.0 and Linq

Linq for NHibernate is not part of the 2.0 release. Linq support is planned for the 2.1 release. That said, we have been getting a lot of questions about that.

The technical reasons are not really interesting, but suffice to say that to provide good Linq support we also need to modify NHibernate slightly. Those changes happen on the trunk, which is what Linq for NHibernate is following.

However, due to all the questions that we got, I wanted to point out that Daniel Guenter has back ported the current version of Linq to NHibernate to NHibernate 2.0 and made it available here.

Now, the disclaimer. We are not supporting this. This is an unofficial (but welcome) contribution from the community. It is likely to have bugs (in fact, we know it contains bugs in the more complex Linq queries, which is why it is still under active development), and the NHibernate team's response is most likely going to be "use the latest", which requires the NHibernate trunk, not the NHibernate 2.0 version.

Use at your own risk, etc.

Persistent DSL caching issues

A while ago I talked about persistent DSL caching. I was asked why my solution was not a builtin part of Rhino DSL.

The reason for that is that this is actually a not so simple problem. Let me point out a few of the issues that are non obvious.

  • Need to handle removal of scripts
  • Need to handle updating scripts
  • Need to handle new scripts

Those are easy, sort of, but what about this one?

  • Need to handle DSL updates

When you are in development mode, you really need to know that changing the way the DSL behaves would also invalidate any cache.
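To make the four cases concrete, here is a minimal sketch of a cache validity check that treats the DSL implementation itself as one more input. This is an illustration under stated assumptions, not part of Rhino DSL: the DslCacheCheck class and the dslImplementationPaths parameter are hypothetical names, and the sketch assumes that scripts and the DSL's own binaries are plain files whose timestamps can be trusted.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of Rhino DSL.
public static class DslCacheCheck
{
    // Returns true only when the cached assembly exists and is newer than
    // every script AND every file that implements the DSL itself.
    public static bool CanUseCachedVersion(
        string cachedAssemblyPath,
        IEnumerable<string> scriptPaths,
        IEnumerable<string> dslImplementationPaths)
    {
        if (!File.Exists(cachedAssemblyPath))
            return false;

        DateTime cachedAt = File.GetLastWriteTimeUtc(cachedAssemblyPath);

        foreach (string script in scriptPaths)
        {
            // A removed script invalidates the cache, as does an updated one.
            if (!File.Exists(script) || File.GetLastWriteTimeUtc(script) > cachedAt)
                return false;
        }

        foreach (string dslFile in dslImplementationPaths)
        {
            // Changing how the DSL behaves must invalidate the cache as well.
            if (File.GetLastWriteTimeUtc(dslFile) > cachedAt)
                return false;
        }

        return true;
    }
}
```

New scripts are not handled by this check on its own. One option, assumed here, is to derive the cached assembly name from the full list of scripts, so that adding a script simply results in a cache miss.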

I like to keep a very high bar of quality on the software I make, and there is a fine distinction between one off attempts and reusable ones. One off attempts can be hackish and stupid. Reusable implementations should be written properly.

And no, there isn't anything overly complex here. It just takes time to cover all the bases.

Anyone feel like submitting a patch?

Impressive

It doesn't take a lot to impress me. All you have to do is show me something that I didn't think about, or do something that I consider difficult and valuable. If it is just difficult, it is not really interesting, and if it is valuable but obvious, it is not really impressive. If it is not valuable, there is no point in spending any time on it.

It seems like common sense, to me. But I had four different occasions recently where I was shown things where I got the strong impression that I was supposed to be impressed. I wasn't. I really wasn't, in a few cases.

In two cases, it was doing things in the really hard way, and I was able to point out easier ways to approach the problem. One was just not interesting to me and in the case of the last two, I kept asking, "and... ?", trying to figure out exactly why I was supposed to be impressed. Both were a duplication of existing work, and I couldn't figure out what was new there.

Oh, there were new things, but none of them was something that couldn't be built in less than a day or two. So they didn't pass the difficult bar.

If you want people to be impressed, you had better do something impressive. Sorry, but I have a hard time getting excited about Yet Another Xyz or Overcomplicated Solution Abc.

Implementing generic natural language DSL

I said that I would post about it, so here is the high level design for a generic implementation of parsing a language that looks like natural language. Let us explore the problem scenario first. We want to be able to build this language, without having to build a full blown language from scratch:

open http://www.ayende.com/
click on link to Blog
click on link to first post
enter comment with name Ayende Rahien and email foo@example.org and url http://www.ayende.com/Blog/
enter comment text This is an awesome post.
click on submit
comment with This is an awesome post should appear on page

And to prove that we are not focusing on a single language, let us try this one as well:

when account balance is 500$ and withdrawal is made of 400$ we should get a low funds alert
when account balance is 500$ and withdrawal is made of 501$ we should deny the transaction
when weekly international charge is at 3,500$ and max weekly international charge is of 5,000$ and new charge arrives for amount 2,230$ we should deny the transaction

I think that those are divergent enough to show that the solution is a generic one.

And now, to the solution. Each type of language is going to have its own DSL engine, which knows how to deal with the particular dialect that we are using. The default parsing is a three-step solution. First, split the text into sentences. Then, split each sentence into tokens by whitespace. Finally, for each statement, we search for the appropriate statement resolver, which is a class that knows how to deal with it. The statement resolver's methods are then called to process the statement.

There are two key principles to the design. The first is turning something like 'click on link' into an invocation of the ClickOnLink statement resolver. The second is lazy parameter evaluation.
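To illustrate the first principle, here is a rough sketch of how the leading words of a sentence could be mapped to a resolver name by convention. It is not the code from the experiment: StatementMatcher, ToResolverName and TryMatch are hypothetical names, and the longest-prefix rule is an assumption made for the sake of the example. The tokens that are left over are returned unparsed, so a resolver can decide how and when to interpret them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper showing convention-based lookup of a statement resolver.
public static class StatementMatcher
{
    // Turns the tokens "click", "on", "link" into "ClickOnLink".
    public static string ToResolverName(IEnumerable<string> tokens)
    {
        return string.Concat(tokens.Select(t =>
            char.ToUpperInvariant(t[0]) + t.Substring(1).ToLowerInvariant()));
    }

    // Finds the longest leading run of tokens that names a known resolver.
    // The remaining tokens are handed back untouched as the raw parameters.
    public static bool TryMatch(
        string sentence,
        ISet<string> knownResolvers,
        out string resolverName,
        out string[] remaining)
    {
        // Passing a null separator array splits on any whitespace.
        string[] tokens = sentence.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        for (int length = tokens.Length; length > 0; length--)
        {
            string candidate = ToResolverName(tokens.Take(length));
            if (knownResolvers.Contains(candidate))
            {
                resolverName = candidate;
                remaining = tokens.Skip(length).ToArray();
                return true;
            }
        }

        resolverName = null;
        remaining = new string[0];
        return false;
    }
}
```

With a resolver set containing ClickOnLink, the sentence "click on link to Blog" matches ClickOnLink and leaves "to" and "Blog" for the resolver to deal with.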

This is going to be interesting, the time right now is 19:38, and I am going to start implementing this.

It is now 22:04, and I finished the first language.

Working on the second now. It is 22:10 and I am done with the second one.

What did I do?

I took the text we had and turned that into executable commands. Now, this isn't flexible at all. If you make a modification in the way it is structured, it will fail, which comes back to why natural language is a bad choice here, but it had quite a bit of flexibility in it.

You can get the code for this, including tests, here: https://rhino-tools.svn.sourceforge.net/svnroot/rhino-tools/experiments/natrual-language

But let us talk for a bit about how this is implemented. I'll show the bank example, because it is easier.

We start by defining the BankParser, which looks like this:

[image: BankParser definition]

The bank parser merely defines what the statement resolvers are, and any special parsers that are needed (in this case, we need to handle dollar values).

A statement parser is trivial:

[images: statement parser definitions]

And yes, those are pure POCO classes.

The whole idea here was that I can implement some smarts into the default engine about how it recognizes methods and resolves parameters. I will admit that overloading caused some issues, but I think that this is a pretty simple implementation.

It also does a good job in demonstrating the problems in such a language. Go ahead and try to build operator precedence into it. Or implement an if statement. You really can't, not without introducing a lot more structure into it. And that would turn it into yet another programming language.

What about the tooling? Intellisense and syntax highlighting?

Well, since we have the structure of the code, and we know the conventions, you shouldn't have a problem taking my previous posts about this and translating them directly into supporting this.

And yes, I can create a language in this in a few minutes, as BankParser has proven.

The search for the natural language

Jeremy Miller is looking for a DSL that reads like natural language. My immediate response was that it is not practical, because I assumed he wanted very natural language, which is still not possible to do without an extremely high budget. Limiting the problem to something that just reads like a natural language reduces the problem space significantly.

I am going to have a separate post about how to actually solve such a problem, but for now, I want to talk about the actual requested solution. I think it is 100% solvable with a low cost approach. That is, you can get a DSL that reads like English in under an hour. But I don't think it is valuable.

English is a terrible language to express instructions in. Any natural language is terrible in expressing instructions, just find the nearest army sergeant, they will tell you that.

Let us take a look at a language that actually took this approach:

tell application "Finder"
	set the percent_free to ¬
		(((the free space of the startup disk) / (the capacity of the startup disk)) * 100) div 1
end tell
if the percent_free is less than 10 then
	tell application (path to frontmost application as text)
		display dialog "The startup disk has only " & the percent_free & ¬
			" percent of its capacity available." & return & return & ¬
			"Should this script continue?" with icon 1
	end tell
end if

This is AppleScript. From my point of view, this is horrible. It is unreadable in the extreme. More than that, trying to explain how this language works, or how it handles errors, is a non trivial task.

This has been my experience any time I actually tried to create a natural language like syntax. It is too complex, and users get annoyed when they can't use real natural language.

From my perspective, getting an expressive DSL does not mean that it has to read like an English statement. In fact, it probably shouldn't. There is too much noise involved. A structured approach isn't just there to help the compiler, but to help the reader.

Build the tools that aren't there

This is strongly related to my posts about Tools Matter. During the last six months I had several conversations with people about their Xyz processes. In several cases, my response was a polite version of: "This Is Broken, Badly" and "You should automate this part".

I am a great believer in automating just about everything that moves (and if it doesn't move, kick it until it does!). In those conversations, the response was usually, "Yeah, we thought about doing that, but Abc does things in a way I don't like and Efg isn't compatible with our Foo requirements". And that was that.

Let us take deployment as a good example of that. I was talking with someone about the need for automatic deployment, and he mentioned that he is waiting for a tool to come up that will also handle workflows.

I was a bit stunned by that, and inquired deeper, at which point it became clear that the guy was working in a highly regulated environment and doing a deployment involved multiple people authorizing it in different environments before it could go live. Because of all the manual work that already exists there, which cannot be changed for regulatory reasons, they have no automated deployment.

Note, it sounds much worse than it actually is, now that I re-read this.

I was critical of this approach, for several reasons. First, even if you can't go all the way, just having a build script that you have to manually run is a huge improvement. Next, I asked several additional questions about the scenario, and it turned out that the process was something like this:

  • We need to push something to production
  • We deploy to a test server and ask QA to test that
  • Once QA signs off on this release, deploy to a staging server
  • Get at least three business experts to smoke test the system
  • Once we have 3 signatures that authorize the system, we can ask Joe (the friendly IT admin) to deploy to production
  • Joe schedules a time for deployment and gets signoff for that from someone with the authorization to do so.
  • At the specified time, Joe is going to deploy to production, the dev team is on call for issues there

This is a good going-to-production scenario, with multiple checkpoints to ensure that we are safe & sound. The real reason is not to actually ensure quality, of course, it is to satisfy some dry regulation and have an audit trail that you can point to. But that is beside the point and shows my utter annoyance with all forms of bureaucracy.

Okay, so we have this process that we must go through in order to get something to production. There is no tool out there that will do it for us and give us the required audit trail. Therefore, we can't use automatic deployments.

My response for that was rude and unprintable.

Here is the deal, let us estimate the cost of building such a system:

  • A page into which I can enter a request to go to production. This consists of two text boxes and a submit button. On submit:
    • Automatically deploy to the test server
    • Send an email to me and the QA department that we have something that they need to test
    • Record that I have started a deployment process
  • A page into which the QA department can say if they authorize the build or not. On submit:
    • If not:
      • Record this fact
      • Email the dev team
    • If yes:
      • Record this fact
      • Automatically deploy to the staging server
      • Email all the people that can approve a build and ask them to evaluate the build
  • A page into which the business experts can authorize or block a build. On submit:
    • If not:
      • Record the reason
      • Email the dev team and QA
    • If yes:
      • Record this fact
      • If this is the third person to authorize this build (and if there are no blocks):
        • Record this fact
        • Send an email to Joe, asking him to set up a time for deployment
  • A page for Joe to enter the time for going to production. On submit:
    • Record this fact
    • Send an email to whoever it is that can authorize production downtime
    • Generate a deployment package (which Joe will run in production)
  • A page for authorizing scheduled downtime. On submit:
    • Record this fact
    • Email Joe that the time is approved
    • Email whoever is interested that there will be scheduled downtime at that time

Five pages, more or less. And yes, I am glossing over things, I know. That is not the point.
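To give a feel for how little logic sits behind those pages, here is a minimal sketch of the business expert step: record every decision, and report when the third authorization arrives with no blocks. The StagingApproval class and its members are hypothetical names invented for this illustration, and the sketch assumes that emails, timestamps and persistence are handled elsewhere.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of the "business experts authorize or block a build" page.
public sealed class StagingApproval
{
    public const int RequiredApprovals = 3;

    private readonly HashSet<string> approvers = new HashSet<string>();
    private readonly List<string> auditTrail = new List<string>();
    private bool blocked;

    // Everything that happened, in order. This is what the auditors get to see.
    public IReadOnlyList<string> AuditTrail
    {
        get { return auditTrail; }
    }

    public bool ReadyToSchedule
    {
        get { return !blocked && approvers.Count >= RequiredApprovals; }
    }

    // Returns true only for the authorization that completes the approval,
    // which is the moment to email Joe and ask him to schedule the deployment.
    public bool Authorize(string expert)
    {
        bool wasReady = ReadyToSchedule;
        approvers.Add(expert);
        auditTrail.Add(expert + " authorized the build");
        return !wasReady && ReadyToSchedule;
    }

    public void Block(string expert, string reason)
    {
        blocked = true;
        auditTrail.Add(expert + " blocked the build: " + reason);
    }
}
```

The page itself would call Authorize or Block on submit and send the relevant email based on the result. A real system would also have to store the trail somewhere durable, since the audit trail is the whole point.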

If it takes over a week to build this I would be very surprised. The benefit is that we have a more streamlined process, we no longer have to babysit multiple manual deployments and Joe doesn't get some Word document with instructions on how to deploy. He gets a deployment package that he can copy to production and double click in order to deploy.

Let us take another scenario. Deploying to production often fails because of one problem or the other, usually because the IT admins who performed the install gave bad values to the build script (such as specifying the wrong connection string, or having a typo in some URL, or something of this nature). The second time that this happens, it should be caught by the build script itself. The response I got when I expressed this opinion was that they had no control over the build process, that it was entirely the realm of the IT administrators.

Let us take the most difficult scenario that I can think of. We are required to hand to the IT admins the compiled binaries along with a document that specifies what new values they should put in the configuration.

My approach for this would be to put an if statement in the application startup, which will perform a full environment check (the idea was stolen from Jeremy Miller and Release It!, by the way) and give a detailed error message. Since this is likely to be a long process, it will disable itself the first time it passes successfully (I leave the how as an exercise for the reader, consider that it should be reset the next time we deploy).
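As an illustration only, a startup environment check could be as small as the sketch below. EnvironmentCheck is a hypothetical class, not something taken from a library, and the individual checks (connection strings, URLs, directories) are left to the application. Each check returns null when everything is fine, or a description of what is wrong.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical startup check: run every registered probe and collect detailed errors.
public sealed class EnvironmentCheck
{
    private readonly List<KeyValuePair<string, Func<string>>> checks =
        new List<KeyValuePair<string, Func<string>>>();

    // A check returns null when the environment is fine,
    // or a human readable description of the problem.
    public void Add(string name, Func<string> check)
    {
        checks.Add(new KeyValuePair<string, Func<string>>(name, check));
    }

    // Runs all the checks, so the admin sees every problem in one go
    // instead of fixing them one failed deployment at a time.
    public IList<string> Run()
    {
        var failures = new List<string>();
        foreach (var check in checks)
        {
            try
            {
                string problem = check.Value();
                if (problem != null)
                    failures.Add(check.Key + ": " + problem);
            }
            catch (Exception e)
            {
                // A check that throws is also a failed check.
                failures.Add(check.Key + ": " + e.Message);
            }
        }
        return failures;
    }
}
```

At application startup, the if statement would call Run and refuse to start, printing the list, when it is not empty. Disabling the check after the first successful pass is still left as an exercise.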

The tool to do that is code, your code. Which you built in order to provide you with the foundation for your project.

In short, because I have another half dozen examples that are as applicable, remember, you are a developer. If your tools don't provide you with what you need, you can build them. And since you are not going to try to build a generic tool, the cost of doing it is extremely low.

It doesn't even have to be a tool, just create a console application that does something, where you hard code everything. Let the compiler be your tool, and "configure" it with code.

Don't wait, act.

Requirements 101: Have an automated deployment

If you don't have an automated deployment, it generally means that you are in a bad position. By automated, I mean that you should be able to push a new version out by double clicking something. If you can't get an automated deployment script in under an hour, you most certainly have a problem.

Sometimes, the problem is with the process: you don't have the facilities to do an automated deployment because parts of the deployment are sitting in people's heads (oh, you need to configure IIS to use Xyz with the new version). In other cases, it isn't there simply because people haven't tried.

Yet, automated deployment is one of those things that you can create in isolation, without getting commitment or support from the rest of the team. This is usually the first thing that I do in any project with an existing codebase that I come to.

It is also a good way of taking care of problems in the process. If you have a hard time deploying because your database change management process is broken, you need to fix that before you can get an automatic deployment ready.

Also, notice that I am explicitly talking about automated deployment, not about having a build script. One of the requirements for automated deployment is a build script, but that is just one of them.

I don't care whether you can or can't build the software, I care that you can deploy it successfully. And yes, this includes doing things like deploying to several machines, stopping and starting services, updating the database schema and applying any data migration processes, and even doing rolling updates, if this is a requirement.

Remember, automated.

And I'll leave you with just one final thought: Prayer should not be part of the steps in the deployment process.

Getting Things Done, On Time

Today I implemented refactoring support for a DSL. Basically, it is Extract Business Condition, and it was explicitly modeled after the way R# handles Extract Variable. It even shares the same shortcut, ctrl+alt+v.

I also took a stab at implementing automatic pattern recognition, so when the system recognizes a common usage pattern, it will automatically refactor it to a high level abstraction. It works, although I think that I can make it even more flexible than it is now.

Now, if someone from the Resharper team is actually reading this, they would know that I am lying. There is no way of doing something of this magnitude in just one day, not even if you have an extremely helpful compiler. And they would be right. I didn't try to tackle a feature of this magnitude. What I did do was to find the most common scenario for this feature and nail that.

I am taking this approach explicitly and deliberately, with the end result that I get to show value very rapidly. And yes, the customer is made aware of the limitations of this approach. I also tell them that they can get the feature by tomorrow, with an error message if they are trying to do something that is not supported.

Trying to support 100% is hard, trying to support just 20% turns out to be (not quite) easy. And now you get to nitpick it to death. I won't respond for about a day, since I am just about to board a flight.


Choose a workshop

I am going to give a workshop or two at ALT.Net Austin at the end of October. Those will be free (as in beer) and will be recorded & available on the net afterward. Right now I want to do one on writing DSLs, but I have another which is basically blank at the moment. I have too many subjects that I can talk about, and too many levels at which I can talk about them.

So, this is your chance to help me. If you are going to be there, what would you like to have a workshop about?

And no, a question like NHibernate is not acceptable, it is too broad. Are we talking about NHibernate best practices, high scalability, tips and tricks or advanced usages? I can do a three-hour workshop on any of them.

Suggestions?

Answer: Don't stop with the first DSL abstraction

The problem as it was stated was of rules that looked like this:

upon bounced_check or refused_credit:
	if customer.TotalPurchases > 10000: # preferred
		ask_authorization_for_more_credit
	else:
		call_the_cops

upon new_order:
	if customer.TotalPurchases > 10000: # preferred
		apply_discount 5.percent

upon order_shipped:
	send_marketing_stuff unless customer.RequestedNoSpam

I don't like it, and the reason isn't just that we can introduce IsPreferred.

I don't like it because the abstraction facilities here are poor. We have basically introduced events and business rules, maybe with a sprinkling of a domain model, but nothing really meaningful. Such systems will die under their own weight in any situation of significant complexity (in other words, in all real world situations).

Let us consider the problem in reverse, shall we? We have various conditions and actions upon which we can act. But the logic is scattered all over the place, making it hard to read, modify, understand and work with. When such a system becomes the lifeblood of the business, the business usually adapts, and starts to talk in the terms of the system. However, the business people tend to lose the ability to think about things in a way that would be more meaningful.

I listened today to a business person trying to explain some concept that he wanted to make. It took him several tries to explain the business problem because he was focused on the technical one. The system has a corrupting effect on the business. I call this the Babel Syndrome, the reverse of DDD's ubiquitous language.

Let us see if we can get a high level of meaning out of the above DSL, shall we? First, we restate our problem: instead of dealing with events and conditions for responding to the events, we deal with business responses for scenarios. It doesn't sound like much of a difference, but in actuality, there is a big difference between the two.

The most important of those differences is the change from handling the events to handling a business scenario in a given context. In other words, instead of asking what we should do when a check is bounced, we need to ask a totally different question. "When the customer is preferred, what should the response be for a bounced check?"

This is anything but a minor change in the way we think about the language and how we operate on it. Let us see the DSL script, after which we can discuss how it affects us. These are the contents of the default.boo file:

upon order_shipped:
	send_marketing_stuff unless customer.RequestedNoSpam

upon bounced_check or refused_credit:
	call_the_cops

This will be executed for all orders, like before. Now, let us look at preferred_customer.boo, and what concepts it expresses.

when customer.TotalPurchases > 10000 # preferred

upon new_order:
	apply_discount 5.percent

upon bounced_check or refused_credit:
	ask_authorization_for_more_credit

And now we are getting to see some of the more interesting parts of the difference. We are now talking in terms of a business scenario. When we have a preferred customer, and something happens, how should we respond?

This change is a well known refactoring: conditional to polymorphism. In other words, we just created the strategy pattern with a DSL. The difference here is that the script has an active role in deciding whether it can deal with the scenario or not (in other words, chain of responsibility, and the pattern I am going to mention).

When we need to handle some business scenario, we are going to execute all the scripts, with default.boo being the last one to run. If any of the scripts accepts the scenario as valid and has a specific action to take, it has the option to do so.
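Here is a rough C# model of that dispatch, to show the chain of responsibility at work. It is a sketch under assumptions, not the DSL engine: ScenarioScript, ScenarioRunner and the string based actions are hypothetical stand-ins for compiled scripts, and it assumes that the first applicable script with a handler for the event wins, with the default script acting as the fallback.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Customer
{
    public decimal TotalPurchases { get; set; }
    public bool RequestedNoSpam { get; set; }
}

// Hypothetical stand-in for one compiled script such as preferred_customer.boo.
public sealed class ScenarioScript
{
    public ScenarioScript(string name, Func<Customer, bool> when)
    {
        Name = name;
        When = when;
        Upon = new Dictionary<string, Func<Customer, string>>();
    }

    public string Name { get; private set; }

    // The "when" clause: does this script apply to the current scenario?
    public Func<Customer, bool> When { get; private set; }

    // The "upon" blocks, keyed by event name. Each returns the chosen action.
    public Dictionary<string, Func<Customer, string>> Upon { get; private set; }
}

public static class ScenarioRunner
{
    // Tries the specific scripts first and the default script last.
    // Returns the action name, or null when nothing handles the event.
    public static string Respond(
        string eventName,
        Customer customer,
        IEnumerable<ScenarioScript> specificScripts,
        ScenarioScript defaultScript)
    {
        foreach (var script in specificScripts.Concat(new[] { defaultScript }))
        {
            Func<Customer, string> handler;
            if (script.When(customer) && script.Upon.TryGetValue(eventName, out handler))
                return handler(customer);
        }
        return null;
    }
}
```

A preferred customer with a bounced check would get the response from the preferred script, while an order_shipped event, which the preferred script says nothing about, falls through to the default script.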

Enough about the implementation, let us go back to the concepts. We can now talk to the business people in a way that is far more concise and natural. Instead of having to focus on all permutations of a possible event, we can now talk about a specific scenario and how we handle the business event in that context. Not only is this more readable, it is easier by far to actually define such things as what is the meaning of a preferred customer. I can open the DSL and actually read it.

Similar approaches are very useful when you recognize that the code is asking to be given a more explicit shape than just generic rules. Don't let your DSL be whatever you started with. Find and actively extract higher level meanings whenever it is possible.

A deeper examination of this DSL, and how to build and use it, is likely to make up most of chapter 13, as a real world example of a complex DSL. What do you think?

Given this approach, how would you design an offer management DSL?

Challenge: Don't stop with the first DSL abstraction

I was having a discussion today about the way business rules are implemented. A large part of the discussion was focused on trying to get a specific behavior in a specific circumstance. As usual, I am going to use a totally different example, which might not be as brutal in its focus as the real one.

We have a set of business rules that relate to what is going to happen to a customer in certain situations. For example, we might have the following:

upon bounced_check or refused_credit:
	if customer.TotalPurchases > 10000: # preferred
		ask_authorization_for_more_credit
	else:
		call_the_cops

upon new_order:
	if customer.TotalPurchases > 10000: # preferred
		apply_discount 5.percent

upon order_shipped:
	send_marketing_stuff unless customer.RequestedNoSpam

What is this code crying for? Here is a hint, it is not the introduction of IsPreferred, although that would be welcome.

I am interested in hearing what you will have to say in this matter.

And as a total non sequitur, cockroaches at Starbucks, yuck.

Persistent DSL Caching

This is a note to myself, because I don't have the time for a proper post. When you are dealing with a DSL that contains more than just a few scripts, you really begin to care about compilation times. Even with caching, this can be a problem.

The solution is the same that we have been using for the last three to four decades, don't compile if the source hasn't changed.

The code to make this happen using Rhino DSL is here:

public override CompilerContext Compile(string[] urls)
{
    var outputAssemblyName = OutputAssemblyName(urls);
    if (CanUseCachedVersion(outputAssemblyName, urls))
        return new CompilerContext { GeneratedAssembly = Assembly.Load(File.ReadAllBytes(outputAssemblyName)) };
    return base.Compile(urls);
}

private bool CanUseCachedVersion(string outputAssemblyName, string[] urls)
{
    var asm = new FileInfo(outputAssemblyName);
    if(asm.Exists==false)
        return false;
    foreach (var url in urls)
    {
        if(File.GetLastWriteTime(url) > asm.LastWriteTime)
            return false;
    }
    return true;
}

And in the CustomizeCompiler method:

protected override void CustomizeCompiler(BooCompiler compiler, CompilerPipeline pipeline, string[] urls)
{
    compiler.Parameters.OutputAssembly = OutputAssemblyName(urls);
    // add implicit base class here...
    if (pipeline.Find(typeof(SaveAssembly)) == -1)
        pipeline.Add(new SaveAssembly());
}

It is impossible to overstate how big a difference this can make.

Chapter 11 - Documenting your DSL

I am going to start writing this soon. Any suggestions?

As a bait, I am going to apply the techniques that I intend to describe to Binsor, which should give you incentive to say what you really want from a DSL documentation... :-)

Chapter 10 is done, happy

A few hours ago I finished writing Chapter 10. Looking at the time on the calendar, it didn't take too long, about a month. Looking at it from the point of view of effort involved, I feel about as tired as if it took three years.

Other chapters took longer, but didn't take out quite as much effort. Not sure what this means... but I am happy I am done with it (for a given value of done, I will still need to review and read it at least three dozen times).

A balancing act

Probably one of the hardest challenges that I am facing with writing the book is to know what to say and what to leave unsaid.

Phrasing it another way, it is choosing at what level to talk to the reader. On the one hand, I really want the reader to be able to make immediate use of the concepts that I am talking about, which drives me to do more practical demonstrations, code samples and covering more common situations. On the other hand, those take up a lot of room, and they tend to be boring if you don't need exactly that right this moment.

High level concepts, open ended possibilities and assuming a bit about the reader's knowledge level make for a book that is much more narrowly focused, and I think that it is more valuable. However, it also tends to leave readers unsatisfied, because not everything is explained.

Currently I am writing a UI focused chapter, and to get a good experience from the UI you need to invest a lot of time. Metric tons of it. I am trying to chart the way and show how this can be done, but without getting mired in all the actual minute details.

This is a tough balancing act, and I am not sure if I am succeeding.

Is this valuable?

Going back to the theme of visually editing a DSL, we have something like this:

[image]

The backend to this cutie is pattern recognition with custom UI on top. You have to teach it about each pattern that you use with it, but you can get pretty smart with how it works while using fairly brute force (and simple) techniques.

Thoughts?

Nullifying Null

One of the more annoying problems with building rules that are also code is that you have to deal with code related issues. One of the more common ones is NullReferenceException.

For example, let us say that we have the following rule:

when Order.Amount > 10 and Customer.IsPreferred:
	ApplyDiscount 5.percent

We also support a mode in which a customer can create an order without actually registering on the site (anonymous checkout).

In this scenario, the Customer property is null. We can rewrite the rule to look like this:

when Order.Amount > 10 and Customer is not null and Customer.IsPreferred:
	ApplyDiscount 5.percent

But I think that this is extremely ugly. We can also decide to return a default instance of the customer when it is not there, but here I want to show you another way to handle this. We define the rule as invalid when Customer is not there, so it should not be run. The question is how we can know that.

The dirty way is to do something like this:

var referencesCustomer = File.ReadAllText(ruleName).Contains("Customer");
if(referencesCustomer && Customer == null)
   return;

If you gagged when seeing this code, that is a good sign. Let us solve this properly. First, we want some help from the compiler, so let us inspect the when() meta method that we have seen in the previous post a little closer.

[Meta]
public static ExpressionStatement when(Expression condition, BlockExpression action)
{
	var ctor = new BlockExpression();

	var conditionFunc = new Block();
	conditionFunc.Add(new ReturnStatement(condition));

	ctor.Body.Add(
		new BinaryExpression(
			BinaryOperatorType.Assign,
			new ReferenceExpression("Condition"),
			new BlockExpression(conditionFunc)
			)
		);

	Expression serialize = new CodeSerializer().Serialize(condition);
	Builder.Revisit(serialize);

	ctor.Body.Add(
		new BinaryExpression(
			BinaryOperatorType.Assign,
			new ReferenceExpression("ConditionExpression"),
			serialize
			)
		);

	ctor.Body.Add(
		new BinaryExpression(
			BinaryOperatorType.Assign,
			new ReferenceExpression("Action"),
			action
			)
		);

	return new ExpressionStatement(
		new MethodInvocationExpression(
			ctor
			));
}

We take the call to the when method and transform it to the following code:

delegate
{
	Condition = () => Order.Amount > 10 && Customer.IsPreferred;
	ConditionExpression = (Expression<Func<bool>>)() => Order.Amount > 10 && Customer.IsPreferred;
	Action = delegate
	{
		// not interesting for this post
	};
}();

I am translating to C# 3.0 here in order to make it easier to grasp the concept. The real code is in Boo, of course, and is more interesting. The most fascinating concept here is the use of CodeSerializer, which will turn the condition that we passed into an AST that we can access at runtime. I tried to simulate that by doing an explicit cast to an expression tree, which would give a similar result in C#.

Having the AST of the code at runtime, even if we don't want to change it (a totally different concept), is incredibly powerful. In this case, we are going to use it to detect when the rule references a null property and mark the rule as invalid.

Here is the code:

public void Evaluate()
{
	var references = new List<string>();
	new InlineVisitor
	{
		OnReferenceExpression = r => references.Add(r.Name)
	}.Visit(ConditionExpression);
	if(references.Contains("Customer") && Customer == null)
		return;// rule invalid
	if(Condition())
		Action();
}

This is a very simple example of how you can add smarts to the way that your code behaves. This technique is the foundation for a whole host of options. I am using similar approaches for adaptive rules and for auditable actions. Fun stuff, if I say so myself.
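For readers who want to try the idea without Boo, here is a minimal sketch of the same check in plain C#. It assumes the condition is available as an Expression<Func<bool>> and that System.Linq.Expressions.ExpressionVisitor is public (true for .NET 4.0 and later). The Rule, Order, Customer, MemberNameCollector and RuleDemo types are hypothetical stand-ins for illustration, not the actual rule engine code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public class Customer { public bool IsPreferred { get; set; } }
public class Order { public int Amount { get; set; } }

// Collects the name of every property or field the expression touches.
public class MemberNameCollector : ExpressionVisitor
{
    public readonly HashSet<string> Names = new HashSet<string>();

    protected override Expression VisitMember(MemberExpression node)
    {
        Names.Add(node.Member.Name);
        return base.VisitMember(node);
    }
}

// Hypothetical rule type, only here to show the null check.
public class Rule
{
    public Order Order { get; set; }
    public Customer Customer { get; set; }
    public Expression<Func<bool>> ConditionExpression { get; set; }
    public Action Action { get; set; }

    // Returns false when the rule was skipped or its condition was not met.
    public bool Evaluate()
    {
        var collector = new MemberNameCollector();
        collector.Visit(ConditionExpression);
        if (collector.Names.Contains("Customer") && Customer == null)
            return false; // rule is invalid without a customer, so do not run it
        if (!ConditionExpression.Compile()())
            return false;
        Action();
        return true;
    }
}

public static class RuleDemo
{
    public static bool RunAnonymousCheckout()
    {
        var rule = new Rule
        {
            Order = new Order { Amount = 20 },
            Customer = null, // anonymous checkout
            Action = () => Console.WriteLine("discount applied")
        };
        // The condition dereferences Customer, which is null here.
        rule.ConditionExpression =
            () => rule.Order.Amount > 10 && rule.Customer.IsPreferred;
        return rule.Evaluate();
    }
}
```

Without the reference check, compiling and invoking the condition in RunAnonymousCheckout would throw a NullReferenceException. With it, the rule is quietly skipped. Compiling the expression on every call is wasteful, so a real implementation would probably cache the compiled delegate.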

How to execute a set of statements in an expression

One of the most common problems with using Boo's Meta Methods is that they can only return expressions. This is not good if what you want to return from the meta method is a set of statements to be executed.

The most common reason for that is to initialize a set of variables. Obviously, you can call a method that will do it for you, but there is a simpler way.

Method invocation is an expression, and anonymous delegate definition is also an expression. What does this tell you? That this is an expression as well:

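// the whole thing is a single expression.
// (C# needs the cast to a delegate type before the anonymous method can be invoked)
((Action<int>)delegate (int x)
{
	Console.WriteLine(x);
	Console.WriteLine("another statement");
})(5);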

You generally won't use something like that in real code, but when you are working with the AST directly, expressions vs. statements require a wholly different world view.

Anyway, let us see how we can implement this using Boo.

[Meta]
public static ExpressionStatement when(Expression condition, BlockExpression action)
{
	var func = new BlockExpression();

	var conditionFunc = new Block();
	conditionFunc.Add(new ReturnStatement(condition));

	func.Body.Add(
		new BinaryExpression(
			BinaryOperatorType.Assign,
			new ReferenceExpression("Condition"),
			new BlockExpression(conditionFunc)
			)
		);

	Expression serialize = new CodeSerializer().Serialize(condition);
	RuleBuilder.Revisit(serialize);

	func.Body.Add(
		new BinaryExpression(
			BinaryOperatorType.Assign,
			new ReferenceExpression("ConditionExpression"),
			serialize
			)
		);

	func.Body.Add(
		new BinaryExpression(
			BinaryOperatorType.Assign,
			new ReferenceExpression("Action"),
			action
			)
		);

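	// wrap the invocation so it matches the declared ExpressionStatement return type
	return new ExpressionStatement(new MethodInvocationExpression(func));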
}

This trick lets you use the meta method to return several statements, which allows you to make several property assignments (something that you generally cannot do in a single expression). I'll go over the actual meaning of the code (rather than the mechanics) in a future post.

Inline Anonymous Visitors

One of the most common chores when working with compilers is the need to create special visitors. I mean, I just need to get a list of all the variables in the code, but I need to create a visitor class and execute it in order to get the information out. This is not hard, the code is something like this:

public class ReferenceVisitor : DepthFirstVisitor
{
	public List<string> References = new List<string>();

	// called by the base visitor for every reference in the AST
	public override void OnReferenceExpression(ReferenceExpression re)
	{
		References.Add(re.Name);
	}
}

public bool IsCallingEmployeeProperty(Expression condition)
{
	var visitor = new ReferenceVisitor();
	visitor.Visit(condition);
	return visitor.References.Contains("Employee");
}

Doing this is just annoying. Especially when you have to create several of those, and they make no sense outside of their call site. In many ways, they are to compilers what event handlers are to UI components.

What would happen if we could create a special visitor inline, without going through the "create whole new type" crap? I think that this would be as valuable as anonymous delegates and lambdas turned out to be. With that in mind, let us see if I can make this work, shall we?

public bool IsCallingEmployeeProperty(Expression condition)
{
	var references = new List<string>();
	new InlineVisitor
	{
		OnReferenceExpression = re => references.Add(re.Name)
	}.Visit(condition);
	return references.Contains("Employee"); 
}

Especially in moderately complex scenarios, such a thing is extremely useful.
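To show that there is no magic involved, here is a rough sketch of how such an inline visitor could be built. This is not the Boo InlineVisitor itself. It is a hypothetical analog on top of System.Linq.Expressions.ExpressionVisitor (assuming .NET 4.0 or later), and the InlineExpressionVisitor, Context, Employee and VisitorDemo names are made up for the example. The idea is simply one optional callback property per node type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical inline visitor: each node type gets an optional callback,
// so callers can plug in behavior with an object initializer.
public class InlineExpressionVisitor : ExpressionVisitor
{
    public Action<MemberExpression> OnMember { get; set; }
    public Action<MethodCallExpression> OnMethodCall { get; set; }

    protected override Expression VisitMember(MemberExpression node)
    {
        if (OnMember != null) OnMember(node);
        return base.VisitMember(node); // keep walking the child nodes
    }

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        if (OnMethodCall != null) OnMethodCall(node);
        return base.VisitMethodCall(node);
    }
}

public class Employee { public string Name { get; set; } }
public class Context { public Employee Employee { get; set; } }

public static class VisitorDemo
{
    public static bool IsCallingEmployeeProperty(Expression condition)
    {
        var references = new List<string>();
        new InlineExpressionVisitor
        {
            OnMember = m => references.Add(m.Member.Name)
        }.Visit(condition);
        return references.Contains("Employee");
    }

    public static bool Example()
    {
        var ctx = new Context();
        // Only the tree is inspected; the lambda is never executed.
        Expression<Func<bool>> condition = () => ctx.Employee.Name == "ayende";
        return IsCallingEmployeeProperty(condition);
    }
}
```

The callbacks close over local variables at the call site, which is exactly what makes this feel like anonymous delegates for visitors. Supporting more node types is a matter of adding another property and another override.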

The implications of Google Chrome

So, Google is coming out with a new browser, while at the same time they are also responsible for a large part of both Firefox and Opera's budgets.

I think that this is a very interesting development. In particular, because Google thinks about the browser as a complementary offer to what it does, or as a baseline platform, not as the actual end result.

This gets really interesting when you think that in this scenario, Google can leverage what are effectively Killer Applications in order to migrate people from one browser to another. If YouTube worked better with Chrome than with IE, I think it would be a very powerful motivator to move. And Google has dozens of such high value assets.

I am pretty sure that Google will produce plugins for anything new it creates, so other browsers can work with it as well (to do otherwise is to risk monopoly charges), but if used properly, it will allow Google to leverage its own power to produce its own de facto standards, which browsers will have to follow.

The focus on creating a browser which is focused primarily on allowing application development and hosting (vs. mere browsing) is likely to aid in setting the new baseline standard of what a browser platform should provide for the web applications that are hosted in it.

From my point of view, I think that this is going to allow Google to take a much more active role in shaping the environment in which their applications are living. From that perspective, it seems like a very natural step for them.

Review: Hibernate Search in Action

I just finished reading Hibernate Search in Action, and I loved it. I should point out that I was the porter of Hibernate Search to NHibernate Search, so I had some previous expertise in the topic. In addition to that, I approached this book at an angle completely orthogonal to the expected audience. Unlike most "in Action" books, I did not intend to make immediate use of the code and approaches suggested in the book. Instead, I looked to the book as a way to deepen my understanding of the tool and how it works.

I am impressed, massively so, that it did so well in this regard for someone who has gone through the entire source code of the project several times.

I'll not bore you with the actual details; you can get the content summary off the site. From my perspective, after reading this book I know that I am going to take a completely different approach for most complex search scenarios, and I think that I have the practical and theoretical knowledge to deal with it.

I highly recommend the book if you actually need to deal with Hibernate Search, but I would also recommend it to people who are not using it, because it contains some important, eye-opening concepts if you are not used to the capabilities of full text search tools. As a nice bonus, I was able to take the information in the book and use it to discuss a problem a customer was having, ending up with something that I consider far superior to the solution that they currently employ.

It is not out yet, and I reviewed a non final copy, but you can order the PDF right now, and just reading the freely available first chapter is valuable in itself.