Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 2 min | 396 words

Jeremy Miller is looking for a DSL that reads like natural language. My immediate response was that it is not practical, because I assumed he wanted very natural language, which is still not possible without an extremely high budget. Limiting the problem to just "reads like a natural language" reduces the problem space significantly.

I am going to have a separate post about how to actually solve such a problem, but for now, I want to talk about the actual requested solution. I think it is 100% solvable with a low cost approach. That is, you can get a DSL that reads like English in under an hour. But I don't think it is valuable.

English is a terrible language to express instructions in. Any natural language is terrible at expressing instructions; just find the nearest army sergeant, and they will tell you that.

Let us take a look at a language that actually took this approach:

tell application "Finder"
	set the percent_free to ¬
		(((the free space of the startup disk) / (the capacity of the startup disk)) * 100) div 1
end tell
if the percent_free is less than 10 then
	tell application (path to frontmost application as text)
		display dialog "The startup disk has only " & the percent_free & ¬
			" percent of its capacity available." & return & return & ¬
			"Should this script continue?" with icon 1
	end tell
end if

This is AppleScript. From my point of view, this is horrible. It is unreadable in the extreme. More than that, trying to explain how this language works, or how it handles errors, is a nontrivial task.

This has been my experience every time I actually tried to create a natural-language-like syntax. It is too complex, and users get annoyed when they can't use real natural language.

From my perspective, getting an expressive DSL does not mean that it has to read like an English statement. In fact, it probably shouldn't. There is too much noise involved. A structured approach isn't just there to help the compiler, but to help the reader.

time to read 5 min | 801 words

The problem as it was stated was of rules that looked like this:

upon bounced_check or refused_credit:
	if customer.TotalPurchases > 10000: # preferred
		ask_authorization_for_more_credit
	else:
		call_the cops 

upon new_order:
	if customer.TotalPurchases > 10000: # preferred
		apply_discount 5.precent 

upon order_shipped:
	send_marketing_stuff unless customer.RequestedNoSpam 

I don't like it, and the reason isn't just that we can introduce IsPreferred.

I don't like it because the abstraction facilities here are poor. We have basically introduced events and business rules, maybe with a sprinkling of a domain model, but nothing really meaningful. Such systems will die under their own weight in any situation of significant complexity (in other words, in all real world situations).

Let us consider the problem in reverse, shall we? We have various conditions and actions upon which we can act. But the logic is scattered all over the place, making it hard to read, modify, understand and work with. When such a system makes up the lifeblood of the business, the business usually adapts, and starts to talk in the terms of the system. However, the business people tend to lose the ability to think about things in a way that would be more meaningful.

I listened today to a business person trying to explain a concept that he wanted to implement. It took him several tries to explain the business problem because he was focused on the technical one. The system has a corrupting effect on the way its users think. I call this the Babel Syndrome, the reverse of DDD's ubiquitous language.

Let us see if we can get a high level of meaning out of the above DSL, shall we? First, we restate our problem: instead of dealing with events and conditions for responding to the events, we deal with business responses for scenarios. It doesn't sound like much of a difference, but in actuality, there is a big difference between the two.

The most important of those differences is the change from handling the events to handling a business scenario in a given context. In other words, instead of asking what we should do when a check is bounced, we need to ask a totally different question: "When the customer is preferred, what should the response be for a bounced check?"

This is anything but a minor change in the way we think about the language and how we operate on it. Let us see the DSL script, after which we can discuss how it affects us. These are the contents of the default.boo file:

upon order_shipped:
	send_marketing_stuff unless customer.RequestedNoSpam

upon bounced_check or refused_credit:
	call_the cops

This will be executed for all orders, like before. Now, let us look at preferred_customer.boo, and what concepts it expresses.

when customer.TotalPurchases > 10000 # preferred

upon new_order:
	apply_discount 5.precent

upon bounced_check or refused_credit:
	ask_authorization_for_more_credit

And now we are getting to see some of the more interesting parts of the difference. We are now talking in terms of a business scenario. When we have a preferred customer, and something happens, how should we respond?

This change is a well known refactoring: conditional to polymorphism. In other words, we just created the strategy pattern with a DSL. The difference here is that the script has an active role in deciding whether it can deal with the scenario or not (in other words, chain of responsibility, and the pattern I am going to mention).

When we need to handle some business scenario, we are going to execute all the scripts, with default.boo being the last one to run. If any of the scripts accepts the scenario as valid and has a specific action to take, it has the option to do so.
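The dispatch described above can be sketched roughly as follows. This is a minimal illustration in Python rather than Boo, and everything in it is an assumption made for the example: the `Script` class, the `respond` function, and the first-match-wins rule are hypothetical stand-ins, not the actual implementation behind the .boo files.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Customer:
    total_purchases: int
    requested_no_spam: bool = False


Handler = Callable[[Customer], Optional[str]]


@dataclass
class Script:
    """Hypothetical in-memory stand-in for one compiled .boo script."""
    name: str
    applies: Callable[[Customer], bool]  # plays the role of the `when` guard
    handlers: Dict[str, Handler] = field(default_factory=dict)  # the `upon` blocks


def respond(scripts: List[Script], event: str, customer: Customer) -> Optional[str]:
    """Return the action of the first script that accepts the scenario.

    Assumption: the first match wins, so the default script must come last.
    """
    for script in scripts:
        handler = script.handlers.get(event)
        if handler is not None and script.applies(customer):
            return handler(customer)
    return None  # no script had anything to say about this event


preferred = Script(
    name="preferred_customer",
    applies=lambda c: c.total_purchases > 10000,
    handlers={
        "new_order": lambda c: "apply_discount 5.precent",
        "bounced_check": lambda c: "ask_authorization_for_more_credit",
        "refused_credit": lambda c: "ask_authorization_for_more_credit",
    },
)

default = Script(
    name="default",
    applies=lambda c: True,  # the default script accepts every scenario
    handlers={
        "order_shipped": lambda c: None if c.requested_no_spam else "send_marketing_stuff",
        "bounced_check": lambda c: "call_the cops",
        "refused_credit": lambda c: "call_the cops",
    },
)

SCRIPTS = [preferred, default]  # default goes last

print(respond(SCRIPTS, "bounced_check", Customer(total_purchases=20000)))
print(respond(SCRIPTS, "bounced_check", Customer(total_purchases=50)))
```

Adding a new scenario in this sketch means adding one more `Script` ahead of the default, without touching the existing ones.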

Enough about the implementation, let us go back to the concepts. We can now talk to the business people in a way that is far more concise and natural. Instead of having to focus on all permutations of a possible event, we can now talk about a specific scenario and how we handle the business event in that context. Not only is this more readable, it is easier by far to actually define such things as what a preferred customer means. I can open the DSL and actually read it.

Similar approaches are very useful when you recognize that the code is asking to be given a more explicit shape than just generic rules. Don't let your DSL be whatever you started with. Find and actively extract higher level meanings whenever it is possible.

A deeper examination of this DSL, and of how to build and use it, is likely to make up most of chapter 13, as a real world example of a complex DSL. What do you think?

Given this approach, how would you design an offer management DSL?

time to read 1 min | 174 words

I was having a discussion today about the way business rules are implemented. A large part of the discussion was focused on trying to get a specific behavior in a specific circumstance. As usual, I am going to use a totally different example, which might not be as brutal in its focus as the real one.

We have a set of business rules that relate to what is going to happen to a customer in certain situations. For example, we might have the following:

upon bounced_check or refused_credit:
	if customer.TotalPurchases > 10000: # preferred
		ask_authorization_for_more_credit
	else:
		call_the cops

upon new_order:
	if customer.TotalPurchases > 10000: # preferred
		apply_discount 5.precent

upon order_shipped:
	send_marketing_stuff unless customer.RequestedNoSpam

What is this code crying out for? Here is a hint: it is not the introduction of IsPreferred, although that would be welcome.

I am interested in hearing what you have to say on this matter.

And as a total non sequitur, cockroaches at Starbucks, yuck.

time to read 2 min | 317 words

This is a note to myself, because I don't have the time for a proper post. When you are dealing with a DSL that contains more than just a few scripts, you really begin to care about compilation times. Even with caching, this can be a problem.

The solution is the same one that we have been using for the last three to four decades: don't compile if the source hasn't changed.

The code to make this happen using Rhino DSL is here:

public override CompilerContext Compile(string[] urls)
{
    var outputAssemblyName = OutputAssemblyName(urls);
    if (CanUseCachedVersion(outputAssemblyName, urls))
        return new CompilerContext { GeneratedAssembly = Assembly.Load(File.ReadAllBytes(outputAssemblyName)) };
    return base.Compile(urls);
}

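// The cached assembly is usable only if it exists and is at least as new as every source script.
private bool CanUseCachedVersion(string outputAssemblyName, string[] urls)
{
    var asm = new FileInfo(outputAssemblyName);
    if (asm.Exists == false)
        return false;
    foreach (var url in urls)
    {
        // any script modified after the assembly was written invalidates the cache
        if (File.GetLastWriteTime(url) > asm.LastWriteTime)
            return false;
    }
    return true;
}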

And in the CustomizeCompiler method:

protected override void CustomizeCompiler(BooCompiler compiler, CompilerPipeline pipeline, string[] urls)
{
    compiler.Parameters.OutputAssembly = OutputAssemblyName(urls);
    // add implicit base class here...
    if (pipeline.Find(typeof(SaveAssembly)) == -1)
        pipeline.Add(new SaveAssembly());
}

It is impossible to overstate how big a difference this can make.

time to read 1 min | 93 words

A few hours ago I finished writing Chapter 10. Looking at the time on the calendar, it didn't take too long, about a month. Looking at it from the point of view of effort involved, I feel about as tired as if it took three years.

Other chapters took longer, but didn't take quite as much effort. Not sure what this means... but I am happy I am done with it (for a given value of done, I will still need to review and read it at least three dozen times).

A balancing act

time to read 2 min | 251 words

Probably one of the hardest challenges that I am facing with writing the book is to know what to say and what to leave unsaid.

Phrasing it another way, it is choosing at what level to talk to the reader. On the one hand, I really want the reader to be able to make immediate use of the concepts that I am talking about, which drives me to do more practical demonstrations, code samples and coverage of more common situations. On the other hand, those take up a lot of room, and they tend to be boring if they aren't exactly what you need right this moment.

High level concepts, open ended possibilities and assuming a bit about the reader's knowledge level make for a book that is much more narrowly focused, and I think that it is more valuable. However, it also tends to leave readers unsatisfied, because not everything is explained.

Currently I am writing a UI focused chapter, and to get a good experience from the UI you need to invest a lot of time. Metric tons of it. I am trying to chart the way and show how this can be done, but without getting mired in all the actual minute details.

This is a tough balancing act, and I am not sure if I am succeeding.

time to read 1 min | 81 words

Going back to the theme of visually editing a DSL, we have something like this:

image

The backend to this cutie is pattern recognition with custom UI on top. You have to teach it about each pattern that you use with it, but you can get pretty smart with how it works while using fairly brute force (and simple) techniques.

Thoughts?

Nullifying Null

time to read 3 min | 597 words

One of the more annoying problems with building rules that are also code is that you have to deal with code related issues. One of the more common ones is NullReferenceException.

For example, let us say that we have the following rule:

when Order.Amount > 10 and Customer.IsPreferred:
      ApplyDiscount 5.precent 

We also support a mode in which a customer can create an order without actually registering on the site (anonymous checkout).

In this scenario, the Customer property is null. We can rewrite the rule to look like this:

when Order.Amount > 10 and Customer is not null and Customer.IsPreferred:
      ApplyDiscount 5.precent

But I think that this is extremely ugly. We can also decide to return a default instance of the customer when it is not there, but here I want to show you another way to handle this. We define the rule as invalid when Customer is not there, so it should not be run. The question is how we can know that.

The dirty way is to do something like this:

var referencesCustomer = File.ReadAllText(ruleName).Contains("Customer");
if(referencesCustomer && Customer == null)
   return;

If you gagged when seeing this code, that is a good sign. Let us solve this properly. First, we want some help from the compiler, so let us inspect the when() meta method that we have seen in the previous post a little closer.

[Meta]
public static ExpressionStatement when(Expression condition, BlockExpression action)
{
	var ctor = new BlockExpression();

	var conditionFunc = new Block();
	conditionFunc.Add(new ReturnStatement(condition));

	ctor.Body.Add(
		new BinaryExpression(
			BinaryOperatorType.Assign,
			new ReferenceExpression("Condition"),
			new BlockExpression(conditionFunc)
			)
		);

	Expression serialize = new CodeSerializer().Serialize(condition);
	Builder.Revisit(serialize);

	ctor.Body.Add(
		new BinaryExpression(
			BinaryOperatorType.Assign,
			new ReferenceExpression("ConditionExpression"),
			serialize
			)
		);

	ctor.Body.Add(
		new BinaryExpression(
			BinaryOperatorType.Assign,
			new ReferenceExpression("Action"),
			action
			)
		);

	return new ExpressionStatement(
		new MethodInvocationExpression(
			ctor
			));
}

We take the call to the when method and transform it to the following code:

delegate
{
	Condition = () => Order.Amount > 10 && Customer.IsPreferred;
	ConditionExpression = (Expression<Func<bool>>)() => Order.Amount > 10 && Customer.IsPreferred;
	Action = delegate
	{
		// not interesting for this post
	};
}();

I am translating to C# 3.0 here in order to make it easier to grasp the concept. The real code is in Boo, of course, and is more interesting. The most fascinating concept here is the use of CodeSerializer, which will turn the condition that we passed into an AST that we can access at runtime. (I tried to simulate that by doing an explicit cast to an expression tree, which would give a similar result in C#.)

Having the AST of the code at runtime, even if we don't want to change it (a totally different concept), is incredibly powerful. In this case, we are going to use this to detect when we are referencing a null property and mark the rule as invalid.

Here is the code:

public void Evaluate()
{
	var references = new List<string>();
	new InlineVisitor
	{
		OnReferenceExpression = r => references.Add(r.Name)
	}.Visit(ConditionExpression);
	if(references.Contains("Customer") && Customer == null)
		return;// rule invalid
	if(Condition())
		Action();
}

This is a very simple example of how you can add smarts to the way that your code behaves. This technique is the foundation for a whole host of options. I am using similar approaches for adaptive rules and for auditable actions. Fun stuff, if I say so myself.
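The same idea can be tried out without Boo. The sketch below is a rough Python analogue, not the real implementation: the `Rule` class and `referenced_names` function are hypothetical, Python's standard `ast` module stands in for the serialized Boo AST, and, as a generalization that the code above does not make, the rule is treated as invalid when any referenced name is missing or None rather than only Customer.

```python
import ast
from types import SimpleNamespace
from typing import Any, Callable, Dict, Set


def referenced_names(condition_source: str) -> Set[str]:
    """Collect every bare name that the condition expression refers to."""
    tree = ast.parse(condition_source, mode="eval")
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}


class Rule:
    """Hypothetical rule: a condition kept as both source (for inspection) and code."""

    def __init__(self, condition_source: str, action: Callable[[Dict[str, Any]], None]):
        self.references = referenced_names(condition_source)
        self._code = compile(condition_source, "<rule>", "eval")
        self.action = action

    def evaluate(self, context: Dict[str, Any]) -> bool:
        # The rule is invalid (and silently skipped) when something it refers to is absent.
        if any(context.get(name) is None for name in self.references):
            return False
        if eval(self._code, {"__builtins__": {}}, context):
            self.action(context)
            return True
        return False


discounts = []
rule = Rule(
    "Order.Amount > 10 and Customer.IsPreferred",
    lambda ctx: discounts.append(ctx["Order"].Amount),
)

# Anonymous checkout: Customer is None, so the rule is skipped instead of failing.
anonymous = {"Order": SimpleNamespace(Amount=50), "Customer": None}
registered = {"Order": SimpleNamespace(Amount=50), "Customer": SimpleNamespace(IsPreferred=True)}

print(rule.evaluate(anonymous))
print(rule.evaluate(registered))
```

The rule text itself never mentions null; the check lives in one place, in the code that evaluates rules.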

time to read 2 min | 272 words

One of the most common problems with using Boo's Meta Methods is that they can only return expressions. This is not good if what you want to return from the meta method is a set of statements to be executed.

The most common reason for that is to initialize a set of variables. Obviously, you can call a method that will do it for you, but there is a simpler way.

Method invocation is an expression, and anonymous delegate definition is also an expression. What does this tell you? That this is an expression as well:

// the whole thing is a single expression.
delegate (int x)
{
	Console.WriteLine(x);
	Console.WriteLine("another statement");
}(5);

You generally won't use something like that in real code, but when you are working with the AST directly, expressions vs. statements require a wholly different world view.

Anyway, let us see how we can implement this using Boo.

[Meta]
public static Expression when(Expression condition, BlockExpression action)
{
	var func = new BlockExpression();

	var conditionFunc = new Block();
	conditionFunc.Add(new ReturnStatement(condition));

	func.Body.Add(
		new BinaryExpression(
			BinaryOperatorType.Assign,
			new ReferenceExpression("Condition"),
			new BlockExpression(conditionFunc)
			)
		);

	Expression serialize = new CodeSerializer().Serialize(condition);
	RuleBuilder.Revisit(serialize);

	func.Body.Add(
		new BinaryExpression(
			BinaryOperatorType.Assign,
			new ReferenceExpression("ConditionExpression"),
			serialize
			)
		);

	func.Body.Add(
		new BinaryExpression(
			BinaryOperatorType.Assign,
			new ReferenceExpression("Action"),
			action
			)
		);

	return new MethodInvocationExpression(func);
}

This trick lets you use the meta method to return several statements, which allows you to do several property assignments (something that you generally cannot do in a single expression). I'll go over the actual meaning of the code (rather than the mechanics) in a future post.
