Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 2 min | 291 words

One of the most common chores when working with compilers is the need to create special visitors. I mean, I just need to get a list of all the variables in the code, but I need to create a visitor class and execute it in order to get the information out. This is not hard, the code is something like this:

public class ReferenceVisitor : DepthFirstVisitor
{
    public List<string> References = new List<string>();

    // Must override the base method; otherwise the visitor never calls it.
    public override void OnReferenceExpression(ReferenceExpression re)
    {
        References.Add(re.Name);
    }
}
public bool IsCallingEmployeeProperty(Expression condition)
{ 
    var visitor = new ReferenceVisitor();
    visitor.Visit(condition);
    return visitor.References.Contains("Employee"); 
}

Doing this is just annoying. Especially when you have to create several of those, and they make no sense outside of their call site. In many ways, they are to compilers what event handlers are to UI components.

What would happen if we could create a special visitor inline, without going through the "create whole new type" crap? I think that this would be as valuable as anonymous delegates and lambdas turned out to be. With that in mind, let us see if I can make this work, shall we?

public bool IsCallingEmployeeProperty(Expression condition)
{
	var references = new List<string>();
	new InlineVisitor
	{
		OnReferenceExpression = re => references.Add(re.Name)
	}.Visit(condition);
	return references.Contains("Employee"); 
}

Especially in moderately complex scenarios, such a thing would be extremely useful.
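Here is a minimal sketch of how such an InlineVisitor could be put together. It is not the actual implementation: Node, ReferenceExpression, BinaryExpression and DepthFirstVisitor below are tiny stand-ins for the Boo AST types so that the sample compiles on its own, and InlineVisitorBase, HandleReference and Rules are names I made up for illustration. The one trick is that a property cannot share a name with a method in the same class, so an intermediate class overrides the visitor method and the inline class hides it with a settable delegate.

```csharp
using System;
using System.Collections.Generic;

// Stand-ins (assumptions) for the Boo AST types.
public abstract class Node
{
    public abstract void Accept(DepthFirstVisitor visitor);
}

public class ReferenceExpression : Node
{
    public string Name { get; private set; }
    public ReferenceExpression(string name) { Name = name; }
    public override void Accept(DepthFirstVisitor visitor) { visitor.OnReferenceExpression(this); }
}

public class BinaryExpression : Node
{
    public Node Left { get; private set; }
    public Node Right { get; private set; }
    public BinaryExpression(Node left, Node right) { Left = left; Right = right; }
    public override void Accept(DepthFirstVisitor visitor) { visitor.OnBinaryExpression(this); }
}

public class DepthFirstVisitor
{
    public void Visit(Node node) { if (node != null) node.Accept(this); }
    public virtual void OnReferenceExpression(ReferenceExpression node) { }
    public virtual void OnBinaryExpression(BinaryExpression node)
    {
        Visit(node.Left);
        Visit(node.Right);
    }
}

// Bridges the virtual visitor method to a hook that the inline class fills in.
public abstract class InlineVisitorBase : DepthFirstVisitor
{
    protected abstract void HandleReference(ReferenceExpression node);
    public sealed override void OnReferenceExpression(ReferenceExpression node)
    {
        HandleReference(node);
    }
}

// Hides the inherited method with a delegate property of the same name,
// which is what makes the object initializer syntax possible.
public class InlineVisitor : InlineVisitorBase
{
    public new Action<ReferenceExpression> OnReferenceExpression { get; set; }

    protected override void HandleReference(ReferenceExpression node)
    {
        if (OnReferenceExpression != null)
            OnReferenceExpression(node);
    }
}

public static class Rules
{
    public static bool IsCallingEmployeeProperty(Node condition)
    {
        var references = new List<string>();
        new InlineVisitor
        {
            OnReferenceExpression = re => references.Add(re.Name)
        }.Visit(condition);
        return references.Contains("Employee");
    }
}
```

A real version would need one such property per node type, which is the kind of thing you would generate rather than write by hand.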

time to read 1 min | 130 words

I am giving a lot of thought to this chapter, because I want to be able to throw as many best and worst practices as I can at the reader. Here is what I have right now:

  1. Auditable DSL - Dealing with the large scale - what the hell is going on?
  2. User extensible languages
  3. Multi lingual languages
  4. Multi file languages
  5. Data as a first class concept
  6. Code == Data == Code
  7. Strategies for editing DSL in production
  8. Code data mining
  9. DSL dialects

I am still looking for the tenth piece...

DSL Dialects

time to read 1 min | 168 words

Let us take this fancy DSL:

image

And let us say that we want to give the user some sort of UI that shows how this DSL works. The implementation of this DSL isn't really friendly for the UI. It was built for execution, not for display.

So how are we going to solve the problem?  There are a couple of ways of doing that, but the easiest solution that I know of consists of creating a new language implementation that is focused on providing an easy to build UI. A dialect can be either a different language (or version of the language) that maps to the same backend engine, or it can be a different engine that is mapped to the same language.

This is part of the reason that it is so important to create strict separation between the language and the engine.

Code Data Mining

time to read 2 min | 298 words

I just wrote this piece of code:

class ExpressionInserterVisitor : DepthFirstVisitor
{
    public override bool Visit(Node node)
    {
        using(var con = new SqlConnection("data source=localhost;Initial Catalog=Test;Trusted_Connection=yes"))
        using (var command = con.CreateCommand())
        {
            con.Open();
            command.CommandText = "INSERT INTO Expressions (Expression) VALUES(@P1)";
            command.Parameters.AddWithValue("@P1", node.ToString());
            command.ExecuteNonQuery();
        }
        Console.WriteLine(node);
        return base.Visit(node);
    }
}

As you can imagine, this is disposable code, but why did I write that?

I ran this code on the entire DSL code base that I have, and then started applying metrics to the results. In particular, I was interested in finding repeated concepts that have not been codified.

For example, if this had shown 7 uses of:

user.IsPreferred and order.Total > 500 and (order.PaymentMethod is Cash or not user.IsHighRisk)

Then this is a good indication that I have a business concept waiting to be discovered here, and I turn that into a part of my language:

IsGoodDealForVendor (or something like that)

Here we aren't interested in the usual code quality metrics, we are interested in business quality metrics :-) And the results were, to say the least, impressive.
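The metric itself can be very simple. Below is a rough sketch of the "find repeated expressions" step, done in memory with LINQ instead of against the Expressions table (a GROUP BY query would do the same job). ExpressionMining, FindRepeated and the minUses threshold are names I picked for illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExpressionMining
{
    // Returns every expression that shows up at least minUses times,
    // most frequent first, together with its number of uses.
    public static List<KeyValuePair<string, int>> FindRepeated(
        IEnumerable<string> expressions, int minUses)
    {
        return expressions
            .Select(e => e.Trim())                  // ignore incidental whitespace
            .GroupBy(e => e)                        // identical text == same concept
            .Where(g => g.Count() >= minUses)
            .OrderByDescending(g => g.Count())
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .ToList();
    }
}
```

Feeding it the node.ToString() values that the visitor collected and asking for anything used, say, five times or more gives you the list of candidates for new language concepts.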

time to read 5 min | 853 words

The last time we looked at this issue, we built all the pieces that were required except for the most important one: actually handling the contextual menu. I am going to be as open as possible: Intellisense is not a trivial task. Nevertheless, we can get pretty good results without investing too much time if we want to. As a reminder, here is the method that is actually responsible for the magic that is about to happen:

public ICompletionData[] GenerateCompletionData(string fileName, TextArea textArea, char charTyped)
{
        return new ICompletionData[] {
             new DefaultCompletionData("Text", "Description", 0),
             new DefaultCompletionData("Text2", "Description2", 1)
        };
}

Not terribly impressive yet, I know, but let us see what we can figure out now. First, we need to find the expression that the caret is currently on. That will give us the information that we need to make a decision. We could try to parse the text ourselves, or use the existing Boo Parser. However, the Boo Parser isn't really suitable for the kind of precise UI work that we need here. There are various incompatibilities along the way (from the way it handles tabs to the nesting of expressions). None of them is a blocker, and using the Boo Parser is likely the way to go for the more advanced scenarios.

Reusing the #Develop parser gives us all the information we need, and we don't need to define things twice. Because we are going to work on a simple language, this is actually the simplest solution. Let us see what is involved in this.

public ICompletionData[] GenerateCompletionData(string fileName, TextArea textArea, char charTyped)
{
    TextWord prevNonWhitespaceTerm = FindPreviousMethod(textArea);
    if(prevNonWhitespaceTerm==null)
        return EmptySuggestion(textArea.Caret);

    var name = prevNonWhitespaceTerm.Word;
    if (name == "specification" || name == "requires" || name == "same_machine_as" || name == "@")
    {
        return ModulesSuggestions();
    }
    int temp;
    if (name == "users_per_machine" || int.TryParse(name, out temp))
    {
        return NumbersSuggestions();
    }
    return EmptySuggestion(textArea.Caret);
}

private TextWord FindPreviousMethod(TextArea textArea)
{
    var lineSegment = textArea.Document.GetLineSegment(textArea.Caret.Line);
    var currentWord = lineSegment.GetWord(textArea.Caret.Column);
    if (currentWord == null && lineSegment.Words.Count > 1)
        currentWord = lineSegment.Words[lineSegment.Words.Count - 1];
    // we actually want the previous word, not the current one, in order to make decisions on it.
    var currentIndex = lineSegment.Words.IndexOf(currentWord);
    if (currentIndex == -1)
        return null;

    return lineSegment.Words.GetRange(0, currentIndex).FindLast(word => word.Word.Trim() != "") ;
}

Again, allow me to reiterate that this is a fairly primitive solution, but it is a good one for our current needs. I am not going to go over all the suggestion methods, but here is the ModulesSuggestions method, which is responsible for the screenshot below:

private ICompletionData[] ModulesSuggestions()
{
    return new ICompletionData[]
    {
        new DefaultCompletionData("@vacations", null, 2),
        new DefaultCompletionData("@external_connections", null, 2),
        new DefaultCompletionData("@salary", null, 2),
        new DefaultCompletionData("@pension", null, 2),
        new DefaultCompletionData("@scheduling_work", null, 2),
        new DefaultCompletionData("@health_insurance", null, 2),
        new DefaultCompletionData("@taxes", null, 2),
    };
}

And this is what it looks like.

image

It works, it is simple, and it doesn't take too much time to build. If we want to get more than this, we probably need to start utilizing the Boo parser directly, which will give us a lot more context than the text tokenizer that #Develop is using for syntax highlighting. Nevertheless, I think this is good work.

Code or data?

time to read 3 min | 581 words

Here is a question that came up in the book's forums:

I can't figure out how to get the text of the expression from expression.ToCodeString() or better yet, the actual text from the .boo file.

It appears to automagically convert from type Expression to a delegate. What I want is to be able to when a condition is evaluated display the condition that was evaluated, so if when 1 < 5 was evaluated I would be able to get the string "when 1 < 5" - Any way to do this?

Let us see what the issue is. Given this code:

when order.Amount > 10:
	print "yeah!"

We want to see the following printed:

Because 'order.Amount > 10' evaluated to true, executing rule action.
yeah!

The problem, of course, is how exactly to get the string that represents the rule. It is actually simple to do, we just need to ask the compiler nicely, like this:

public abstract class OrderRule
{
    public Predicate<Order> Condition { get; set; }
    public string ConditionString { get; set; }
    public Action RuleAction { get; set; }
    protected Order order;
    public abstract void Build();

    [Meta]
    public static Expression when(Expression expression, Expression action)
    {
        var condition = new BlockExpression();
        condition.Body.Add(new ReturnStatement(expression));
        return new MethodInvocationExpression(
            new ReferenceExpression("When"),
            condition,
            new StringLiteralExpression(expression.ToCodeString()),
            action
            );
    }


    public void When(Predicate<Order> condition, string conditionAsString, Action action)
    {
        Condition = condition;
        ConditionString = conditionAsString;
        RuleAction = action;
    }

    public void Evaluate(Order o)
    {
        order = o;
        if (Condition(o) == false)
            return;
        Console.WriteLine("Because '{0}' evaluated to true, executing rule action.", ConditionString);
        RuleAction();
    }
}

The key here happens in the when() static method. We translate the call to the when keyword to a call to the When() instance method. Along the way, we aren't passing just the arguments that we got, we are also passing a string literal with the code that was extracted from the relevant expression.
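To make the effect of that translation concrete, here is roughly what the compiled script boils down to if you write it by hand in C#. This is only a sketch under a few assumptions: Order with an Amount property and the LargeOrderRule class are mine, and OrderRule is trimmed to its runtime half (no [Meta] method) so the sample compiles without referencing Boo.

```csharp
using System;

public class Order
{
    public decimal Amount { get; set; }
}

// Runtime half of OrderRule only; the compile-time when() macro is left out.
public abstract class OrderRule
{
    public Predicate<Order> Condition { get; set; }
    public string ConditionString { get; set; }
    public Action RuleAction { get; set; }
    protected Order order;
    public abstract void Build();

    public void When(Predicate<Order> condition, string conditionAsString, Action action)
    {
        Condition = condition;
        ConditionString = conditionAsString;
        RuleAction = action;
    }

    public void Evaluate(Order o)
    {
        order = o;
        if (Condition(o) == false)
            return;
        Console.WriteLine("Because '{0}' evaluated to true, executing rule action.", ConditionString);
        RuleAction();
    }
}

// Hand-written stand-in for what the DSL script turns into after the
// when keyword has been rewritten to a When() call.
public class LargeOrderRule : OrderRule
{
    public override void Build()
    {
        When(
            _ => order.Amount > 10,             // the condition, as code
            "order.Amount > 10",                // the same condition, as data
            () => Console.WriteLine("yeah!"));  // the rule action
    }
}
```

Calling Build() and then Evaluate() with an order whose Amount is above 10 prints the explanation line followed by yeah!, and an order below the threshold prints nothing.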

time to read 4 min | 620 words

Another interesting question from Chris Ortman:

So I write my dsl, and tell my customer to here edit this text file?
How do I tell them what the possible options are? Intellisense?
This is a web app, and my desire to build intellisense into a javascript rich text editor is very low.
It might be a good excuse to try out silverlight but even then it seems a large task.
Or I put express or #Develop on the server and make that the 'admin' gui?

This is actually a question that comes up often. Yes, we have a DSL, and now it is easy to change. But how are we going to deal with changes that affect production?

There are actually several layers to this question. First, there is the practical matter of having some sort of a UI to enable this change. As Chris has noted, this is not something that can be trivially produced as part of the admin section. But the UI is only a tiny part of it. This is especially the case if you want to do things directly on production.

There is a whole host of challenges that come up in this scenario (error handling, handling frequent changes, cascading updates, debugging, invasive execution, etc.) that need to be dealt with. In development mode, there is no issue, because we can afford to be unsafe there. For production, that is not an option. Then you have to deal with issues such as providing auditing information, "who did what, why and when". Another important consideration is the ability to safely roll back a change.

As you can imagine, this is not a simple matter.

My approach, usually, is to avoid this requirement as much as possible. That is, I do not allow such things to be done on production. Oh, it is still possible, but it is a manual process that is there for emergency use only. Similar to the ability to log in to the production DB and run queries, it should be reserved, rare and avoided if possible.

However, this is not always possible. If the client needs the ability to edit the DSL scripts on production, then we need to provide a way for them to do so. What I have found to be useful is to not provide a way to work directly on production. No, I am not being a smartass here, I actually have a point. Instead of working directly on the production scripts, we start, as part of the design, to store the scripts in an SVN server, which is part of the application itself.

If you want to access the scripts, you check them out of the SVN server. Now you can edit them with any tool you want, and finish by committing them back to the repository. The application monitors the repository and will update itself when a commit is done to the /production branch.

This has several big advantages. We don't have the problem of partial updates, we have a pretty good audit trail, and we have built-in reversibility. In addition to that, we avoid the whole problem of having to build a UI for editing the scripts on production; we use the same tools that we use during development for that.

As a side benefit, this also means that pushing script changes to a farm is built in.

And yes, this is basically continuous integration as part of the actual application.
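The monitoring part does not need to be complicated. Here is a minimal sketch of the polling logic, with the actual SVN access left out on purpose: ScriptRepositoryWatcher, getHeadRevision and reloadScripts are hypothetical names, and the two delegates stand in for "ask the repository for the latest revision of /production" and "check out that revision and recompile the scripts".

```csharp
using System;

public class ScriptRepositoryWatcher
{
    private readonly Func<long> getHeadRevision;  // assumption: returns the latest committed revision
    private readonly Action<long> reloadScripts;  // assumption: checks out and recompiles that revision
    private long deployedRevision = -1;           // nothing deployed yet

    public ScriptRepositoryWatcher(Func<long> getHeadRevision, Action<long> reloadScripts)
    {
        this.getHeadRevision = getHeadRevision;
        this.reloadScripts = reloadScripts;
    }

    public long DeployedRevision
    {
        get { return deployedRevision; }
    }

    // Call this periodically (from a timer, for example).
    // Returns true when a new revision was picked up.
    public bool Poll()
    {
        long head = getHeadRevision();
        if (head == deployedRevision)
            return false;

        // A commit is atomic, so we always load a complete revision,
        // never a half-edited set of scripts.
        reloadScripts(head);

        // Only advance once the reload succeeded; if it throws,
        // the next Poll() will try the same revision again.
        deployedRevision = head;
        return true;
    }
}
```

Rolling back needs no special support in the application with this design: reverting the change in the repository is just another commit, and the watcher picks it up like any other.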

time to read 7 min | 1316 words

In many cases, intellisense is the killer feature that will make all the difference in using a language. However, it is significantly harder than just defining the syntax rules. The main problem is that we need to deal with the current context. Let us take a look at what we would like our intellisense to do for the Quote Generation DSL.

image

  • On empty line, show "specification"
  • On specification parameter, show all available modules.
  • On empty line inside specification block, show all actions (requires, users_per_machine, same_machine_as)
  • On action parameter, find appropriate value (available modules for the requires and same_machine_as actions, pre-specified user counts for the users_per_machine)

And this is for a scenario when we don't even want to deal with intellisense for the CLR API that we can use…

#Develop will give us the facilities, but it can't give us the context; this is something that we need to provide. Let us see how this works, shall we?

First, we need to decide what will invoke the intellisense. In this case, I decided to use the typical ctrl+space, so I added this:

this.editorControl.ActiveTextAreaControl.TextArea.KeyDown+=delegate(object sender, KeyEventArgs e)
{
    if (e.Control == false)
        return;
    if (e.KeyCode != Keys.Space)
        return;
    e.SuppressKeyPress = true;
    ShowIntellisense((char)e.KeyValue);
};

I don't think that this code requires any explanation, so we will move directly to the ShowIntellisense method:

private void ShowIntellisense(char value)
{
    ICompletionDataProvider completionDataProvider = new CodeCompletionProvider(this.imageList1);

    codeCompletionWindow = CodeCompletionWindow.ShowCompletionWindow(
        this,                   // The parent window for the completion window
        editorControl,          // The text editor to show the window for
        "",                     // Filename - will be passed back to the provider
        completionDataProvider, // Provider to get the list of possible completions
        value                   // Key pressed - will be passed to the provider
        );
    if (this.codeCompletionWindow != null)
    {
        // ShowCompletionWindow can return null when the provider returns an empty list
        this.codeCompletionWindow.Closed += CloseCodeCompletionWindow;
    }
}

We aren't doing much here, simply invoking the facilities that #Develop gives us for intellisense. The interesting bit of work all happens in CodeCompletionProvider. There is a lot of boilerplate code there, so we will skim it briefly, and then arrive at the really interesting part:

public class CodeCompletionProvider : ICompletionDataProvider
{
    private ImageList imageList;

    public CodeCompletionProvider(ImageList imageList)
    {
        this.imageList = imageList;
    }

    public ImageList ImageList
    {
        get
        {
            return imageList;
        }
    }

    public string PreSelection
    {
        get
        {
            return null;
        }
    }

    public int DefaultIndex
    {
        get
        {
            return -1;
        }
    }

    public CompletionDataProviderKeyResult ProcessKey(char key)
    {
        if (char.IsLetterOrDigit(key) || key == '_')
        {
            return CompletionDataProviderKeyResult.NormalKey;
        }
        return CompletionDataProviderKeyResult.InsertionKey;
    }

    /// <summary>
    /// Called when entry should be inserted. Forward to the insertion action of the completion data.
    /// </summary>
    public bool InsertAction(ICompletionData data, TextArea textArea, int insertionOffset, char key)
    {
        textArea.Caret.Position = textArea.Document.OffsetToPosition(
            Math.Min(insertionOffset, textArea.Document.TextLength)
            );
        return data.InsertAction(textArea, key);
    }

    public ICompletionData[] GenerateCompletionData(string fileName, TextArea textArea, char charTyped)
    {
        return new ICompletionData[] {
             new DefaultCompletionData("Text", "Description", 0),
             new DefaultCompletionData("Text2", "Description2", 1)
        };
    }
}

The properties should be self-explanatory. ProcessKey allows you to decide how to handle the current keypress. Here, you get to see only normal keys (sent to the actual text control) and insertion keys (add the current text to the text control). Another option is CompletionDataProviderKeyResult.BeforeStartKey, which tells the control to ignore this key. That is important when you want to narrow the selection choice based on what the user is typing.

InsertAction simply tells the editor where to place the newly added text. This is important if the user invoked intellisense in the middle of a term, and you want to fix that term.

GenerateCompletionData is where the real interest lies. Everything else is just user experience, a very important detail, but basically just a detail. GenerateCompletionData is where the power lies. (Note, this is the appropriate Muhahaha! moment.)

The current implementation isn't doing much, just returning a hard-coded list of values. Note that even this trivial implementation, without any context whatsoever, will give you a lot of value, just because you can now expose the DSL structure more easily. And let us not forget the marketing side of that: if you have intellisense, even a none too intelligent one, you are already way ahead of the game. And here is our result:

image 

I'll go over providing the actual context for the code in another post.

time to read 5 min | 872 words

Just using the Boo syntax isn't really enough in many cases. You want to handle your own custom keywords, behaviors, etc.

#Develop makes this a piece of cake, since it defines the syntax highlighting using an XML file, and handles the actual parsing and coloring on its own. Here is the overall structure of such a file:

<?xml version="1.0"?>
<SyntaxDefinition name="Boo" 
                  extensions=".boo">
  <Environment>
    <Default bold="false"
             italic="false"
             color="SystemColors.WindowText"
             bgcolor="SystemColors.Window" />
    <Selection bold="false"
               italic="false"
               color="SystemColors.HighlightText"
               bgcolor="SystemColors.Highlight" />
  </Environment>

  <Digits name="Digits"
          bold="false"
          italic="false"
          color="DarkBlue" />

  <RuleSets>
    <RuleSet ignorecase="false" >
      <Delimiters>&amp;&lt;&gt;~!@$%^*()-+=|\#/{}[]:;"' ,	.?</Delimiters>

      <Span name="LineComment"
            stopateol="true"
            bold="false"
            italic="false"
            color="Gray" >
        <Begin >#</Begin>
      </Span>

      <KeyWords name="JumpStatements"
                bold="false"
                italic="false"
                color="Navy" >
        <Key word="break"/>
        <Key word="continue"/>
        <Key word="return"/>
        <Key word="yield"/>
        <Key word="goto" />
      </KeyWords>

    </RuleSet>
  </RuleSets>
</SyntaxDefinition>

As you can see, this is pretty easy to work with. Now let us add our own keywords:

<KeyWords name="DslKeywords"
          bold="false"
          italic="false"
          color="DarkOrange" >
  <Key word="specification"/>
  <Key word="users_per_machine"/>
  <Key word="requires"/>
  <Key word="same_machine_as"/>
</KeyWords>

Now we need to load the new language definition (don't forget to change the name, I changed it to "dsl") into the editor and select it:

HighlightingManager.Manager.AddSyntaxModeFileProvider(
    new FileSyntaxModeProvider(@"C:\Path\to\language\definition"));
//.. setup text editor
editorControl.SetHighlighting("dsl");

The result?

image
