Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,546
|
Comments: 51,161
Privacy Policy · Terms
filter by tags archive
time to read 3 min | 537 words

I was having dinner with Dru Sellers and Evan Hoff and Dru brought something up that really sparked my imagination. To put it in more concrete words, what Dru said started a train of thought that ended up with another mandatory requirement for any non trivial application.

Your application should have a blog.

Now, pay attention. I am not saying that the application team should have a blog, I am saying that the application should have one. What do I mean by that? As part of the deployment requirements for the application, we are going to setup a blog application that is an integrated component of the application.

Huh? I am building Order Management application, why the hell do I need to have a blog as part of the application? Yes, PR is important, and blogs can get good PR, but what are you talking about?

This is an internal blog, visible only for the internal users, and into it the application is going to blog about interesting events that happened. For example, starting up the application would also cause it to post to its blog about that, which can look like this:

image

This looks like log messages that were written by a PR guy. What is the whole point here? Isn't this just standard logging?

Not quite. This is an expansion of the idea of system alert, where the system can proactively determine and warn about various conditions. This idea is anything but new, you are probably familiar with the term the Operation Database. But this approach has one key factor that is different.

Social Engineering

Using a blog, and using this style of writing, making it extremely clear what should and should not go there as a message. You obviously are not going to want to treat this as a standard log, where you just dump stuff in. From the point of view of actually getting this through, this make a task that is often very hard into a very simple "show by example".

From the point of view of the system as a whole, now business users have a way to watch what the system is doing, check on status updates, etc. More than that, you can now use this as a way to post reports (weekly summary, for example) and in general vastly increase the visibility of the system.

Using RSS allows syndication, which in turn also also easy monitoring by an admin, without any real complexity getting in the way. For that matter, you can get the business user to subscribe to it with Outlook (if they don't already have a standard RSS reader) and get them on board as well.

Now, this is explicitly not a place where you want to put technical details. This should be reserved to some other system, this is a high level overview on what the system is doing. Posts are built to be human readable and human sounding, to avoid boring the readers and to ensure that people actually use this.

Thoughts?

time to read 1 min | 140 words

I am currently writing a DSL that is used to meta program another DSL that is used to to do some action (it is actually turtles 7 layers deep, but we will skip that). It gets to be fairly interesting, although trying to draw that as a diagram is... a bit challenging.

Oh, and there are at least a few parts that rewrite itself.

Ever tried to do incremental method munging? That is when you take code from several places and start applying logic to where to put it. Only useful because of a lot of interesting constraints that we have to deal with in this project, and probably will be actively harmful on other scenarios. And that is only one technique that I am using there.

But a damn elegant approach to solve a problem, wow!

time to read 1 min | 130 words

I am giving a lot of thought to this chapter, because I want to be able to throw out as much best & worst practices as I can to the reader. Here is what I have right now:

  1. Auditable DSL - Dealing with the large scale - what the hell is going on?
  2. User extensible languages
  3. Multi lingual languages
  4. Multi file languages
  5. Data as a first class concept
  6. Code == Data == Code
  7. Strategies for editing DSL in production
  8. Code data mining
  9. DSL dialects

I am still looking for the tenth piece...

time to read 1 min | 158 words

Contrary to popular opinion, I have not been kidnapped, nor have I been hit on the head, nor have I started to seek that kind of job security.

I gave a talk about legacy code and refactoring, and I needed something concrete to talk about. Unfortunately, most legacy code is too intertwined to be able to extract out in order to talk about it in isolation. So I set out to write my own.

To all the people who assume that I don't live in the real world, or that I don't deal with legacy systems... well, I think that this code shows that I do know what is going out there.

And just to make it clear, no, I wouldn't write this type of code for any reason. Not for a spike or for a lark. But I think that this is a pretty good archeological fake, even if I say so myself. 

DSL Dialects

time to read 1 min | 168 words

Let us take this fancy DSL:

image

And let us say that we want to give the user some sort of UI that shows how this DSL works. The implementation of this DSL isn't really friendly for the UI. It was built for execution, not for display.

So how are we going to solve the problem?  There are a couple of ways of doing that, but the easiest solution that I know of consists of creating a new language implementation that is focused on providing an easy to build UI. A dialect can be either a different language (or version of the language) that maps to the same backend engine, or it can be a different engine that is mapped to the same language.

This is part of the reason that it is so important to create strict separation between the two.

time to read 16 min | 3178 words

I am just going to post that, and watch what happens. I will note that this is code that I just wrote, from scratch.

public class TaxCalculator
{
    private string conStr;
    private DataSet rates;

    public TaxCalculator(string conStr)
    {
        this.conStr = conStr;
        using (SqlConnection con = new SqlConnection(conStr))
        {
            con.Open();
            using (SqlCommand cmd = new SqlCommand("SELECT * FROM tblTxRtes", con))
            {
                rates = new DataSet();
                new SqlDataAdapter(cmd).Fill(rates);
                Log.Write("Read " + rates.Tables[0].Rows.Count + " rates from database");
                if (rates.Tables[0].Rows.Count == 0)
                {
                    MailMessage msg = new MailMessage("important@legacy.org", "joe@legacy.com");
                    msg.Subject = "NO RATES IN DATABASE!!!!!";
                    msg.Priority = MailPriority.High;
                    new SmtpClient("mail.legacy.com", 9089).Send(msg);
                    Log.Write("No rates for taxes found in " + conStr);
                    throw new ApplicationException("No rates, Joe forgot to load the rates AGAIN!");
                }
            }
        }
    }

    public bool Process(XmlDocument transaction)
    {
        try
        {
            Hashtable tx2tot = new Hashtable();
            foreach (XmlNode o in transaction.FirstChild.ChildNodes)
            {
            restart:
                if (o.Attributes["type"].Value == "2")
                {
                    Log.Write("Type two transaction processing");
                    decimal total = decimal.Parse(o.Attributes["tot"].Value);
                    XmlAttribute attribute = transaction.CreateAttribute("tax");
                    decimal r = -1;
                    foreach (DataRow dataRow in rates.Tables[0].Rows)
                    {
                        if ((string)dataRow[2] == o.SelectSingleNode("//cust-details/state").Value)
                        {
                            r = decimal.Parse(dataRow[2].ToString());
                        }
                    }
                    Log.Write("Rate calculated and is: " + r);
                    o.Attributes.Append(attribute);
                    if (r == -1)
                    {
                        MailMessage msg = new MailMessage("important@legacy.org", "joe@legacy.com");
                        msg.Subject = "NO RATES FOR " + o.SelectSingleNode("//cust-details/state").Value + " TRANSACTION !!!!ABORTED!!!!";
                        msg.Priority = MailPriority.High;
                        new SmtpClient("mail.legacy.com", 9089).Send(msg);
                        Log.Write("No rate for transaction in tranasction state");
                        throw new ApplicationException("No rates, Joe forgot to load the rates AGAIN!");
                    }
                    tx2tot.Add(o.Attributes["id"], total * r);
                    attribute.Value = (total * r).ToString();
                }
                else if (o.Attributes["type"].Value == "1")
                {
                    //2006-05-02 just need to do the calc
                    decimal total = 0;
                    foreach (XmlNode i in o.ChildNodes)
                    {
                        total += ProductPriceByNode(i);
                    }
                    try
                    {
                        // 2007-02-19 not so simple, TX has different rule
                        if (o.SelectSingleNode("//cust-details/state").Value == "TX")
                        {
                            total *= (decimal)1.02;
                        }
                    }
                    catch (NullReferenceException)
                    {
                        XmlElement element = transaction.CreateElement("state");
                        element.Value = "NJ";
                        o.SelectSingleNode("//cust-details").AppendChild(element);
                    }
                    XmlAttribute attribute = transaction.CreateAttribute("tax");
                    decimal r = -1;
                    foreach (DataRow dataRow in rates.Tables[0].Rows)
                    {
                        if ((string)dataRow[2] == o.SelectSingleNode("//cust-details/state").Value)
                        {
                            r = decimal.Parse(dataRow[2].ToString());
                        }
                    }
                    if (r == -1)
                    {
                        MailMessage msg = new MailMessage("important@legacy.org", "joe@legacy.com");
                        msg.Subject = "NO RATES FOR " + o.SelectSingleNode("//cust-details/state").Value + " TRANSACTION !!!!ABORTED!!!!";
                        msg.Priority = MailPriority.High;
                        new SmtpClient("mail.legacy.com", 9089).Send(msg);
                        throw new ApplicationException("No rates, Joe forgot to load the rates AGAIN!");
                    }
                    attribute.Value = (total * r).ToString();
                    tx2tot.Add(o.Attributes["id"], total * r);
                    o.Attributes.Append(attribute);
                }
                else if (o.Attributes["type"].Value == "@")
                {
                    o.Attributes["type"].Value = "2";
                    goto restart;
                    // 2007-04-30 some bastard from northwind made a mistake and they have 3 months release cycle, so we have to
                    // fix this because they won't until sep-07
                }
                else
                {
                    throw new Exception("UNKNOWN TX TYPE");
                }
            }
            SqlConnection con2 = new SqlConnection(conStr);
            SqlCommand cmd2 = new SqlCommand();
            cmd2.Connection = con2;
            con2.Open();
            foreach (DictionaryEntry d in tx2tot)
            {
                cmd2.CommandText = "usp_TrackTxNew";
                cmd2.Parameters.Add("cid", transaction.SelectSingleNode("//cust-details/@id").Value);
                cmd2.Parameters.Add("tx", d.Key);
                cmd2.Parameters.Add("tot", d.Value);
                cmd2.ExecuteNonQuery();
            }
            con2.Close();
        }
        catch (Exception e)
        {
            if (e.Message == "UNKNOWN TX TYPE")
            {
                return false;
            }
            throw e;
        }
        return true;
    }

    private decimal ProductPriceByNode(XmlNode item)
    {
        using (SqlConnection con = new SqlConnection(conStr))
        {
            con.Open();
            using (SqlCommand cmd = new SqlCommand("SELECT * FROM tblProducts WHERE pid=" + item.Attributes["id"], con))
            {
                DataSet set = new DataSet();
                new SqlDataAdapter(cmd).Fill(set);
                return (decimal)set.Tables[0].Rows[0][4];

            }
        }
    }
}
time to read 3 min | 443 words

imageHere is an interesting problem that I run into. I needed to produce an XML document for an external system to consume. This is a fairly complex document format, and there are a lot of scenarios to support. I began to test drive the creation of the XML document, but it turn out that I kept having to make changes as I run into more scenarios that invalidated previous assumptions that I made.

Now, we are talking about a very short iteration cycle, I might write a test to validate an assumption (attempting to put two items in the same container should throws) and an hour later realize that it is a legal, if strange, behavior. The tests became a pain point, I had to keep updating things because the invariant that they were based upon were wrong.

At that point, I decided that TDD was exactly the wrong approach for this scenario. Therefor, I decided that I am going to fall back to the old "trial and error" method. In this case, producing the XML and comparing using a diff tool.

The friction in the process went down significantly, because I didn't have to go and fix the tests all the time. I did break things that used to work, but I caught them mostly with manual diff checks.

So far, not a really interesting story. What is interesting is what happens when I decided that I have done enough work to consider most scenarios to be completed. I took all the scenarios and started generating tests for those. So for each scenario I now have a test that tests the current behavior of the system. This is blind testing. That is, I assume that the system is working correctly, and I want to ensure that it keeps working in this way. I am not sure what each test is doing, but the current behavior is assumed to be correct until proven otherwise..

Now I am back to having my usual safety net, and it is a lot of fun to go from zero tests to nearly five hundred tests in a few minutes.

This doesn't prove that the behavior of the system is correct, but it does ensure no regression and make sure that we have a stable platform to work from. We might find a bug, but then we can fix it in safety.

I don't recommend this approach for general use, but for this case, it has proven to be very useful.

Code Data Mining

time to read 2 min | 298 words

I just wrote this piece of code:

class ExpressionInserterVisitor : DepthFirstVisitor
{
    public override bool Visit(Node node)
    {
        using(var con = new SqlConnection("data source=localhost;Initial Catalog=Test;Trusted_Connection=yes"))
        using (var command = con.CreateCommand())
        {
            con.Open();
            command.CommandText = "INSERT INTO Expressions (Expression) VALUES(@P1)";
            command.Parameters.AddWithValue("@P1", node.ToString());
            command.ExecuteNonQuery();
        }
        Console.WriteLine(node);
        return base.Visit(node);
    }
}

As you can imagine, this is disposable code, but why did I write that?

I run this code on the entire DSL code base that I have, and then started applying metrics to it. In particular, I was interested in trying to find repeated concepts that has not been codified.

For example, if this would have shown 7 uses of:

user.IsPreferred and order.Total > 500 and (order.PaymentMethod is Cash or not user.IsHighRisk)

Then this is a good indication that I have a business concept waiting to be discovered here, and I turn that into a part of my language:

IsGoodDealForVendor (or something like that)

Here we aren't interested in the usual code quality metrics, we are interested in business quality metrics :-) And the results were, to say the least, impressive.

FUTURE POSTS

  1. Partial writes, IO_Uring and safety - about one day from now
  2. Configuration values & Escape hatches - 5 days from now
  3. What happens when a sparse file allocation fails? - 7 days from now
  4. NTFS has an emergency stash of disk space - 9 days from now
  5. Challenge: Giving file system developer ulcer - 12 days from now

And 4 more posts are pending...

There are posts all the way to Feb 17, 2025

RECENT SERIES

  1. Challenge (77):
    20 Jan 2025 - What does this code do?
  2. Answer (13):
    22 Jan 2025 - What does this code do?
  3. Production post-mortem (2):
    17 Jan 2025 - Inspecting ourselves to death
  4. Performance discovery (2):
    10 Jan 2025 - IOPS vs. IOPS
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats
}